[BUG] Facing Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2 error for datadog cluster agent #24000

pochavan · 2024-03-22T13:22:34Z

Agent Environment

Cluster Agent version: 7.52.0
Datadog Agent version: 7.50.3

Describe what happened:

I installed the Datadog Operator and Agent on my Kubernetes cluster by following the "Deploy an Agent with the Operator" guide.
This is my DatadogAgent manifest:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    clusterName: microservice-demo-app
    registry: public.ecr.aws/datadog
    site: api.datadoghq.com
    credentials:
      apiSecret:
        secretName: datadog-secret
        keyName: api-key
  features:
    logCollection:
      enabled: true
      containerCollectAll: true
    orchestratorExplorer: 
      enabled: true
  override:
    clusterAgent:
      image:
        name: gcr.io/datadoghq/cluster-agent:latest
    nodeAgent:
      image:
        name: gcr.io/datadoghq/agent:latest

The Agent installs successfully, but I cannot see any data in the Kubernetes Explorer (https://app.datadoghq.com/orchestration/overview/cluster).

I am getting the following errors in the datadog-cluster-agent pod logs:

2024-03-22 13:10:56 UTC | CLUSTER | INFO | (pkg/clusteragent/admission/controllers/secret/controller.go:78 in Run) | Starting secrets controller for datadog/webhook-certificate
2024-03-22 13:10:56 UTC | CLUSTER | INFO | (client-go@v0.28.6/tools/leaderelection/leaderelection.go:260 in func1) | successfully acquired lease datadog/datadog-leader-election
2024-03-22 13:10:56 UTC | CLUSTER | INFO | (pkg/util/kubernetes/apiserver/leaderelection/leaderelection_engine.go:152 in func1) | New leader "datadog-cluster-agent-67c4bc5bb-ss2vx"
2024-03-22 13:10:56 UTC | CLUSTER | INFO | (pkg/util/kubernetes/apiserver/leaderelection/leaderelection_engine.go:158 in func2) | Started leading as "datadog-cluster-agent-67c4bc5bb-ss2vx"...
2024-03-22 13:10:56 UTC | CLUSTER | INFO | (comp/core/workloadmeta/collectors/internal/kubeapiserver/kubeapiserver.go:138 in startReadiness) | All (2) K8S reflectors synced to workloadmeta
2024-03-22 13:10:57 UTC | CLUSTER | INFO | (pkg/collector/worker/check_logger.go:40 in CheckStarted) | check:orchestrator | Running check...
2024-03-22 13:10:57 UTC | CLUSTER | INFO | (pkg/collector/worker/check_logger.go:40 in CheckStarted) | check:kubernetes_apiserver | Running check...
2024-03-22 13:10:57 UTC | CLUSTER | INFO | (client-go@v0.28.6/rest/request.go:697 in Infof) | Waited for 1.071776711s due to client-side throttling, not priority and fairness, request: GET:https://172.20.0.1:443/api/v1/persistentvolumes?limit=500&resourceVersion=0
2024-03-22 13:10:58 UTC | CLUSTER | INFO | (pkg/util/kubernetes/apiserver/leaderelection/leaderelection.go:218 in EnsureLeaderElectionRuns) | Leader election running, current leader is "datadog-cluster-agent-67c4bc5bb-ss2vx"
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:191 in process) | Error while processing transaction: error "404 Not Found" while sending transaction to "https://orchestrator.api.datadoghq.com/api/v2/orch", rescheduling it: "{\"errors\":[\"Not found\"]}"
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2/orch': retrying later
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2/orch': retrying later
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2/orch': retrying later
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2/orch': retrying later
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2/orch': retrying later
2024-03-22 13:10:58 UTC | CLUSTER | ERROR | (comp/forwarder/defaultforwarder/worker.go:187 in process) | Too many errors for endpoint 'https://orchestrator.api.datadoghq.com/api/v2/orch': retrying later
2024-03-22
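
One thing that stands out in the errors above is the host `orchestrator.api.datadoghq.com`. It looks like the Cluster Agent builds that hostname from the `site` value in the manifest, so the `api.` prefix in `site: api.datadoghq.com` may be what leads to the 404. The fragment below is an untested sketch of the change, assuming the account is on the US1 site, whose site parameter is `datadoghq.com`; for another region the value would be different.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    clusterName: microservice-demo-app
    # Assumption: US1 account. Bare site name, without the "api." prefix,
    # so that derived endpoints become orchestrator.datadoghq.com and so on.
    site: datadoghq.com
    credentials:
      apiSecret:
        secretName: datadog-secret
        keyName: api-key
```

If this guess is right, the 404 and "Too many errors for endpoint" lines should stop appearing once the manifest is re-applied and the Cluster Agent pod restarts.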

Describe what you expected:
I should be able to see all data, such as pods, in the Kubernetes Explorer. The logs should also be free of these errors.

Steps to reproduce the issue:

Additional environment details (Operating System, Cloud provider, etc):

pochavan added the team/triage label Mar 22, 2024
