[Bug]: SPM monitor tab does not show Latency data #2088

Open
chkp-talron opened this issue Jan 4, 2024 · 4 comments
@chkp-talron

What happened?

I installed the Jaeger deployment on a k8s cluster and Jaeger itself works properly. I was able to enable the Monitor tab by setting various flags in the Jaeger deployment YAMLs, but I only see Request rate data for all of my services, and Error rate data for a single service out of 15. Latency shows "No Data", and the top of the page reads: "No data yet! Please see these instructions on how to set up your span metrics."

I assume that if at least one service shows error data, the other services simply don't have errors (the service that shows errors is kind of a gateway into our cluster, so it is reasonable that it has errors while the others don't).

However, latency is empty all the time, for all services. I can't tell whether this is due to how the metrics are sent to Prometheus; there are tons of attributes on each of the metrics, so maybe the aggregation needs to be over only a few attributes for latency data to show up? (A sketch of what I mean is below.)

I'm sending app metrics using the OTel SDK, and the Collector is configured with the spanmetrics connector.
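
For illustration, restricting the connector to a few dimensions and explicit histogram buckets might look roughly like this (the dimension names and bucket values are only examples, not what I currently run):

    connectors:
      spanmetrics:
        namespace: spanmetrics
        # example: keep only a couple of span attributes as metric dimensions
        dimensions:
          - name: http.method
          - name: http.status_code
        # example: explicit buckets for the duration/latency histogram
        histogram:
          explicit:
            buckets: [5ms, 25ms, 100ms, 500ms, 2s, 10s]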

Steps to reproduce

  1. install jaeger helm chart
  2. update deploy env variables
  3. generate app data
  4. check monitor tab

Expected behavior

The SPM Monitor tab should show latency data.

Relevant log output

No response

Screenshot

[Screenshot: SPM Monitor tab]

Additional context

No response

Jaeger backend version

Helm chart: jaeger-0.73.1
Jaeger version: 1.51.0

SDK

Otel nodejs SDK:
"@opentelemetry/api": "^1.4.1",
"@opentelemetry/core": "^1.15.2",
"@opentelemetry/exporter-metrics-otlp-http": "^0.41.2",
"@opentelemetry/exporter-trace-otlp-http": "^0.41.2",
"@opentelemetry/sdk-metrics": "^1.15.2",
"@opentelemetry/sdk-trace-base": "^1.15.2",
"@opentelemetry/sdk-trace-node": "^1.15.2"

Pipeline

OTel SDK -> OTel Collector -> Jaeger Collector -> ES

Storage backend

ES 7

Operating system

Linux

Deployment model

EKS

Deployment configs

collector configMap:

    connectors:
      spanmetrics:
        namespace: spanmetrics
    exporters:
      otlphttp:
        endpoint: http://jaeger-collector:4318
      prometheus:
        endpoint: 0.0.0.0:8765
        resource_to_telemetry_conversion:
          enabled: true
    extensions:
      health_check: {}
      memory_ballast:
        size_in_percentage: 20
    processors:
      batch: {}
      memory_limiter:
        check_interval: 1s
        limit_percentage: 70
        spike_limit_percentage: 30
      k8sattributes:
        auth_type: "serviceAccount"
        passthrough: true
        extract:
          metadata:
          - container.id
          - k8s.pod.name
          - k8s.pod.uid
          - k8s.deployment.name
          - k8s.namespace.name
          - k8s.node.name
        pod_association:
        - sources:
          - from: resource_attribute
            name: container.id
        - sources:
          - from: resource_attribute
            name: k8s.namespace.name
        - sources:
          - from: connection
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: ${env:MY_POD_IP}:4317
          http:
            endpoint: ${env:MY_POD_IP}:4318
      prometheus:
        config:
          scrape_configs:
          - job_name: otelcol
            scrape_interval: 10s
            static_configs:
            - targets:
              - ${env:MY_POD_IP}:8888
            metric_relabel_configs:
            - source_labels: [ __name__ ]
              regex: '.*grpc_io.*'
              action: drop
            - action: labeldrop
              regex: "service_instance_id|service_name"
    service:
      extensions:
      - health_check
      - memory_ballast
      pipelines:
        metrics:
          exporters:
          - prometheus
          processors:
          - memory_limiter
          - k8sattributes
          - batch
          receivers:
          - otlp
          - prometheus
          - spanmetrics
        traces:
          exporters:
          - otlphttp
          - spanmetrics
          processors:
          - memory_limiter
          - k8sattributes
          - batch
          receivers:
          - otlp
      telemetry:
        metrics:
          address: ${env:MY_POD_IP}:8888
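
(Side note: the span metrics are exposed by the collector's prometheus exporter on port 8765, so the Prometheus server that jaeger-query reads from scrapes that endpoint. The scrape job looks roughly like the following; the job name and Service address are placeholders, not my actual config:)

    scrape_configs:
      - job_name: otel-collector-spanmetrics   # placeholder job name
        scrape_interval: 15s
        static_configs:
          - targets:
              - otel-collector.monitoring.svc.cluster.local:8765   # placeholder collector Service address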


jaeger-query deployment env:
      - env:
        - name: PROMETHEUS_QUERY_DURATION_UNIT
          value: s
        - name: PROMETHEUS_QUERY_NAMESPACE
          value: spanmetrics
        - name: PROMETHEUS_QUERY_NORMALIZE_CALLS
          value: "true"
        - name: METRICS_STORAGE_TYPE
          value: prometheus
        - name: PROMETHEUS_SERVER_URL
          value: redacted
        - name: PROMETHEUS_QUERY_SUPPORT_SPANMETRICS_CONNECTOR
          value: "true"
        - name: SPAN_STORAGE_TYPE
          value: elasticsearch
        - name: ES_SERVER_URLS
          value: redacted
        - name: ES_USERNAME
          value: redacted
        - name: ES_PASSWORD
          value: redacted
        - name: QUERY_BASE_PATH
          value: /
        - name: JAEGER_AGENT_PORT
          value: "6831"

jaeger-collector deployment env (not sure it needs the extra vars I've set; the docs weren't clear to me):
      - env:
        - name: PROMETHEUS_QUERY_SUPPORT_SPANMETRICS_CONNECTOR
          value: "true"
        - name: COLLECTOR_OTLP_ENABLED
          value: "true"
        - name: SPAN_STORAGE_TYPE
          value: elasticsearch
        - name: ES_SERVER_URLS
          value: redacted
        - name: ES_USERNAME
          value: redacted
        - name: ES_PASSWORD
          value: redacted
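
(My assumption is that the PROMETHEUS_QUERY_* variables are only read by jaeger-query, not by the collector, so a trimmed collector env would presumably be just:)

      - env:
        - name: COLLECTOR_OTLP_ENABLED
          value: "true"
        - name: SPAN_STORAGE_TYPE
          value: elasticsearch
        - name: ES_SERVER_URLS
          value: redacted
        - name: ES_USERNAME
          value: redacted
        - name: ES_PASSWORD
          value: redacted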
chkp-talron added the bug label Jan 4, 2024

MinaMohammadi commented Feb 20, 2024

Summary

Hello, I'm experiencing an issue with the Jaeger Service Performance Monitor (SPM) dashboard where the latency data consistently shows "No Data" for all services. Upon inspecting the query in my web browser, I found that the metrics for "service_latencies" were empty. Here's the information from the query inspection:

{
  "name": "service_latencies",
  "type": "GAUGE",
  "help": "0.50th quantile latency, grouped by service",
  "metrics": []
}

I checked the Jaeger documentation and found that the relevant metric for this dashboard is "latency_bucket," which is deprecated on the OpenTelemetry side.

Application Version

jaeger version: 1.54
opentelemetry collector: 0.94.0



anand3493 commented May 8, 2024

@chkp-talron @MinaMohammadi I'm experiencing the same issue. Have you found any fix for it?

@yurishkuro (Jaeger team): could you please look into this issue if you haven't already?

@yurishkuro
Member

The documentation has quite detailed troubleshooting steps https://www.jaegertracing.io/docs/latest/spm/#troubleshooting. Which of them did you try and what were the results?


anand3493 commented May 10, 2024

The documentation has quite detailed troubleshooting steps https://www.jaegertracing.io/docs/latest/spm/#troubleshooting. Which of them did you try and what were the results?

@yurishkuro
Yes, we have indeed followed the well-written troubleshooting steps. To be precise, we set the env vars mentioned here: https://www.jaegertracing.io/docs/1.57/spm/#viewing-logs
PROMETHEUS_QUERY_NORMALIZE_CALLS=true
PROMETHEUS_QUERY_NORMALIZE_DURATION=true

Initially we did not see any metrics in the SPM at all. After adding the env vars above, the error rate and request rate started to appear, but the latency metrics still do not appear.

Please let us know if you need any other information; we would be glad to have this issue resolved.
