Add documentation for kubernetesSDConfigs usage to scrape node targets #6517

Open
1 task done
oleksii-kalinin opened this issue Apr 16, 2024 · 4 comments

oleksii-kalinin commented Apr 16, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

Description

When I configure a ScrapeConfig to scrape node or cAdvisor metrics, the targets return a 403 error.

Steps to Reproduce

Use manifests attached to the issue

Expected Result

According to the documentation, Prometheus should use the default in-cluster token and CA to communicate with the API.

Actual Result

server returned HTTP status 403 Forbidden

Prometheus Operator Version

v0.73.0

Kubernetes Version

v1.28.7-eks-b9c9ed7

Kubernetes Cluster Type

EKS

How did you deploy Prometheus-Operator?

prometheus-operator/kube-prometheus

Manifests

apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: prometheus
  namespace: prometheus
---
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: cadvisor
  labels:
    prometheus: system-monitoring-prometheus
spec:
  scheme: HTTPS
  relabelings:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - targetLabel: __address__
      replacement: kubernetes.default.svc:443
    - sourceLabels: [__meta_kubernetes_node_name]
      regex: (.+)
      targetLabel: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
  kubernetesSDConfigs:
  - role: Node
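
The manifests above do not include the Prometheus resource itself. A ScrapeConfig is only picked up if a Prometheus (or PrometheusAgent) object selects it, so a minimal sketch of the selecting side is given here for reference. The object name and namespace are assumptions based on the `prometheus/prometheus` key in the operator log below, and the label matches the one on the ScrapeConfig above; the real resource in this cluster may differ.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus          # assumption: taken from key=prometheus/prometheus in the log
  namespace: prometheus     # assumption: same namespace as the ServiceAccount subject
spec:
  # Pods run as this ServiceAccount; its RBAC governs service discovery.
  serviceAccountName: prometheus
  # Select ScrapeConfig objects carrying the label used in the manifest above.
  scrapeConfigSelector:
    matchLabels:
      prometheus: system-monitoring-prometheus
```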

prometheus-operator log output

level=info ts=2024-04-16T07:02:40.311852686Z caller=main.go:186 msg="Starting Prometheus Operator" version="(version=0.73.0, branch=refs/tags/v0.73.0, revision=d70313bd17cf2a4b911222062608f793be146548)"
level=info ts=2024-04-16T07:02:40.311897365Z caller=main.go:187 build_context="(go=go1.22.1, platform=linux/amd64, user=Action-Run-ID-8551873288, date=20240404-08:50:01, tags=unknown)"
level=info ts=2024-04-16T07:02:40.311908351Z caller=main.go:198 msg="namespaces filtering configuration " config="{allow_list=\"\",deny_list=\"\",prometheus_allow_list=\"\",alertmanager_allow_list=\"\",alertmanagerconfig_allow_list=\"\",thanosruler_allow_list=\"\"}"
level=info ts=2024-04-16T07:02:40.407398652Z caller=main.go:227 msg="connection established" cluster-version=v1.28.7-eks-b9c9ed7
level=info ts=2024-04-16T07:02:40.510754118Z caller=operator.go:335 component=prometheus-controller msg="Kubernetes API capabilities" endpointslices=true
level=info ts=2024-04-16T07:02:40.527533967Z caller=operator.go:320 component=prometheusagent-controller msg="Kubernetes API capabilities" endpointslices=true
level=info ts=2024-04-16T07:02:40.704777786Z caller=server.go:298 msg="starting insecure server" address=:8080
level=info ts=2024-04-16T07:02:41.705515824Z caller=operator.go:429 component=prometheusagent-controller msg="successfully synced all caches"
level=info ts=2024-04-16T07:02:41.705725357Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"
level=info ts=2024-04-16T07:02:41.805364487Z caller=operator.go:283 component=thanos-controller msg="successfully synced all caches"
level=info ts=2024-04-16T07:02:42.00562483Z caller=operator.go:313 component=alertmanager-controller msg="successfully synced all caches"
level=info ts=2024-04-16T07:02:42.005642809Z caller=operator.go:392 component=prometheus-controller msg="successfully synced all caches"
level=info ts=2024-04-16T07:02:42.011162612Z caller=operator.go:766 component=prometheus-controller key=prometheus/prometheus msg="sync prometheus"
level=info ts=2024-04-16T07:02:42.151025532Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"
level=info ts=2024-04-16T07:02:42.304841339Z caller=operator.go:766 component=prometheus-controller key=prometheus/prometheus msg="sync prometheus"
level=info ts=2024-04-16T07:03:30.565074028Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"
level=info ts=2024-04-16T07:03:40.005947279Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"
level=info ts=2024-04-16T07:03:40.010506182Z caller=operator.go:766 component=prometheus-controller key=prometheus/prometheus msg="sync prometheus"
level=info ts=2024-04-16T07:17:18.345594596Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"
level=info ts=2024-04-16T07:17:21.582209642Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"
level=info ts=2024-04-16T07:17:22.445630324Z caller=operator.go:563 component=prometheusagent-controller key=prometheus/prometheus-agent msg="sync prometheus"

Anything else?

Before using the operator, I configured the scrape job to use the ServiceAccount token and CA directly with:


      tls_config:
        ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token

I can't do the same with the operator because the ScrapeConfig resource has no such options.
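
For context, a plain Prometheus scrape job of the kind described above might look roughly like the following sketch. The job name is an assumption; the relabeling rules are the snake_case equivalents of the ones in the ScrapeConfig from the Manifests section, and the two file paths are the ones quoted above.

```yaml
scrape_configs:
  - job_name: kubernetes-cadvisor   # hypothetical job name
    scheme: https
    # Token and CA are read from the files mounted into the Prometheus pod.
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      # Copy node labels onto the scraped series.
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      # Send every scrape to the API server instead of the kubelet.
      - target_label: __address__
        replacement: kubernetes.default.svc:443
      # Reach each node's cAdvisor endpoint through the API server proxy.
      - source_labels: [__meta_kubernetes_node_name]
        regex: (.+)
        target_label: __metrics_path__
        replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
```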

@oleksii-kalinin oleksii-kalinin added kind/bug needs-triage Issues that haven't been triaged yet labels Apr 16, 2024
@slashpai
Contributor

You would need to create a service account token Secret. For example, if the Prometheus service account is named prometheus:

apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: prometheus-secret
  annotations:
    kubernetes.io/service-account.name: "prometheus"

Also create a Secret for the TLS config, and use secret selectors in the ScrapeConfig to reference both Secrets.

Example

apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: scrape-config-kubernetes-sd-example
  namespace: default
  labels:
    app.kubernetes.io/name: scrape-config-kubernetes-sd-example
spec:
  scheme: HTTPS
  authorization:
    credentials:
      name: prometheus-secret
      key: token
  tlsConfig:
    ca:
      secret:
        name: default-server
        key: ca.crt
    insecureSkipVerify: true
  kubernetesSDConfigs:
  - role: Node
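
As a side note on permissions: the manifests in the issue bind the service account to cluster-admin. A narrower ClusterRole is usually sufficient for node discovery and for scraping through the API server proxy path. The following is only a sketch; the role name is hypothetical, and the existing ClusterRoleBinding would need its roleRef pointed at this role.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: prometheus-node-scrape   # hypothetical name
rules:
  # Node discovery for kubernetesSDConfigs with role: Node.
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch"]
  # Scraping via /api/v1/nodes/<node>/proxy/... on the API server.
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]
```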

@slashpai slashpai added kind/support and removed needs-triage Issues that haven't been triaged yet kind/bug labels Apr 16, 2024
@oleksii-kalinin
Author

OK, that will probably work; however, it's not the way described in the docs.

@slashpai
Contributor

slashpai commented Apr 16, 2024

Yes, adding a kubernetesSDConfigs example to https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/user-guides/scrapeconfig.md would be helpful.

In the meantime, the examples in https://github.com/slashpai/prometheus-operator-examples/tree/main/scrape_config/kubernetes_sd may be useful.

@slashpai slashpai changed the title server returned HTTP status 403 Forbidden on nodes and cadvisor Add documentation for kubernetesSDConfigs usage Apr 16, 2024
@simonpasquier simonpasquier changed the title Add documentation for kubernetesSDConfigs usage Add documentation for kubernetesSDConfigs usage to scrape node metrics Apr 18, 2024
@simonpasquier simonpasquier changed the title Add documentation for kubernetesSDConfigs usage to scrape node metrics Add documentation for kubernetesSDConfigs usage to scrape node targets Apr 18, 2024
@steadyk

steadyk commented May 7, 2024

Thanks @slashpai for the hint with the API token Secrets!

We tried to switch the Strimzi additional scrape config example to the new ScrapeConfig CR.

In addition to the bearer token, we also used the ca.crt from the API token Secret, so insecureSkipVerify was no longer needed.

One of the resulting ScrapeConfigs now looks like this:

apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: kubernetes-cadvisor
  labels:
    prometheus: prometheus
spec:
  ...
  authorization:
    credentials:
      name: prometheus-secret
      key: token
  ...
  tlsConfig:
    ca:
      secret:
        name: prometheus-secret
        key: ca.crt
  relabelings:
  ...
  metricRelabelings:
  ...
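
For readers who want a complete starting point, the pieces from this thread can be combined into one manifest. This is a sketch only, not the configuration elided above: it reuses the relabelings from the original issue and the Secret references from this comment, and it assumes the token Secret lives in the same namespace as the ScrapeConfig (the namespace shown is a guess). A service-account-token Secret is populated with both `token` and `ca.crt` keys, which is why a single Secret can serve both fields.

```yaml
apiVersion: monitoring.coreos.com/v1alpha1
kind: ScrapeConfig
metadata:
  name: kubernetes-cadvisor
  namespace: prometheus        # assumption: same namespace as prometheus-secret
  labels:
    prometheus: prometheus
spec:
  scheme: HTTPS
  # Bearer token taken from the service-account-token Secret.
  authorization:
    credentials:
      name: prometheus-secret
      key: token
  # Cluster CA from the same Secret, so certificate verification stays enabled.
  tlsConfig:
    ca:
      secret:
        name: prometheus-secret
        key: ca.crt
  kubernetesSDConfigs:
    - role: Node
  relabelings:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    # Route scrapes through the API server proxy, as in the original issue.
    - targetLabel: __address__
      replacement: kubernetes.default.svc:443
    - sourceLabels: [__meta_kubernetes_node_name]
      regex: (.+)
      targetLabel: __metrics_path__
      replacement: /api/v1/nodes/${1}/proxy/metrics/cadvisor
```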

Since we want to avoid long-lived API tokens, we decided to introduce a Kyverno CleanupPolicy, which removes the token Secret on a schedule:

apiVersion: kyverno.io/v2beta1
kind: CleanupPolicy
metadata:
  name: remove-api-token
spec:
  match:
    any:
    - resources:
        kinds:
        - Secret
        names:
        - prometheus-secret
  schedule: "<cron schedule>"

Our ArgoCD will recreate the Secret afterwards.
