
Enable customization of prometheus metrics labels created by rook operator #9618

Closed
svolland-csgroup opened this issue Jan 20, 2022 · 9 comments

Comments

@svolland-csgroup

svolland-csgroup commented Jan 20, 2022

Is this a bug report or feature request?

  • Feature Request

What should the feature do:
The operator currently supports an undocumented (?) monitoring label rook.io/managedBy, which is used to add a managedBy label to the prometheus metrics.

spec:
  labels:
    monitoring:
      rook.io/managedBy: some-value

The operator adds a relabeling to the ServiceMonitor for this specific label; this is implemented in the applyMonitoringLabels function in mgr.go.
The new feature should make it possible to define a list of fully customizable labels to add to the metrics (instead of a single label with a fixed name).
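
For reference, the relabeling the operator currently generates for this label in the mgr ServiceMonitor is roughly of the following shape. This is a hedged sketch written against the Prometheus Operator relabeling API, not verbatim operator output, and the port name is an assumption:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rook-ceph-mgr
spec:
  endpoints:
    - port: http-metrics  # assumed metrics port name
      # Approximate shape of what applyMonitoringLabels adds for
      # rook.io/managedBy: a static "replace" relabeling that attaches
      # managedBy=<value> to every scraped series.
      relabelings:
        - action: replace
          targetLabel: managedBy
          replacement: some-value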

This could also be implemented by leveraging the targetLabels feature of the ServiceMonitor. In this case the labels already specified in the cluster spec for the mgr could also be applied to the Service spec of the mgr daemon (they are currently only applied to the mgr deployment/pods). Then one could pick these labels by configuring the list of targetLabels.

What is use case behind this feature:
The use case is the need to add extra labels to prometheus metrics. This is particularly useful when the operator manages more than one cluster, to distinguish the metrics using labels with more functional meaning than the currently available labels that might serve this purpose (e.g. namespace, instance or pod).

@travisn
Member

travisn commented Jan 20, 2022

@svolland-csgroup The monitoring labels are currently only applied if the label is rook.io/managedBy, but I don't recall why that was so strict with #8355. Would it solve the issue if we just allow any label to be applied instead of only that specific label? That makes sense to me for most flexibility, unless you're saying another change is also needed for targetLabels.

@anmolsachan Do you recall why only rook.io/managedBy was allowed?

@svolland-csgroup
Author

svolland-csgroup commented Jan 21, 2022

Applying any label would solve the issue and would only require small code changes.

Leveraging the targetLabels feature would be a nicer solution because it relies on the prometheus operator to generate the appropriate configuration (instead of directly generating custom prometheus relabeling configurations), but it requires more changes:

  • the monitoring labels must be added to the mgr service resource (targetLabels picks up labels from the service)
  • then, depending on the degree of customization wanted:
    • either generate a targetLabels in the mgr ServiceMonitor with all label keys specified in the labels.monitoring config
    • or only add a configurable subset of these label keys (that would require a specific configuration section in the cluster spec, or using a specific prefix for these labels e.g. metrics/my-key=my-value)

Care must be taken that the labels specified in labels.monitoring do not overwrite those already existing in the service (i.e. app, ceph_daemon_id, rook_cluster); see the example below.

Configuration:

spec:
  labels:
    monitoring:
      my-monitoring-label: my-value

generated mgr service:

apiVersion: v1
kind: Service
metadata:
  labels:
    app: rook-ceph-mgr
    ceph_daemon_id: a
    rook_cluster: my-cluster
    my-monitoring-label: my-value
  name: rook-ceph-mgr
[...]

generated ServiceMonitor:

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rook-ceph-mgr
spec:
[...]
  targetLabels:
    - my-monitoring-label

@travisn
Member

travisn commented Jan 21, 2022

How will it help the ServiceMonitor if the targetLabels support a custom label? Don't the labels already on the rook-ceph-mgr service uniquely identify the service that prometheus needs to scrape? Or what is different about the custom targetLabels?

@travisn
Member

travisn commented Jan 21, 2022

Ah, it must be as you said in the initial post that the targetLabels will help in this case:
"This is particularly useful when the operator is managing more than one cluster, to distinguish the metrics by using more functional label meanings than the currently available labels that might serve this purpose (e.g. namespace, instance or pod)"

@anmolsachan
Contributor

anmolsachan commented Jan 25, 2022

@svolland-csgroup The monitoring labels are currently only applied if the label is rook.io/managedBy, but I don't recall why that was so strict with #8355. Would it solve the issue if we just allow any label to be applied instead of only that specific label? That makes sense to me for most flexibility, unless you're saying another change is also needed for targetLabels.

@anmolsachan Do you recall why only rook.io/managedBy was allowed?

Hey Travis! Apologies for the late reply on this.

There was no strictness for using rook.io/managedBy, but there was only one use case at that moment.

Use case: "As a Kubernetes Admin I should be able to identify the metrics coming from the cephclusters uniquely, when on the same K8s cluster I create multiple cephclusters at the same time".

Adding an extra managedBy label in the Service Monitor adds a label to every metric exported by the Prometheus exporter. Also, it was optional because the above-stated use case might not be true.
In our case, we were setting the managedBy value to the name of the higher-level managing storageCluster CR to segregate cluster metrics and build relationships in promQL, but it's up to the user to choose the value.
It also covered cephclusters running in regular and external mode side by side.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@BlaineEXE
Member

I believe this is made irrelevant by #9837. Closing this for now. We have the option to reopen if it is still relevant with the changed method of deploying prometheus metrics rules.

@holmesb

holmesb commented Apr 12, 2023

This issue is still relevant @BlaineEXE. In pretty much every widely used helm chart it is possible to define serviceMonitor relabelings and metricsRelabelings. These need to be added under monitoring in the helm values.yaml, and then rook needs to add them to the serviceMonitor resource.

Another use case: without relabeling, the instance label in ceph metrics is in many cases a non-identifiable IP address instead of a meaningful host name, so e.g. there is no association between an OSD and the node it is on. If we could relabel, we could rewrite instance from a metadata label containing the hostname. See the API doc for serviceMonitor.spec.endpoints, where relabelings & metricsRelabelings are defined.
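
For illustration, a hedged sketch of what such values could look like, written against the Prometheus Operator relabeling API. The monitoring.relabelings key in values.yaml is hypothetical; it stands for whatever the rook chart would pass through to the generated ServiceMonitor:

# Hypothetical values.yaml snippet: a monitoring.relabelings key like this
# does not exist in the rook-ceph chart today; it illustrates the request.
monitoring:
  enabled: true
  relabelings:
    # Rewrite the instance label from the node name exposed by Kubernetes
    # service discovery instead of the (non-identifiable) pod IP.
    - sourceLabels: [__meta_kubernetes_pod_node_name]
      targetLabel: instance
      action: replace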

@BlaineEXE
Member

@holmesb I would consider this a new feature request to the current helm chart.

@BlaineEXE BlaineEXE changed the title Enable further customization of prometheus metrics labels Enable customization of prometheus metrics labels created by rook operator Apr 12, 2023