-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
metrics: fix missing k8s rest client metrics #26412
Conversation
K8s rest client metrics, cilium_k8s_client_api_latency_time_seconds and cilium_k8s_client_api_calls_total, are missing because of a conflict with controller-runtime on the metrics registration. controller-runtime(sigs.k8s.io/controller-runtime/pkg/metrics) also registers the client-go rest client metrics on its registry using metrics.Register method in its init function. The metrics.Register can be called only once, and subsequent calls will be ignored. cilium-agent doesn't use controller-runtime, but there's an indirect dependency from github.com/cilium/cilium/daemon to github.com/cilium/cilium/operator/metrics package. (operator uses controller-runtime in its Gateway API implementation) For example, github.com/cilium/cilium/daemon/cmd -> github.com/cilium/cilium/pkg/ipam -> github.com/cilium/cilium/pkg/ipam/metrics -> github.com/cilium/cilium/operator/metrics The init function of sigs.k8s.io/controller-runtime/pkg/metrics will be called because of this indirect dependency. This commit fixes this conflict by registering the metrics directly as well as calling metrics.Register. Fixes: cilium#25610 Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp>
Thanks! Since we have no codeowners on release branches, I'll tag some reviewers manually. Adding @aanm as the original author and @dylandreimerink for recent context around metrics. Please correct as necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this is a v1.13 backport it doesn't conflict with the recent metrics changes on main. LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR. @ysksuzuki I understand that this issue doesn't happen in main, but it can happen if the order of imports changes again. Maybe would be worth forward port this change to the main branch?
/test-backport-1.13 Job 'Cilium-PR-K8s-1.24-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.24-kernel-4.19/62/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. Job 'Cilium-PR-K8s-1.22-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.22-kernel-4.19/99/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. Job 'Cilium-PR-K8s-1.21-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.21-kernel-4.19/60/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. |
Right. I will raise a PR to forward port it. Also, I want to try cleaning up the dependency from ipam to operator package in #26122. There's no way to override the workqueue metrics if controller-runtime wins. |
/test-1.21-4.19 |
/test-1.22-4.19 |
/test-1.24-4.19 |
CI green except for, inexplicably, waiting for some |
K8s rest client metrics,
cilium_k8s_client_api_latency_time_seconds
andcilium_k8s_client_api_calls_total
, are missing because of a conflict with controller-runtime on the metrics registration.(This issue of missing metrics occurs only in 1.13. See #25555 (comment))
controller-runtime(sigs.k8s.io/controller-runtime/pkg/metrics) also registers the client-go rest client metrics on its registry using metrics.Register method in its init function. The metrics.Register can be called only once, and subsequent calls will be ignored.
cilium-agent doesn't use controller-runtime, but there's an indirect dependency from github.com/cilium/cilium/daemon to github.com/cilium/cilium/operator/metrics package. (operator uses controller-runtime in its Gateway API implementation) For example,
github.com/cilium/cilium/daemon/cmd
-> github.com/cilium/cilium/pkg/ipam
-> github.com/cilium/cilium/pkg/ipam/metrics
-> github.com/cilium/cilium/operator/metrics
The init function of sigs.k8s.io/controller-runtime/pkg/metrics will be called because of this indirect dependency.
This commit fixes this conflict by registering the metrics directly as well as calling metrics.Register.
Fixes: #25610
Please ensure your pull request adheres to the following guidelines:
description and a
Fixes: #XXX
line if the commit addresses a particularGitHub issue.
Fixes: <commit-id>
tag, thenplease add the commit author[s] as reviewer[s] to this issue.