Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cilium metrics list: revert "prometheus/client_golang" #19496

Merged
merged 2 commits into from
Apr 29, 2022

Conversation

ti-mo
Copy link
Contributor

@ti-mo ti-mo commented Apr 20, 2022

This reverts commit d8e3f28.

Extra Go metrics were added to the Prometheus client, some having NaN values, breaking json marshaling on the server side of cilium metrics list:

msg="Cilium API handler panicked" panic_message="json: unsupported value: NaN" url=/v1/metrics/ ...

Fixes: #19425

Revert Prometheus client to fix 'cilium metrics list'

@ti-mo ti-mo requested a review from a team as a code owner April 20, 2022 07:27
@maintainer-s-little-helper
Copy link

Commit 3c6a77c2aca9cc5575f3e89b066e2af383f62b54 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@ti-mo ti-mo requested a review from rolinh April 20, 2022 07:27
@maintainer-s-little-helper maintainer-s-little-helper bot added dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels Apr 20, 2022
@ti-mo ti-mo force-pushed the tb/revert-prom-client-1.12 branch from 3c6a77c to ff01139 Compare April 20, 2022 07:28
@maintainer-s-little-helper
Copy link

Commit 3c6a77c2aca9cc5575f3e89b066e2af383f62b54 does not contain "Signed-off-by".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

Copy link
Member

@tklauser tklauser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably makes sense to ignore github.com/prometheus/client_golang in the dependabot config until a new version with a fix is released. So the bot won't attempt to update the dependency again.

ignore:
# using a cilium-specific fork
- dependency-name: "github.com/miekg/dns"
# k8s dependencies will be updated manually along with tests
- dependency-name: "k8s.io/*"
- dependency-name: "sigs.k8s.io/*"
# cloud provider SDKs are updated too frequently, update them manually
- dependency-name: "github.com/aliyun/alibaba-cloud-sdk-go"
- dependency-name: "github.com/aws/*"
- dependency-name: "github.com/Azure/*"

@ti-mo ti-mo requested a review from a team as a code owner April 20, 2022 07:43
@ti-mo ti-mo requested a review from qmonnet April 20, 2022 07:43
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Apr 20, 2022
@ti-mo ti-mo added the release-note/bug This PR fixes an issue in a previous release of Cilium. label Apr 20, 2022
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Apr 20, 2022
Copy link
Member

@rolinh rolinh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're at it and given that dependabot ignores this dep for now, could we bump from v1.11.0 to v1.11.1 to pull in the security fix?

@ti-mo ti-mo force-pushed the tb/revert-prom-client-1.12 branch from 5302f24 to db071b8 Compare April 20, 2022 11:40
@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 20, 2022

@rolinh Good point, done!

@ti-mo ti-mo force-pushed the tb/revert-prom-client-1.12 branch from db071b8 to be197d4 Compare April 20, 2022 11:59
@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 20, 2022

/test

Job 'Cilium-PR-K8s-GKE' hit: #17628 (95.15% similarity)

@joestringer joestringer added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Apr 20, 2022
@ti-mo ti-mo force-pushed the tb/revert-prom-client-1.12 branch from be197d4 to 7315b93 Compare April 21, 2022 12:00
@ti-mo ti-mo removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Apr 21, 2022
@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 21, 2022

/test

Job 'Cilium-PR-K8s-1.23-kernel-net-next' hit: #18895 (93.74% similarity)

@ti-mo ti-mo force-pushed the tb/revert-prom-client-1.12 branch from 7315b93 to c82d92b Compare April 22, 2022 11:34
@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 22, 2022

/test

This reverts commit d8e3f28.

---

See cilium#19425.

Extra Go metrics were added to the Prometheus client, some having
NaN values, breaking json marshaling on the server side of `cilium
metrics list`:

msg="Cilium API handler panicked" panic_message="json: unsupported
value: NaN" url=/v1/metrics/ ...

Signed-off-by: Timo Beckers <timo@isovalent.com>
Signed-off-by: Timo Beckers <timo@isovalent.com>
@ti-mo ti-mo force-pushed the tb/revert-prom-client-1.12 branch from c82d92b to 187f1c3 Compare April 27, 2022 08:31
@ti-mo
Copy link
Contributor Author

ti-mo commented Apr 27, 2022

/test

Job 'Cilium-PR-K8s-GKE' failed:

Click to show.

Test Name

K8sHealthTest cilium-health Checks status between nodes

Failure Output

FAIL: Kubernetes DNS did not become ready in time

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-GKE so I can create one.

@joestringer
Copy link
Member

This seems like a straightforward revert, approvals are in. The CI runs below are hitting issues, the kafka one looks new to me but I don't understand how the PR could be related:

https://jenkins.cilium.io/job/Cilium-PR-K8s-GKE/8303/
https://jenkins.cilium.io/job/Cilium-PR-Runtime-net-next/2030/

Please do consider filing issues or referencing existing issues if these are flakes we are not yet tracking. However, given this is resolving a known issue due to the dependency bump, and reverting back to the previous version should be safe, I think this is good to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CI: ci-external-workload: Cilium API handler panicked
6 participants