[Bug]: strimzi.resources metric is missing in new unidirectional topic operator #9802
Comments
I seem to have it:
|
Do you have any topics provisioned? We are seeing that this metric is no longer populated per topic as before. We use the metric to check for status != 1, together with the reason label, to monitor reconcile errors for topics. I will try to provide more data next week. |
Ahh, ok. No, I do not have the per-topic metrics there. Just the counter. Not sure we want to keep these detailed metrics as they are hard to manage. But I guess that can be discussed when the issue is triaged. |
I see your concern. At a minimum, we need to know whether there are any reconciliation issues, and that is really hard to monitor with this feature removed. Without the metric and its "reason" label, we would need to extract the reason from logs, which would be a pain. Perhaps it could be a configurable option? |
Triaged on 21.3.2024: @fvaleri is going to take a look at this one. |
Hi @cthtrifork, thanks for raising this. It was decided to not provide this metric with UTO because it does not scale well (it's an additional metric for each managed topic). Additionally, we don't have anything similar for the other operators.
Why would you want to extract the reason from logs? My suggestion is to leverage the KafkaTopic (KT) status. You can use a command similar to this one:
$ kubectl get kt -o custom-columns=TOPIC:.metadata.name,REASON:.status.conditions[0].reason,MESSAGE:.status.conditions[0].message,READY:.status.conditions[0].status | grep False
t1 NotSupported Replication factor change not supported, but required for partitions [] False
Personally, I don't like the idea because metrics are supposed to track the system behavior and performance, not the state of every single managed resource. The UTO has optional metrics to track internal operations that you can use for performance tests or troubleshooting, but they are aggregated. That said, let's see what others think. |
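For anyone who wants to script the status check above rather than grep the table output, here is a minimal sketch in Python. It is not part of Strimzi. It assumes that `kubectl get kt -o json` returns the usual Kubernetes list shape, and that each KafkaTopic condition carries `type`, `status`, `reason`, and `message` fields with a condition of type `Ready`. The function names and the `NoReadyCondition` placeholder are made up for illustration. Unlike the `conditions[0]` one-liner, it looks the condition up by type instead of by position.

```python
import json
import subprocess


def not_ready_topics(kt_list):
    """Return (name, reason, message) for each KafkaTopic that is not Ready.

    kt_list is the parsed output of `kubectl get kt -o json`.
    """
    failing = []
    for item in kt_list.get("items", []):
        name = item.get("metadata", {}).get("name", "<unknown>")
        conditions = item.get("status", {}).get("conditions", [])
        # Assumption: readiness is reported in a condition whose type is "Ready".
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None:
            # "NoReadyCondition" is a hypothetical placeholder, not a Strimzi reason.
            failing.append((name, "NoReadyCondition", ""))
        elif ready.get("status") != "True":
            failing.append((name, ready.get("reason", ""), ready.get("message", "")))
    return failing


def fetch_topics(namespace=None):
    """Run kubectl and parse its JSON output (requires cluster access)."""
    cmd = ["kubectl", "get", "kt", "-o", "json"]
    if namespace:
        cmd += ["-n", namespace]
    out = subprocess.run(cmd, check=True, capture_output=True, text=True).stdout
    return json.loads(out)


if __name__ == "__main__":
    for name, reason, message in not_ready_topics(fetch_topics()):
        print(f"{name}\t{reason}\t{message}")
```

Run periodically, a script like this could feed an alert without adding a per-topic metric to the operator, although it polls the Kubernetes API instead of being scraped by Prometheus.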
Hey, |
I did not know that |
So I agree with @fvaleri to not provide these metrics because of the scaling but I also think that a contribution to Strimzi (maybe in the examples folder?) to provide a configuration to use |
We can have a dedicated improvement issue or PR if you think the |
Yes, I will close this bug report. Thanks for assisting. We will look at kube-state-metrics! |
Bug Description
It is documented here:
https://github.com/strimzi/proposals/blob/main/051-unidirectional-topic-operator.md#metrics
But it does not seem to be carried over from the old operator.
After upgrading, we no longer see a strimzi_resource_state metric for each topic, as we did before.
Steps to reproduce
No response
Expected behavior
No response
Strimzi version
0.39.0
Kubernetes version
Kubernetes 1.27.7
Installation method
Yaml files
Infrastructure
Azure AKS
Configuration files and logs
No response
Additional context
No response