ceph: fixing the queries for alerts 'CephMgrIsAbsent' and 'CephMgrIsMissingReplicas' #8985
CephMgrIsAbsent
Originally, this alert used the query
absent(up{job="rook-ceph-mgr"})
which fires when the 'up' series is not present, but it had two flaws:
a. it does not fire if 'up' returns a result with a value of ZERO
b. its result does not carry the metric's own labels, so 'namespace' was missing
When the above query was replaced with
up{job="rook-ceph-mgr"} == 0
the new query had the following shortcoming:
a. whenever the mgr pod is completely down (for example, 'replicas' is set to ZERO
and 'mgr' is not coming up), the 'up' query returns no result at all.
Thus the two queries had to be combined to get results in both scenarios.
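As a rough illustration of the combination described above, the two conditions can be joined with PromQL's `or` operator. This is a minimal sketch assembled from the two queries quoted in this message, not necessarily the exact expression in the changed rule file. Note that the `absent()` branch still returns only the labels named in its selector (here `job`), so any handling of the missing 'namespace' label is assumed to happen elsewhere in the rule.

```promql
# Sketch only: fires when the mgr target reports down (value 0)
# or when no 'up' series exists for the job at all.
up{job="rook-ceph-mgr"} == 0
  or
absent(up{job="rook-ceph-mgr"})
```

Comparison operators bind more tightly than `or`, so the first branch is evaluated as `(up{...} == 0)` before the union is taken.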
CephMgrIsMissingReplicas
Previously, this query was
sum(up{job="rook-ceph-mgr"}) < 1
It had the same structure as the above (Absent) query, but its
intention was to check the number of 'replicas' for ceph mgr.
It is now changed to a kube query that checks the replica count.
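A replica-based check of this kind might look like the sketch below. It assumes that kube-state-metrics is being scraped (it exposes `kube_deployment_spec_replicas`) and that the mgr deployments match the name pattern `rook-ceph-mgr-.*`. Both the metric choice and the pattern are assumptions for illustration and may differ from the expression actually used in this change.

```promql
# Sketch only: fires when the configured replica count across
# mgr deployments in a namespace drops below one.
# Assumes kube-state-metrics and a 'rook-ceph-mgr-*' deployment naming scheme.
sum(kube_deployment_spec_replicas{deployment=~"rook-ceph-mgr-.*"}) by (namespace) < 1
```

Because this reads the deployment spec rather than the 'up' series, it can still return a result when 'replicas' is set to ZERO and no mgr pod exists to be scraped.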
Signed-off-by: Arun Kumar Mohan <amohan@redhat.com>
Description of your changes:
Which issue is resolved by this Pull Request:
Resolves #
Checklist:
make codegen has been run to update object specifications, if necessary.