ceph: fixing the queries for alerts 'CephMgrIsAbsent' and 'CephMgrIsMissingReplicas' #8985

Commits on Oct 15, 2021

  1. ceph: fixing the queries for alerts 'CephMgrIsAbsent' and 'CephMgrIsMissingReplicas'
    
    CephMgrIsAbsent
    ----------------
    This alert initially had the following query
    
    absent(up{job="rook-ceph-mgr"})
    
    which fires when the 'up' series is not present, but had two flaws:
      a. it does not fire if 'up' returns a result with a value of ZERO
      b. its result does not carry the labels of the 'up' series, so 'namespace' was missing
    
    When the above query was replaced with the following,
    
    up{job="rook-ceph-mgr"} == 0
    
    the query had the following shortcoming:
      a. whenever the mgr pod is completely down (e.g. 'replicas' is set to
    ZERO and 'mgr' is not coming up), the 'up' query returns no result.
    
    Thus we had to combine both queries to get results in both scenarios.
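    
    A minimal sketch of such a combined expression, assuming the two queries
    above are simply joined with PromQL's 'or' operator (the exact expression
    in the changed rule file may differ, e.g. in how labels are added):
    
    ```promql
    # Fires when the mgr target is scraped but reports itself as down ...
    up{job="rook-ceph-mgr"} == 0
    # ... or when no 'up' series exists for the job at all.
    or absent(up{job="rook-ceph-mgr"})
    ```
    
    Note that the 'absent()' branch only carries labels taken from the
    equality matchers in its selector (here 'job'), so a 'namespace' label
    would still have to be added separately for that case.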
    
    CephMgrIsMissingReplicas
    ------------------------
    This query previously was,
    
    sum(up{job="rook-ceph-mgr"}) < 1
    
    It had the same structure as the above (Absent) query, but its
    intention was to check the number of 'replicas' for ceph mgr.
    It is now changed to a kube query, which handles the replica count.
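    
    For illustration, a replica check of this kind could look like the sketch
    below. 'kube_deployment_spec_replicas' is a kube-state-metrics metric; the
    deployment name pattern and the grouping by 'namespace' are assumptions,
    not necessarily what the rule file uses:
    
    ```promql
    # Sum the desired replicas of the mgr deployment(s) per namespace
    # and fire when fewer than one replica is requested.
    sum(kube_deployment_spec_replicas{deployment=~"rook-ceph-mgr-.*"}) by (namespace) < 1
    ```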
    
    Signed-off-by: Arun Kumar Mohan <amohan@redhat.com>
    aruniiird committed Oct 15, 2021
    cfa2c2d