Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add multiple conditions with logic for AlertPolicy #218

Open
Simon-Boyer opened this issue Jun 22, 2023 · 4 comments
Open

Add multiple conditions with logic for AlertPolicy #218

Simon-Boyer opened this issue Jun 22, 2023 · 4 comments
Labels
enhancement New feature or request v2alpha1 Incompatible changes proposed for the next major release

Comments

@Simon-Boyer
Copy link

Simon-Boyer commented Jun 22, 2023

Problem to solve

Currently, the specifications only allow for 1 condition to be specified in an AlertPolicy. This can be limiting, particularly when setting up multi-window alerts. There should be some ways to setup AND/OR logics for conditions.

Proposal

I think the easiest way to do it is to have an AND logic at the top level, like slogen is doing here: https://github.com/OpenSLO/slogen/blob/main/samples/sumologic/v1/alert-policy/burn-rate.yaml. For me that is the most "natural" assumption and it would also keep slogen in the spec without any change.

To support OR logic (and even more complex AND), we could have and or/and keyword. So it could look something like that:

apiVersion: openslo/v1
kind: AlertPolicy
metadata:
  name: High-Burn-Rate
  displayName: Alert Policy
spec:
  conditions:
    - or:
      - and:
        - kind: AlertCondition
          spec:
            condition:
              kind: burnrate
              threshold: 14.4
              lookbackWindow: -1h
              alertAfter: 5m
        - kind: AlertCondition
          spec:
            condition:
              kind: burnrate
              threshold: 14.4
              lookbackWindow: -1h
              alertAfter: 1m
      - and:
        - kind: AlertCondition
          spec:
            condition:
              kind: burnrate
              threshold: 6
              lookbackWindow: -6h
              alertAfter: 5m
        - kind: AlertCondition
          spec:
            condition:
              kind: burnrate
              threshold: 6
              lookbackWindow: -30m
              alertAfter: 5m

which is roughly equivalent to the expression for the "page" level in the SRE workbook I just linked to:

expr: (
        job:slo_errors_per_request:ratio_rate1h{job="myjob"} > (14.4*0.001)
      and
        job:slo_errors_per_request:ratio_rate5m{job="myjob"} > (14.4*0.001)
      )
    or
      (
        job:slo_errors_per_request:ratio_rate6h{job="myjob"} > (6*0.001)
      and
        job:slo_errors_per_request:ratio_rate30m{job="myjob"} > (6*0.001)
      )
severity: page

Or you could simply have multiple conditions which all needs to be met, in which case you have to option to stay at the top level:

apiVersion: openslo/v1
kind: AlertPolicy
metadata:
  name: High-Burn-Rate
  displayName: Alert Policy
spec:
  conditions:
    - kind: AlertCondition
      spec:
        condition:
          kind: burnrate
          threshold: 14.4
          lookbackWindow: -1h
          alertAfter: 5m
    - kind: AlertCondition
      spec:
        condition:
          kind: burnrate
          threshold: 14.4
          lookbackWindow: -1h
          alertAfter: 1m

This will allow it to be backward compatible with the current v1 spec, will be compatible with what slogen does and will allow for more complex and/or logic expressions.

Further details

Briefly discussed previously here: #54 (comment)

Links / references

https://sre.google/workbook/alerting-on-slos/#6-multiwindow-multi-burn-rate-alerts
https://github.com/OpenSLO/slogen/blob/main/samples/sumologic/v1/alert-policy/burn-rate.yaml

@Simon-Boyer Simon-Boyer added the enhancement New feature or request label Jun 22, 2023
@weyert
Copy link
Collaborator

weyert commented Oct 30, 2023

I think this is a reasonable improvements

@nieomylnieja nieomylnieja added the v2alpha1 Incompatible changes proposed for the next major release label Nov 17, 2023
@nieomylnieja
Copy link
Member

@Simon-Boyer, I think we can proceed with your proposal, before we make anything official, I'd be grateful if you could describe the proposed changes here: https://github.com/OpenSLO/OpenSLO/blob/main/enhancements/v2alpha1.md#datasource :)

@Simon-Boyer
Copy link
Author

Simon-Boyer commented Nov 18, 2023

Sounds good, I'll try to open a PR this week

@Simon-Boyer
Copy link
Author

Simon-Boyer commented Dec 7, 2023

Last weeks were a bit chaotic, finally had some time to write the PR: #240

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request v2alpha1 Incompatible changes proposed for the next major release
Projects
None yet
Development

No branches or pull requests

3 participants