SLO Aggregation #41

Open
ian-bartholomew opened this issue Jul 15, 2021 · 3 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@ian-bartholomew
Contributor

Problem to solve

Currently, a single user journey can be covered by multiple SLOs, with no single SLO that measures the overall user experience.

Proposal

Add the ability to aggregate SLOs and roll them up into a single SLO.

This is similar to Keptn's Quality Gates, as proposed by Andreas Grabner. More info here: https://www.youtube.com/watch?v=bMnMkOKVzdg

Further details

Key features:

  • Performance Signature
  • Synthetic SLI from multiple SLOs
  • Key SLOs
    • All will fail if this fails
  • Weighted
    • Total weight is the aggregation/sum of all weights
  • Performance testing
  • Regression Detection
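
A minimal sketch of how the "Weighted" and "Key SLOs" features might combine, assuming each child SLO has already been evaluated to pass or fail. The names `SLOResult`, `aggregate`, and `pass_threshold` are hypothetical and not part of any existing spec; Python is used purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SLOResult:
    """Outcome of one child SLO (hypothetical shape, for illustration only)."""
    name: str
    passed: bool
    weight: float = 1.0   # contribution to the aggregate score
    key: bool = False     # a failing key SLO fails the whole aggregate


def aggregate(results: List[SLOResult], pass_threshold: float = 0.9) -> Tuple[float, bool]:
    """Return (score, passed) for the aggregate SLO.

    score is the weight of passing SLOs divided by the total weight,
    where the total weight is the sum of all child weights.
    """
    total_weight = sum(r.weight for r in results)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")

    achieved = sum(r.weight for r in results if r.passed)
    score = achieved / total_weight

    # Any failing key SLO fails the aggregate regardless of the score.
    key_failed = any(r.key and not r.passed for r in results)
    return score, (not key_failed) and score >= pass_threshold
```

With weights 2, 1, 1 and only the last SLO failing, the score is 3/4; the aggregate passes only if `pass_threshold` is at most 0.75 and no key SLO has failed.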
ian-bartholomew added the enhancement and help wanted labels on Jul 15, 2021
@mmazur
Collaborator

mmazur commented Nov 18, 2021

The simplest usable (to me) syntax for this would be:

spec:
  description: Aggregate SLO
  budgetingMethod: ?
  indicator:
    objectiveMetric:
      source: http://localhost:9090
      queryType: promql
      query: component:latency:slo_ok_5m{component="prod-comp-1"}
  objectives:
  - target: 0.95
  timeWindows:
  - count: 28
    unit: Day
    isRolling: true

The evaluation engine would need to be smart enough to compute this by generating the following Prometheus query and simply storing the result:

avg_over_time(component:latency:slo_ok_5m{component="prod-comp-1"}[28d])
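
As a rough sketch of what that "smart enough" step could look like, the function below wraps the objective query in `avg_over_time` using the rolling time window. The function name and the unit-to-suffix table are assumptions made for illustration (only `Day` appears in the spec above), and the approach only works as written when the query is a plain series selector, since PromQL range selectors cannot be applied directly to arbitrary expressions.

```python
# Hypothetical mapping from timeWindows.unit values to PromQL duration suffixes.
# Only "Day" appears in the example spec; the other entries are assumed.
UNIT_SUFFIX = {"Minute": "m", "Hour": "h", "Day": "d", "Week": "w"}


def rolling_avg_query(query: str, count: int, unit: str) -> str:
    """Wrap a series selector in avg_over_time over the rolling window."""
    if count <= 0:
        raise ValueError("count must be positive")
    try:
        suffix = UNIT_SUFFIX[unit]
    except KeyError:
        raise ValueError(f"unsupported time window unit: {unit!r}")
    return f"avg_over_time({query}[{count}{suffix}])"


if __name__ == "__main__":
    q = 'component:latency:slo_ok_5m{component="prod-comp-1"}'
    print(rolling_avg_query(q, 28, "Day"))
```

Run against the spec above (`count: 28`, `unit: Day`), this reproduces the query shown.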

@mmazur
Collaborator

mmazur commented Nov 18, 2021

Hmm, I think I misunderstood the goal of this issue. I'll move over to my own issue :)

@r3code
Contributor

r3code commented Dec 30, 2022

@ian-bartholomew In my sloth.dev definitions I do this by calculating a raw error_query_ratio from the sum of all good events and the sum of all totals across the SLIs.
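
One reading of that approach, sketched in Python rather than PromQL: sum the good-event counts and the total-event counts over every SLI, then derive a single error ratio from the two sums. The function name and the `(good, total)` tuple shape are hypothetical, and returning 0.0 when there is no traffic is an assumption, not something stated above.

```python
from typing import Iterable, Tuple


def combined_error_ratio(slis: Iterable[Tuple[float, float]]) -> float:
    """Combine several SLIs, each given as (good_events, total_events),
    into one error ratio by pooling the event counts."""
    good = 0.0
    total = 0.0
    for g, t in slis:
        good += g
        total += t

    if total == 0:
        # Assumption: no events means nothing failed.
        return 0.0
    return 1.0 - good / total
```

Pooling counts this way implicitly weights each SLI by its traffic volume, which differs from averaging the per-SLI ratios.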
