Define SLO as an Opentelemetry CRD #2945

Open
melchiormoulin opened this issue May 10, 2024 · 3 comments
Labels
area:collector (Issues for deploying collector), enhancement (New feature or request), needs-info

Comments

@melchiormoulin

Component(s)

No response

Is your feature request related to a problem? Please describe.

Many observability vendors use Terraform to define SLOs, but it is inconvenient to deploy the application with a Helm chart on one side and define its SLOs in Terraform on the other. A standard OpenTelemetry CRD for defining SLOs would avoid a lot of copy-pasting and remove the need for Terraform knowledge.

Describe the solution you'd like

Being able to define a CRD in order to deploy SLOs. This could be done by leveraging https://github.com/google/slo-generator/blob/master/samples/prometheus/slo_prom_metrics_availability_ratio.yaml or by creating a new standard.

Describe alternatives you've considered

Using OpenSLO is an alternative, for example https://github.com/OpenSLO/OpenSLO/blob/main/examples/budgeting-method/occurences-slo.yaml, but I believe having the telemetry and the SLO defined in the same standard would be a plus.

Additional context

No response

@melchiormoulin added the enhancement and needs triage labels on May 10, 2024
@jaronoff97
Contributor

Thanks for opening this issue! I'm not sure I entirely understand the use case here. The collector doesn't do any metrics querying, and for an SLO to work effectively you usually need some amount of querying or longer-term aggregation, which is out of scope for the collector. What would you imagine the CRD would end up actually doing here? Define an alert in a vendor's system? Define some pipeline rules for the collector?

@jaronoff97 added the area:collector label and removed the needs triage label on May 16, 2024
@melchiormoulin
Author

Thank you for your questions.
It would be more for the operator than for the collector.
The goal would be to first agree on a standard manifest that defines an SLO as a Kubernetes CRD.
Each vendor could then implement it, both for computing the SLO and for burn-rate alerting.

For example, when the custom resource is created, the operator would create the SLO on the vendor side along with the corresponding alerting.
Some vendors have already started to implement this: https://github.com/DataDog/datadog-operator/blob/main/examples/datadogslo/metric-example.yaml
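
To make "computing the SLO and burn-rate alerting" concrete, here is a minimal sketch of the arithmetic a vendor backend would have to perform for each such resource. Everything named here is an assumption for illustration: `SloSpec` is a hypothetical stand-in for the fields a standard manifest might carry (it is not an existing CRD schema), and the `14.4` threshold is only a commonly used fast-burn value for a 30-day window, not something this proposal prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloSpec:
    """Hypothetical, vendor-neutral SLO definition (illustrative only)."""
    name: str
    target: float      # e.g. 0.999 means 99.9% of events must be good
    window_days: int   # compliance window, e.g. 30


def error_budget(spec: SloSpec) -> float:
    """Fraction of events that may fail over the window."""
    return 1.0 - spec.target


def burn_rate(spec: SloSpec, good_events: int, total_events: int) -> float:
    """Observed error ratio divided by the error budget.

    A value of 1.0 means the budget would be used up exactly at the
    end of the window if the current error ratio persisted.
    """
    if total_events == 0:
        return 0.0  # no traffic, nothing burned
    error_ratio = 1.0 - good_events / total_events
    return error_ratio / error_budget(spec)


def should_alert(spec: SloSpec, good_events: int, total_events: int,
                 threshold: float = 14.4) -> bool:
    """Fire when the budget is burning faster than the (assumed) threshold."""
    return burn_rate(spec, good_events, total_events) >= threshold


if __name__ == "__main__":
    spec = SloSpec(name="checkout-availability", target=0.999, window_days=30)
    print(burn_rate(spec, good_events=9_900, total_events=10_000))
    print(should_alert(spec, good_events=9_800, total_events=10_000))
```

The event counts would come from querying stored telemetry over some lookback period, which is the part each vendor would have to implement on its own backend.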

@jaronoff97
Contributor

Unfortunately, I do think this is out of scope for our project and is more what OpenSLO is hoping to accomplish. This would require us to interface directly with vendors specifically for telemetry querying purposes. That's very different from the aim of our project, which is to simplify the experience of running vendor-neutral OpenTelemetry software in Kubernetes. Is there a reason OpenSLO isn't sufficient for what you're looking for? You mention in your issue that having the SLO and telemetry defined close to each other is a benefit, but it's unclear why, and what's stopping you from defining a collector CR and an OpenSLO resource side by side.


3 participants