Expose function controller readiness metric for prometheus-based monitoring #869

Open
1 of 5 tasks
kwiatekus opened this issue Apr 9, 2024 · 1 comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.

kwiatekus commented Apr 9, 2024

Description

Introduce a new metric specifically designed to reflect the readiness status of the serverless function controller.

Acceptance criteria (AC):

  • It should indicate whether the function controller's main reconciliation loop is ready to serve requests (i.e. whether the queue is being served).
  • The metric's update frequency should be independent of the Kubernetes probe frequency configuration (i.e. a separate goroutine with its own ticker).
  • Frequent probing should not degrade function-controller performance; a probe should add an event for the function controller, which serves it with a fast exit. (We have this already: health probing enters the reconciliation loop.)
  • No user misconfiguration (e.g. an invalid function CR or function code) should affect the metric (and disrupt the SLO budget).
  • The metric should be observable over time (via PromQL) so that an observer can model alerting rules based on aggregated time series.

The above criteria are for the basic availability indication.
Think of an additional availability indicator for serverless that could be used to inspect whether every requested function CR was "attempted to be built" and whether those that were successfully built were "attempted to be deployed".

Reasons
Ensure SLO is observable for serverless.
Enable administrators to set up alerting and monitoring based on function controller readiness.

Attachments

kwiatekus added the kind/feature label Apr 9, 2024
kwiatekus changed the title from "Expose Function Controller Readiness Metric for Prometheus-based Monitoring" to "Expose function controller readiness metric for prometheus-based monitoring" Apr 9, 2024
@kwiatekus
Contributor Author

@ebensom Could the avs rules be configured to ping the /readyz endpoint of the function controller (just like it was calling the webhook before)?
