Description
The MetricPipeline already supports an input type runtime, which emits metrics about container and pod resource consumption. What is missing are further typical metrics:
from the apiserver about configured resource limits
With these metrics available, basic troubleshooting of Kubernetes workloads, including alerting, can be fulfilled.
Goal
Provide a way to collect a typical set of metrics for basic workload troubleshooting (comparable to the metrics used by the dashboards provided by the kube-prometheus-stack)
Criteria
Typical metrics needed for troubleshooting are collectable:
Pod compute resources
Node resource usage
Volume resource usage
Health of workloads (for example, a stuck deployment)
Namespace-specific metrics can be enabled per namespace (probably independently of non-namespaced resources)
Node- and volume-related metrics can be enabled optionally, independent of workload-related metrics
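To make the criteria above more concrete, here is a minimal sketch of how the requested toggles could interact. It is not the MetricPipeline API: the names RuntimeInput, enabled_groups, and all field names are hypothetical and chosen only for illustration. It also assumes that node metrics are cluster-scoped while pod, workload health, and volume metrics follow the namespace selection.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class RuntimeInput:
    """Hypothetical model of the toggles listed in the criteria (not a real API)."""
    # Workload-related metric groups, collected per namespace.
    pod_metrics: bool = True
    workload_health: bool = True
    # Optional groups that can be switched independently of workload metrics.
    node_metrics: bool = False
    volume_metrics: bool = False
    # Namespaces to collect namespaced metrics for; an empty set means all.
    namespaces: Set[str] = field(default_factory=set)


def enabled_groups(cfg: RuntimeInput, namespace: Optional[str]) -> Set[str]:
    """Return the metric groups to collect; namespace=None means cluster scope."""
    groups: Set[str] = set()
    if namespace is None:
        # Assumption: node metrics are not affected by the namespace selection.
        if cfg.node_metrics:
            groups.add("node")
        return groups
    if cfg.namespaces and namespace not in cfg.namespaces:
        # Namespace is not selected, so nothing namespaced is collected.
        return groups
    if cfg.pod_metrics:
        groups.add("pod")
    if cfg.workload_health:
        groups.add("workload_health")
    if cfg.volume_metrics:
        groups.add("volume")
    return groups
```

With such a model, a pipeline restricted to one namespace could still collect node metrics, which is the independence the criteria ask for.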
Reasons
The current feature set is a good start but is missing apiserver-related details such as limits, which are needed to get a complete picture for troubleshooting and to define relevant alerts. Furthermore, typical workload-health metrics from the apiserver are missing. Volume and node statistics are also important in daily operations.
Attachments
Release Notes
a-thaler changed the title from "Metric inputs to cover typical workload operations" to "Typical kubernetes workload metrics as telemetry input to enable dashboarding and alerting" on Apr 23, 2024
Description
The MetricPipeline already supports an input type runtime, which emits metrics about container and pod resource consumption. What is missing are further typical metrics, mainly those resulting from the kubeletstatsreceiver and the k8sclusterreceiver.