Description:
Prometheus virtual host metrics are being assigned incorrectly: in some cases the wrong host is attached to the wrong vhost.
I am not sure what the correct behaviour is. In some cases we just get metrics like envoy_vhost_vcluster_upstream_rq_retry, and in the current case we get vhost.<virtual host name>.vcluster.<virtual cluster name>, which is what the Envoy docs describe, but with the wrong virtual host name attached to the wrong virtual cluster.
A. The envoy_vhost_vcluster_upstream_rq_retry metrics that get emitted have both the envoy_virtual_cluster and envoy_virtual_host labels.
B. The vhost.<virtual host name>.vcluster.<virtual cluster name> metrics only have the envoy_virtual_host label attached and no cluster label.
I hope A is the correct way, as it is easier to build dashboards from.
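To make the A/B distinction easier to check against a scrape, here is a minimal Python sketch that classifies a single line of Prometheus text output. The function name classify and the rule used for shape B (the metric name contains _vcluster_ and the envoy_virtual_cluster label is missing) are my own assumptions for illustration. They are not taken from Envoy or Envoy Gateway.

```python
import re

# One sample line of Prometheus text exposition: name{labels} value
SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def classify(line):
    """Return "A", "B", or None for one exposition line (hypothetical helper)."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or a # TYPE / # HELP comment
    m = SAMPLE.match(line)
    if m is None:
        return None
    name = m.group("name")
    if not name.startswith("envoy_vhost_"):
        return None  # not a virtual host metric
    labels = dict(LABEL.findall(m.group("labels") or ""))
    has_vhost = "envoy_virtual_host" in labels
    has_vcluster = "envoy_virtual_cluster" in labels
    # Shape A: generic metric name, both names carried as labels.
    if name.startswith("envoy_vhost_vcluster_") and has_vhost and has_vcluster:
        return "A"
    # Shape B (assumed rule): names baked into the metric name, no cluster label.
    if "_vcluster_" in name and has_vhost and not has_vcluster:
        return "B"
    return None
```

Running classify over each line of a scrape and counting the results should show whether a given proxy emits shape A, shape B, or a mix of both.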
It is hard to pinpoint what is going on, and how it picks the names seems to be random. For example, I got envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry{envoy_virtual_host="api-gateway/ab/ab"} 0
for the metric name.
Metrics dump
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry counter
envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry_limit_exceeded counter
envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry_limit_exceeded{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry_overflow counter
envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry_overflow{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry_success counter
envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_retry_success{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_timeout counter
envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_timeout{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_total counter
envoy_vhost_foo_com_ab_foo_com_vcluster_ab_foo_com_upstream_rq_total{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry counter
envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry_limit_exceeded counter
envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry_limit_exceeded{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry_overflow counter
envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry_overflow{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry_success counter
envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_retry_success{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_timeout counter
envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_timeout{envoy_virtual_host="api-gateway/ab/ab"} 0
# TYPE envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_total counter
envoy_vhost_foo_com_ab_foo_com_vcluster_other_upstream_rq_total{envoy_virtual_host="api-gateway/ab/ab"} 0
From http://localhost:19000/stats: vhost.my-ns/foo.foo-public
I assume the same applies to HTTPRoute names as well. I have not looked, but I assume they need to be escaped like we do for hostnames?
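A rough Python sketch of why an unescaped dot in a name could matter, since stat names are dot-delimited. The pattern below is illustrative only and is not Envoy's real tag-extraction regex. It just follows the documented vhost.<virtual host name>.vcluster.<virtual cluster name> shape and assumes neither name contains a dot. The escape helper, its dot-to-underscore rule, and the full stat name in the usage note are hypothetical too.

```python
import re

# Illustrative pattern only, NOT Envoy's actual extractor.
# Assumes the virtual host and virtual cluster names contain no dots.
PATTERN = re.compile(
    r"^vhost\.(?P<vhost>[^.]+)\.vcluster\.(?P<vcluster>[^.]+)\.(?P<stat>\w+)$"
)


def parse(stat_name):
    """Split a stat name into vhost, vcluster and stat, or return None."""
    m = PATTERN.match(stat_name)
    return m.groupdict() if m else None


def escape(name):
    """Hypothetical escaping: replace dots so the name stays one segment."""
    return name.replace(".", "_")


def build(vhost, vcluster, stat):
    """Assemble a stat name in the documented shape."""
    return f"vhost.{vhost}.vcluster.{vcluster}.{stat}"
```

With a name like my-ns/foo.foo-public, parse(build("my-ns/foo.foo-public", "other", "upstream_rq_total")) returns None because the dot splits the name into two segments, while the same call with escape("my-ns/foo.foo-public") returns the vhost as my-ns/foo_foo-public and the vcluster as other.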
Repro steps:
Take an example:
It looks like it is merging the virtual clusters.
Environment:
Kubernetes 1.29.0
Envoy Gateway 1.0.1 with Merge Gateways