HPA metric gets stuck at a random value and does not scale down after reaching the max replica count #597

Open
Naveen-oops opened this issue Aug 7, 2023 · 12 comments
Labels
question Further information is requested

Comments

@Naveen-oops

What happened?

I use two custom metrics, A and B, in my HPA. A is a gauge-based metric called SLA Metric, while B is a count-based metric that tracks failed requests with HTTP status code 502 or 503 from Istio. Both metrics are scraped by Prometheus.

To expose custom metrics to the HPA, we use Kube Metrics Adapter (link). When the application load increases, the value of the SLA Metric also increases, and the pods scale up until they reach the maximum replica count, as expected.
However, when the load dissipates, the pods never scale down. Even though the SLA Metric's value in Prometheus is below the target, the HPA description still shows a stale value, which can be either above or below the target.

One possible reason is that metric B, which relies on Istio requests, shows up as <unknown>: since there have been no failed requests with 502 or 503 status codes, the Prometheus query returns an empty result.
We noticed this behavior after upgrading Kubernetes from version 1.21 to 1.24, changing the HPA API version from autoscaling/v2beta2 to autoscaling/v2, and upgrading kube-metrics-adapter from v0.1.16 to v0.1.19.

kubectl describe hpa my-hpa

Name:                                                          my-hpa
Namespace:                                                     namespace
Labels:                                                        app.kubernetes.io/managed-by=Helm
Annotations:                                                   meta.helm.sh/release-name: my-pod
                                                               meta.helm.sh/release-namespace: default
                                                               metric-config.object.avg-sla-breach.prometheus/query:
                                                                 avg(
                                                                  avg_over_time(
                                                                     is_sla_breach{
                                                                       app="my-pod",
                                                                       canary="false"
                                                                     }[10m]
                                                                  )
                                                                 )
                                                               metric-config.object.istio-requests-total.prometheus/per-replica: true
                                                               metric-config.object.istio-requests-total.prometheus/query:
                                                                 sum(
                                                                   rate(
                                                                     istio_requests_total{
                                                                       response_code=~"502|503",
                                                                       destination_service="my-pod.namespace.svc.cluster.local"
                                                                     }[1m]
                                                                   )
                                                                 ) /
                                                                 count(
                                                                   count(
                                                                     container_memory_usage_bytes{
                                                                       namespace="namespace",
                                                                       pod=~"my-pod.*"
                                                                     }
                                                                   ) by (pod)
                                                                 )
CreationTimestamp:                                             Wed, 12 Jul 2023 17:52:21 +0530
Reference:                                                     Deployment/my-pod
Metrics:                                                       ( current / target )
  "istio-requests-total" on Pod/my-pod (target value):    <unknown> / 200m
  "avg-sla-breach" on Pod/my-pod (target value):  833m / 500m
Min replicas:                                                  1
Max replicas:                                                  3
Deployment pods:                                               3 current / 3 desired
Conditions:
  Type            Status  Reason                 Message
  ----            ------  ------                 -------
  AbleToScale     True    SucceededGetScale      the HPA controller was able to get the target's current scale
  ScalingActive   False   FailedGetObjectMetric  the HPA was unable to compute the replica count: unable to get metric istio-requests-total: Pod on namespace my-pod/unable to fetch metrics from custom metrics API: the server could not find the metric istio-requests-total for pods my-pod
  ScalingLimited  True    TooManyReplicas        the desired replica count is more than the maximum replica count
Events:
  Type     Reason                 Age                       From                       Message
  ----     ------                 ----                      ----                       -------
  Warning  FailedGetObjectMetric  2m14s (x140768 over 25d)  horizontal-pod-autoscaler  unable to get metric istio-requests-total: Pod on namespace my-pod/unable to fetch metrics from custom metrics API: the server could not find the metric istio-requests-total for pods my-pod

To troubleshoot further, we checked the metric value using:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/my-pod/avg-sla-breach"

Output:

{"kind":"MetricValueList","apiVersion":"custom.metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/my-pod/avg-sla-breach"},"items":[{"describedObject":{"kind":"Pod","namespace":"my-namespace","name":"my-pod","apiVersion":"v1"},"metricName":"avg-sla-breach","timestamp":"2023-08-07T08:14:35Z","value":"0","selector":null}]}

Although the metric value returned by the custom metrics API is zero, the HPA description still displays a stale value.
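
For comparison, a parallel check against metric B (a hypothetical command that simply mirrors the path format above with the other metric name) shows whether the adapter exposes that metric at all:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-namespace/pods/my-pod/istio-requests-total"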

Workaround:

The HPA behaves as expected when the second metric (B) is removed entirely, or when its query is modified to return 0 instead of an empty result when there are no failed requests.

What did you expect to happen?

HPA should scale down properly based on one of the metrics, even when the other metric value is not available.

How can we reproduce it (as minimally and precisely as possible)?

  • Set up the Kube Metrics Adapter (link).
  • Create a custom-metric-based HPA that uses two metrics, one of which has an undefined value (a minimal sketch follows this list).
  • Increase the load (i.e., the value of the other metric) so that the HPA kicks in and scales the pods up to the max replica count.
  • Reduce the load (i.e., the value of that metric); the value reported by the HPA stays stuck at a stale value and the pods never scale down.
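
A minimal HPA sketch for the repro (condensed from the annotations and spec shown later in this thread; all names and namespaces are placeholders):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: repro-hpa
  annotations:
    # Metric A: a gauge averaged over 10m; always returns a value.
    metric-config.object.sla-metric.prometheus/query: |
      avg(avg_over_time(is_sla_breach{app="my-app"}[10m]))
    # Metric B: returns an empty result when there are no 502/503 responses,
    # so the HPA reports it as <unknown>.
    metric-config.object.istio-requests-total.prometheus/query: |
      sum(rate(istio_requests_total{response_code=~"502|503", destination_service="my-app.default.svc.cluster.local"}[1m]))
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 1
  maxReplicas: 3
  metrics:
  - type: Object
    object:
      describedObject: {apiVersion: v1, kind: Pod, name: my-app}
      metric: {name: sla-metric}
      target: {type: Value, value: 500m}
  - type: Object
    object:
      describedObject: {apiVersion: v1, kind: Pod, name: my-app}
      metric: {name: istio-requests-total}
      target: {type: Value, value: 200m}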

Anything else we need to know?

Has anyone faced similar issues with the HPA, or has the HPA behavior for multiple metrics changed recently, especially for scale-down events? Can anyone from the community look into this issue and provide some clarity?

Kubernetes version

v1.24.7

Cloud provider

EKS

OS version

Alpine Linux
@mikkeloscar
Contributor

HPA should scale down properly based on one of the metrics, even when the other metric value is not available.

It's intentional that the HPA doesn't scale down if one metric is invalid. This is a safety feature to prevent scaling down when a metric source is unavailable and the HPA can't know whether it is safe to scale down. Imagine the problem is a network issue between kube-metrics-adapter and Prometheus, or Prometheus is temporarily unavailable; in that case you don't want to scale down just because the metric was not available.

This behavior has been in the HPA since the beginning but was temporarily broken in v1.16. I made a PR to fix it here, which fixed it from v1.21 onwards: kubernetes/kubernetes#99514

Ideally, you should construct your Prometheus query so that it returns 0 when there are no 502 or 503 responses, if you are sure an empty result really means there were none. You also want to avoid a situation where Prometheus simply didn't collect the metrics from whatever component exposes them, as that could be the same issue I described above.
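
For example, one common PromQL pattern (a sketch only, assuming an empty result really does mean zero failed requests) is to wrap the expression with an "or on() vector(0)" fallback:

(
  sum(
    rate(
      istio_requests_total{
        response_code=~"502|503",
        destination_service="my-pod.namespace.svc.cluster.local"
      }[1m]
    )
  ) /
  count(
    count(
      container_memory_usage_bytes{
        namespace="namespace",
        pod=~"my-pod.*"
      }
    ) by (pod)
  )
) or on() vector(0)

With this fallback the query always returns a sample, so the adapter should never report the metric as <unknown>.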

@mikkeloscar mikkeloscar added the question Further information is requested label Aug 14, 2023
@Naveen-oops
Author

Naveen-oops commented Aug 14, 2023

Hi @mikkeloscar, thanks for the explanation. Modifying the Prometheus query to return zero when the metric is unavailable fixes the issue.

Now, one thing I want to understand is why the metric value gets stuck at a stale value slightly above or below the target in kubectl describe hpa. This also seems to happen only after the replica count has reached the maximum and stayed there for some time. At the same time, I can see the correct metric values in Prometheus, yet in the HPA the value is stuck, and it only recovers when the HPA is manually edited.
Is there any relation between this and the issue you mentioned? I can reproduce the scenario above only when one of the metrics is unavailable.

Can you help provide some clarity on this?

@mikkeloscar
Contributor

I'm not sure I understand your question. If one metric is unavailable, then what you describe sounds like what I mentioned above. If you mean it's stuck even with both metrics working, then it's some other issue, and we would likely need to look at the kube-metrics-adapter logs to understand it.

@Naveen-oops
Author

Naveen-oops commented Aug 24, 2023

Hi @mikkeloscar,

I just wanted some clarification.
After the Prometheus query was modified as you suggested, the issue was solved. Since the two metrics now always have definite values, the HPA scales up and down as expected.

Now, while the issue was occurring, the value of metric A was stuck at a random value, as shown below, but running the same query directly against Prometheus returned the correct value. This behavior seems strange, so I want to understand what is happening here. Ignore metric B; its value should not affect the other metrics in the HPA anyway, right?

Metric                      Value in HPA                Value in Prometheus   Actual value   Target
A (SLA metric)              457 (stuck at this value)   0                     0              500
B (Istio failed requests)   undefined                   undefined             undefined      200

I want to understand why the value of metric A gets stuck at a random value below the target (e.g., 457) or above it (e.g., 843); this was observed consistently when reproducing the issue with the steps in the description.

Could you help with this?

@mikkeloscar
Contributor

Can you share the kube-metrics-adapter logs from when this happens? I think that would be helpful for understanding it.

@Naveen-oops
Author

Hi @mikkeloscar,
Thanks for the quick reply!

Do you want the kube-metrics-adapter logs from the issue timeframe alone, or also the logs from before and after that window?
If anything other than the kube-metrics-adapter logs would be helpful to you, let me know.

I will check with my team and share the requested logs for troubleshooting.

@mikkeloscar
Contributor

More context is better, but the logs from when the problem started plus the following 10-15 minutes are probably enough to get an idea.

@Naveen-oops
Author

Naveen-oops commented Aug 28, 2023

Hi @mikkeloscar,

I reproduced the issue as described; this time the sla-metric value got stuck at 375m / 500m. When I queried Prometheus, it returned zero (0), but in the output of "kubectl describe hpa my-hpa" the value is still stuck at 375m.

Attaching the kube-metrics-adapter-logs for your reference:
kube-metric-adapter-final.log

Prometheus query output (metric value):

{"status":"success","data":{"resultType":"vector","result":[{"metric":{},"value":[1693225218.501,"0"]}]}} 

Let me know if any other details are needed. Sorry for the delayed response; I had to check with my team before sharing the logs.

@mikkeloscar
Contributor

@Naveen-oops Thanks for sharing the logs. It looks like kube-metrics-adapter is fetching fresh metrics all the time, so that doesn't look wrong. I would be curious if you could also share the output of kubectl describe hpa <your-hpa> and kubectl get hpa <your-hpa> -o yaml at the time this occurs?

@Naveen-oops
Author

Naveen-oops commented Sep 4, 2023

Hi @mikkeloscar, sure.
The HPA is still showing a random value, while the actual value is zero.

kubectl describe hpa

Name:                                                                        pod-name-hpa
Namespace:                                                                   my-namespace
Labels:                                                                      app.kubernetes.io/managed-by=Helm
Annotations:                                                                 meta.helm.sh/release-name: pod-name
                                                                             meta.helm.sh/release-namespace: default
                                                                             metric-config.object.sla-metric.prometheus/query:
                                                                               avg(
                                                                                avg_over_time(
                                                                                   is_sla_breach{
                                                                                     app="pod-name"
                                                                                   }[10m]
                                                                                )
                                                                               )
                                                                             metric-config.object.istio-requests-total.prometheus/per-replica: true
                                                                             metric-config.object.istio-requests-total.prometheus/query:
                                                                               sum(
                                                                                 rate(
                                                                                   istio_requests_total{
                                                                                     response_code=~"502|503",
                                                                                     destination_service="pod-name.my-namespace.svc.cluster.local",
                                                                                   }[1m]
                                                                                 )
                                                                               ) /
                                                                               count(
                                                                                 count(
                                                                                   container_memory_usage_bytes{
                                                                                     namespace="my-namespace",
                                                                                     pod=~"pod-name.*"
                                                                                   }
                                                                                 ) by (pod)
                                                                               )
CreationTimestamp:                                                           Wed, 12 Jul 2023 17:52:25 +0530
Reference:                                                                   Deployment/pod-name
Metrics:                                                                     ( current / target )
  "istio-requests-total" on Pod/pod-name (target value):  <unknown> / 200m
  "sla-metric" on Pod/pod-name (target value):        399m / 500m
Min replicas:                                                                1
Max replicas:                                                                3
Deployment pods:                                                             3 current / 3 desired
Conditions:
  Type            Status  Reason                 Message
  ----            ------  ------                 -------
  AbleToScale     True    SucceededGetScale      the HPA controller was able to get the target's current scale
  ScalingActive   False   FailedGetObjectMetric  the HPA was unable to compute the replica count: unable to get metric istio-requests-total: Pod on my-namespace pod-name/unable to fetch metrics from custom metrics API: the server could not find the metric istio-requests-total for pods pod-name
  ScalingLimited  True    TooManyReplicas        the desired replica count is more than the maximum replica count
Events:
  Type     Reason                 Age                    From                       Message
  ----     ------                 ----                   ----                       -------
  Warning  FailedGetObjectMetric  56s (x56684 over 28d)  horizontal-pod-autoscaler  unable to get metric istio-requests-total: Pod on my-namespace pod-name/unable to fetch metrics from custom metrics API: the server could not find the metric istio-requests-total for pods pod-name

kubectl get hpa -o yaml

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    meta.helm.sh/release-name: pod-name
    meta.helm.sh/release-namespace: default
    metric-config.object.sla-metric.prometheus/query: |
      avg(
       avg_over_time(
          is_sla_breach{
            app="pod-name"
          }[10m]
       )
      )
    metric-config.object.istio-requests-total.prometheus/per-replica: "true"
    metric-config.object.istio-requests-total.prometheus/query: |
      sum(
        rate(
          istio_requests_total{
            response_code=~"502|503",
            destination_service="pod-name.my-namespace.svc.cluster.local",
          }[1m]
        )
      ) /
      count(
        count(
          container_memory_usage_bytes{
            namespace="my-namespace",
            pod=~"pod-name.*"
          }
        ) by (pod)
      )
  creationTimestamp: "2023-07-12T12:22:25Z"
  labels:
    app.kubernetes.io/managed-by: Helm
  name: pod-name
  namespace: my-namespace
  resourceVersion: "15805426"
  uid: a986206a-cbbc-4d07-af45-495b7adceb79
spec:
  maxReplicas: 3
  metrics:
  - object:
      describedObject:
        apiVersion: v1
        kind: Pod
        name: pod-name
      metric:
        name: istio-requests-total
      target:
        type: Value
        value: 200m
    type: Object
  - object:
      describedObject:
        apiVersion: v1
        kind: Pod
        name: pod-name
      metric:
        name: sla-metric
      target:
        type: Value
        value: 500m
    type: Object
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: pod-name

status:
  conditions:
  - lastTransitionTime: "2023-07-12T12:22:40Z"
    message: the HPA controller was able to get the target's current scale
    reason: SucceededGetScale
    status: "True"
    type: AbleToScale
  - lastTransitionTime: "2023-09-02T00:13:20Z"
    message: 'the HPA was unable to compute the replica count: unable to get metric
      istio-requests-total: Pod on my-namespace pod-name/unable to fetch
      metrics from custom metrics API: the server could not find the metric istio-requests-total
      for pods pod-name'
    reason: FailedGetObjectMetric
    status: "False"
    type: ScalingActive
  - lastTransitionTime: "2023-09-02T00:08:19Z"
    message: the desired replica count is more than the maximum replica count
    reason: TooManyReplicas
    status: "True"
    type: ScalingLimited
  currentMetrics:
  - type: ""
  - object:
      current:
        value: 399m
      describedObject:
        apiVersion: v1
        kind: Pod
        name: pod-name
      metric:
        name: sla-metric
    type: Object
  currentReplicas: 3
  desiredReplicas: 3
  lastScaleTime: "2023-09-02T00:07:49Z"

@mikkeloscar
Contributor

@Naveen-oops Thanks for sharing all the information. It looks like an HPA issue. I think I need to replicate it to better understand where this happens, but I will need to find time to do that and can't promise when that will be. What version of Kubernetes are you running? Maybe there are upstream issues about this? 🤔

@Naveen-oops
Author

Thanks for the support, @mikkeloscar. We are currently running Kubernetes v1.24.7.

I was also unable to find any relevant issues in the Kubernetes project. In any case, I have already created the same issue there, kubernetes/kubernetes#119788, which can be used to track it upstream.

Just let me know if a fix is provided for this issue; I am curious about the fix and eager to know what is happening behind the scenes.
