@yuzisun Sorry, I missed this one. I want an HPA created with two triggers, one for CPU and one for memory, as we can have for a normal Kubernetes deployment. This would let us scale on either CPU or memory, depending on which one is overutilized.
/kind bug
What steps did you take and what happened:
Tried the following:
Tried creating memory based autoscaling by setting predictor.scaleMetric to memory, and a memory based HPA was created. Yaay!
In scenario 1 and 2 HPAs are created as expected if the metric is set as cpu.
Now, I want my HPA to be controlled by both memory and CPU. I tried setting predictor.scaleMetric to memory and a corresponding scaleTarget. Also set CPU thresholds using serving.kserve.io/metric: cpu. But only predictor.scaleMetric is respected.
What did you expect to happen:
I want HPA to have both memory and CPU based triggers.
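For illustration, a plain Kubernetes HPA with both triggers could look roughly like the sketch below. This is not what KServe generates today. The resource names, replica bounds, and utilization targets are placeholders assumed for the example.

```yaml
# Sketch only: names, replica bounds, and thresholds are hypothetical.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-predictor            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-predictor          # hypothetical predictor deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    # CPU trigger
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80
    # Memory trigger
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
```

With more than one entry under metrics, the HPA computes a desired replica count for each metric and uses the largest, so whichever resource is under more pressure drives the scale-up.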
What's the InferenceService yaml:
Anything else you would like to add:
The HPA YAML that the InferenceService generates:
Environment:
OS (from /etc/os-release): Amazon Linux 2