External scaler connection errors ignored, the HPA is missing metrics #5787
Hello,
@JorTurFer I'm on 2.14.0 (but I tested the main branch yesterday and the problem occurred too). This is the ScaledObject definition:
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  labels:
    app.kubernetes.io/managed-by: Helm
    scaledobject.keda.sh/name: scaledobject-workers
  name: scaledobject-workers
  namespace: default
spec:
  scaleTargetRef:
    kind: Deployment
    name: scheduler
  triggers:
  - metadata:
      scalerAddress: scheduler-scaler.default.svc.cluster.local:8080
    type: external-push
And this is the HPA that gets created - notice that the list of metrics only contains a CPU-based metric (this is the default one inserted by K8s):
apiVersion: v1
items:
- apiVersion: autoscaling/v2
  kind: HorizontalPodAutoscaler
  metadata:
    annotations:
      meta.helm.sh/release-name: scheduler
      meta.helm.sh/release-namespace: default
    creationTimestamp: "2024-05-08T14:49:33Z"
    labels:
      app.kubernetes.io/managed-by: Helm
      app.kubernetes.io/name: keda-hpa-scaledobject-workers
      app.kubernetes.io/part-of: scaledobject-workers
      app.kubernetes.io/version: 2.14.0
      scaledobject.keda.sh/name: scaledobject-workers
    name: keda-hpa-scaledobject-workers
    namespace: default
    ownerReferences:
    - apiVersion: keda.sh/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: ScaledObject
      name: scaledobject-workers
      uid: 1c21176d-71bc-4de2-9740-9fe03f5f66d7
    resourceVersion: "2777064"
    uid: a272a347-f011-499f-92e5-fa08d650f985
  spec:
    maxReplicas: 100
    metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 80
          type: Utilization
      type: Resource
    minReplicas: 1
    scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: scheduler
  status:
    conditions:
    - lastTransitionTime: "2024-05-08T14:49:48Z"
      message: the HPA controller was able to get the target's current scale
      reason: SucceededGetScale
      status: "True"
      type: AbleToScale
    - lastTransitionTime: "2024-05-08T14:49:48Z"
      message: 'the HPA was unable to compute the replica count: failed to get cpu
        utilization: unable to get metrics for resource cpu: unable to fetch metrics
        from resource metrics API: the server could not find the requested resource
        (get pods.metrics.k8s.io)'
      reason: FailedGetResourceMetric
      status: "False"
      type: ScalingActive
    currentMetrics: null
    currentReplicas: 1
    desiredReplicas: 0
kind: List
metadata:
  resourceVersion: ""
I'm also attaching logs from the operator pod:
Now, for example, if I edit the ScaledObject (with |
I'm going to try to reproduce this.
@JorTurFer That's correct, the gRPC server with the external scaler should be down for some time after a ScaledObject is installed
Report
When I install a Helm chart containing both an external scaler gRPC service and a ScaledObject, the resulting HPA has an empty list of metrics (K8s inserts the default 80% CPU utilization metric in that case). It then remains in that state even after the external scaler gRPC service has been initialized (I can manually force it to re-reconcile by editing the ScaledObject).
This is happening because Helm installs the external scaler service and the ScaledObject at the same time. The external scaler's gRPC server isn't available immediately (it takes ~1 sec for the pod to start), and KEDA runs the reconciliation of the ScaledObject before the external scaler is available, ignoring the gRPC connection error.
Expected Behavior
In my opinion, it would probably be better if KEDA were to re-queue the reconciliation request in these situations. For example, Reconcile() in scaledobject_controller.go could return ctrl.Result{RequeueAfter: time.Minute} if a gRPC connection error was observed.
Actual Behavior
KEDA doesn't update the HPA even after the external scaler is available.
Steps to Reproduce the Problem
Install a ScaledObject resource using an external scaler
The HorizontalPodAutoscaler created by KEDA is missing the metric specified in the ScaledObject
Logs from KEDA operator
No response
KEDA Version
None
Kubernetes Version
None
Platform
Any
Scaler Details
No response
Anything else?
No response