AFAIK, by design the hard limit is enforced based on concurrency statistics gathered over a window (panic or stable). Each request needs to last long enough for the autoscaler to consider the container full and trigger the scale-out process. In your example I suspect the requests are too fast. The crash does not seem reproducible; you need to provide more details for that part.
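As a hedged sketch of how the window interacts with short requests: the stable window can be shortened via an annotation so that brief bursts are still reflected in the concurrency average. The service name below is hypothetical; the `autoscaling.knative.dev/window` annotation is the real knob (default 60s, minimum 6s):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service            # hypothetical name
spec:
  template:
    metadata:
      annotations:
        # Shorter stable window so the autoscaler reacts to short-lived
        # concurrency spikes sooner (default is 60s, minimum 6s).
        autoscaling.knative.dev/window: "6s"
```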
I suggest you try the above with the autoscale-go sample app and send two requests in parallel with sleep=30000 (30 seconds, or a value that suits you). This gives the autoscaler enough time to react so you can observe its behavior (it will create two pods). Also, if you want to learn more about how the Knative autoscaler works, check this Knative docs draft post.
Knative has a throttler mechanism in both the queue-proxy and the activator; it queues requests until the autoscaler can kick in, so that eventually you have enough pods to serve the requests within the hard limit. FYI, Knative is not a job framework, if that is the intended use case; see this link for more.
skonto changed the title from "Knative service receivs more requests than configured hard limit number of requests" to "Knative service receives more requests than configured hard limit number of requests" on Mar 19, 2024
What version of Knative?
Expected Behavior
If the hard limit (container concurrency) is set, then no more than the configured number of requests are forwarded to the service at any time.
https://knative.dev/docs/serving/autoscaling/concurrency/#hard-limit
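For reference, the hard limit is set per revision via the `containerConcurrency` field in the Service spec. A minimal sketch, with a hypothetical service name and image:

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-service                      # hypothetical name
spec:
  template:
    spec:
      # Hard limit: at most 1 in-flight request per pod;
      # excess requests are buffered while the autoscaler scales out.
      containerConcurrency: 1
      containers:
        - image: ghcr.io/example/app    # hypothetical image
```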
Actual Behavior
The Knative service keeps receiving more requests than specified and eventually crashes.
Steps to Reproduce the Problem