Allow scale to zero to work when min-scale is greater than 0 #15154
Hi @daraghlowe, I will take a look at what you reported.
Hi @daraghlowe,
According to the docs, the behavior is:
Also in the PR: "This annotation will not impact initial-scale values, as it will only apply on subsequent scales from zero."
If dspc is > 0 because traffic is coming in, and the revision is also active, I am wondering why you don't see two pods.
Could you enable debug logging for the autoscaler and paste the output, and also provide more details, such as the ksvc you used?
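(For anyone following along: autoscaler debug logging is typically enabled through the config-logging ConfigMap in the knative-serving namespace. A sketch, with the key name as documented in the Knative Serving logging docs:)

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-logging
  namespace: knative-serving
data:
  # Per-component log level override; this one targets the autoscaler deployment.
  loglevel.autoscaler: "debug"
```

The output can then be collected with `kubectl logs -n knative-serving deploy/autoscaler`.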
Hi @skonto, I did some additional testing and confirmed that the activation-scale annotation does work as you mentioned. As long as the revision is receiving traffic (I sent one request per 5s), it will stay scaled up to the number of replicas set in the activation-scale annotation. If the revision doesn't receive traffic, it will scale down to 1 replica, and it will scale back up to 2 (activation-scale) as soon as you send 1 request.

We were testing in our dev environment, where the revision wasn't receiving any traffic. When we saw the replicas scale down from 2 to 1, we misunderstood how it was working: we expected that it would always stay at a minimum of 2 replicas and, when it scaled down, go directly to zero rather than dropping to 1 replica first.

Thanks for investigating this and clarifying how it works!
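For reference, this is roughly the shape of the Service we were testing with (the service name and image are placeholders, not our actual workload):

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: hello                        # placeholder name
spec:
  template:
    metadata:
      annotations:
        # Applies when scaling up from zero: the revision activates straight
        # to 2 pods, but may later drop to 1 if traffic doesn't sustain 2.
        autoscaling.knative.dev/activation-scale: "2"
    spec:
      containers:
        - image: gcr.io/knative-samples/helloworld-go  # placeholder image
```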
What feature do you want?
We want to be able to set min-scale to 2 while a service is active, so that our active Knative services are highly available, while still allowing scale to zero so we're not wasting compute resources when our Knative services are not receiving requests.
Describe the feature
We're currently running our Knative services without min-scale set, and we allow the services to scale down to zero when they're not actively receiving requests. This ensures we're not wasting compute resources, saves us a lot of money, and is a feature of Knative that we want to continue utilising.
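For context, scale to zero is governed cluster-wide by the config-autoscaler ConfigMap; a sketch showing the documented defaults we rely on:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  enable-scale-to-zero: "true"       # default; idle revisions may scale to 0 pods
  scale-to-zero-grace-period: "30s"  # default; grace period before the last pod is removed
```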
In addition to scale to zero, we also want our Knative services to be highly available while they are active and receiving requests. Specifically, during node pool upgrades, any service running a single replica experiences downtime while its pod is evicted and migrated to the new nodes.
The rationale behind wanting both scale to zero and highly available services while active is that the type of app running in each service is controlled by our customers, and we can't easily know which services are pre-production and unimportant and which ones are critical and must be highly available.
The solution that we would like to implement is:

- `minAvailable: 1`
- `min-scale: 2`

so that our Knative services have a minimum of 2 replicas when they're running. Unfortunately, however, when we set `min-scale: 2`, all of our Knative services scale up to a minimum of 2 pods, including all of those that had been scaled down to zero.

We did some testing with `activation-scale`, but it doesn't solve the problem, as the service can scale down to 1 replica while active if it doesn't get enough request concurrency after initially activating and scaling up to 2. The description of the merged PR (#13136) seems to indicate that it should work the way we want, but it doesn't.

Would it be possible to add another annotation, such as `min-scale-while-active`, that fulfils the description of #13136, rather than activation-scale?

As an alternative, we're currently thinking of building a controller that will temporarily increase the min-scale of our active services to 2 while an upgrade is occurring. Curious if there is some other solution or workaround that you could recommend instead of this approach?
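To make the controller workaround concrete, here is a minimal sketch of its core logic (service name and namespace are hypothetical; the helper just builds the merge patch). One caveat we're aware of: changing annotations on the revision template creates a new revision.

```python
def min_scale_patch(replicas: int) -> dict:
    """Build a merge patch setting autoscaling.knative.dev/min-scale on the
    revision template of a Knative Service.

    Note: Knative annotation values must be strings, and patching the
    template's annotations will roll out a new revision.
    """
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {
                        "autoscaling.knative.dev/min-scale": str(replicas)
                    }
                }
            }
        }
    }

# Applying it with the official kubernetes client (assumes cluster access is
# configured; "my-service" / "default" are hypothetical):
#   from kubernetes import client, config
#   config.load_kube_config()
#   api = client.CustomObjectsApi()
#   api.patch_namespaced_custom_object(
#       group="serving.knative.dev", version="v1", namespace="default",
#       plural="services", name="my-service", body=min_scale_patch(2))
# Before a node pool upgrade the controller would apply min_scale_patch(2) to
# active services, and revert to min_scale_patch(0) once the upgrade completes.
```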