
✨ Add support for controller-manager webhook shutdown delay #2601

Open · wants to merge 11 commits into base: main
Conversation


@dippynark dippynark commented Dec 2, 2023

Currently the controller-runtime signal handler cancels the context as soon as a termination signal is received:

// SetupSignalHandler registers for SIGTERM and SIGINT. A context is returned
// which is canceled on one of these signals. If a second signal is caught, the program
// is terminated with exit code 1.
func SetupSignalHandler() context.Context {

This is problematic for controller-managers that run a webhook server: graceful webhook server shutdown may begin before kube-proxy has observed the termination (and reconciled local iptables rules), leading to connection attempts after the webhook server's listeners have already been closed.

This PR adds a ShutdownDelay webhook server option that lets users delay webhook server shutdown, giving routes time to be updated so that clients connect to other server replicas before the listeners are closed. For example, on GKE the kube-proxy --iptables-min-sync-period flag is currently set to 10 seconds, so a delay of 15 seconds might make sense on such clusters.
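
A minimal usage sketch, assuming this PR's ShutdownDelay field lands on webhook.Options (the port and the 15-second value are illustrative, following the GKE example above):

package main

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
)

func main() {
	// ShutdownDelay is the option proposed by this PR; 15s covers kube-proxy's
	// --iptables-min-sync-period on GKE (10s) with some margin.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		WebhookServer: webhook.NewServer(webhook.Options{
			Port:          9443,
			ShutdownDelay: 15 * time.Second,
		}),
	})
	if err != nil {
		panic(err)
	}

	// Register webhooks on mgr here, then block until a termination signal arrives.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}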

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dippynark
Once this PR has been reviewed and has the lgtm label, please assign joelanford for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 2, 2023
@k8s-ci-robot
Contributor

Welcome @dippynark!

It looks like this is your first PR to kubernetes-sigs/controller-runtime 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/controller-runtime has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Dec 2, 2023
@k8s-ci-robot
Contributor

Hi @dippynark. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Dec 2, 2023
Signed-off-by: Luke Addison <lukeaddison785@gmail.com>
Signed-off-by: Luke Addison <lukeaddison785@gmail.com>
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Dec 2, 2023
// of this manager before starting shutdown of any webhook servers to avoid
// receiving connection attempts after closing webhook listeners. If a second
// signal is caught, the program is terminated with exit code 1.
func SetupSignalHandlerWithDelay(delay time.Duration) context.Context {
Member

Do we really need this? IMHO this doesn't make sense, because whether it is needed is not a property of the binary but of the environment. Kubernetes allows using a preStop hook for this and recently even merged native sleep support for this precise use case: kubernetes/kubernetes#119026

Author

Ah nice, I wasn't aware of the sleep lifecycle hook feature! I agree with this, but before Kubernetes v1.28 (I think the first version where the sleep lifecycle hook is available) the application would otherwise need to ship with a sleep binary, so I think this option would still be worth having to cover the long tail of Kubernetes versions.

However, happy for this to be closed and wait for the sleep hook for a native solution!

@inteon
Member

inteon commented Dec 3, 2023

@dippynark Should we not create a webhook-specific solution?

I remember reading a great article on how to properly shut down a web server on Kubernetes; unfortunately, I cannot find it anymore.
I do remember that new connections should still be accepted, but existing connections should start being drained when you receive a shutdown signal. After a certain amount of time, the server can stop accepting new connections and fully shut down.
It would be great if controller-runtime could apply some of these best practices by default (only to its webhooks). I'm not sure just delaying the context cancel is the most elegant solution.

I think in go-world this translates to the following (sketched below):

  1. call server.SetKeepAlivesEnabled(false) on context cancel
  2. call server.Shutdown(shutdownCtx) after timeout A (now no new connections are accepted)
  3. shutdownCtx is canceled after timeout B (now all remaining connections are forcefully closed)
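
A rough sketch of that sequence against a plain net/http.Server (the timeout values are illustrative only, not part of this PR):

package main

import (
	"context"
	"net/http"
	"time"
)

// gracefulShutdown follows the three steps above: stop keep-alives on context
// cancel, wait for routing to migrate, then drain with a hard deadline.
func gracefulShutdown(ctx context.Context, srv *http.Server) error {
	<-ctx.Done() // termination signal received

	// 1. Stop reusing existing connections so clients reconnect elsewhere
	//    once routing has been updated.
	srv.SetKeepAlivesEnabled(false)

	// 2. After timeout A, stop accepting new connections and start draining.
	time.Sleep(10 * time.Second)

	// 3. Timeout B bounds the drain; if it expires before all connections have
	//    finished, Shutdown returns the context error and Close force-closes
	//    whatever is left.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 20*time.Second)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return srv.Close()
	}
	return nil
}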

@dippynark
Author

@inteon Nice! When I looked at this I couldn't see how to add some time between triggering shutdown and closing listeners: https://github.com/golang/go/blob/de5b418bea70aaf27de1f47e9b5813940d1e15a4/src/net/http/server.go#L2986-L2989

But didn't know about SetKeepAlivesEnabled! Will have a go at doing shutdown this way

@inteon
Member

inteon commented Dec 3, 2023

> @inteon Nice! When I looked at this I couldn't see how to add some time between triggering shutdown and closing listeners: https://github.com/golang/go/blob/de5b418bea70aaf27de1f47e9b5813940d1e15a4/src/net/http/server.go#L2986-L2989
>
> But didn't know about SetKeepAlivesEnabled! Will have a go at doing shutdown this way

@dippynark I created a POC here: #2607. Feel free to copy that code or leave feedback on that PR.

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 3, 2023
@dippynark
Author

dippynark commented Dec 3, 2023

@inteon Ah, just saw your POC; I have also updated this PR. I think we want the delay to be configurable though, because the amount of time we want to wait depends on the kube-proxy --iptables-min-sync-period flag (e.g. on GKE it's set to 10 seconds, so we'd want to wait for more than 10 seconds). I kept the default at 0 for backwards compatibility (but can change the default to 10 seconds if that is preferred).

I have also not disabled keep-alives before waiting for the shutdown delay. If clients are still trying to open new connections and we disable keep-alives, those connections will only last for a single HTTP request (and there is nothing stopping the client from opening a new connection to the same server that is about to be shut down). Instead, I think we want to simply wait, keeping the server running as normal until clients stop trying to open new connections; when Shutdown is then called, the HTTP standard library already disables keep-alives and waits for active connections to close: https://github.com/golang/go/blob/de5b418bea70aaf27de1f47e9b5813940d1e15a4/src/net/http/server.go#L3385

Also, do we definitely want to stop the readiness probe? If users are using the same endpoint for liveness then the server might be killed early. IMO we should leave the health check alone and rely on the termination of the Pod to signal Kubernetes to stop sending traffic (but I'm not too sure what situations you were thinking of where Kubernetes doesn't know about it).

I can copy any code from your POC PR depending on what you think of the above (I haven't done a controller-runtime PR before so would prefer to merge this one if it's all the same to you 😄)

@dippynark dippynark changed the title ✨ Add support for controller-manager shutdown delay ✨ Add support for controller-manager webhook shutdown delay Dec 3, 2023
@inteon
Member

inteon commented Dec 3, 2023

I do agree with not making the readiness/liveness probe fail. I think most controllers will only shut down because Kubernetes told them to. The only situation in which this might be useful is if there was some kind of critical failure that forces the pod to restart. Currently c/r does not provide a clear readiness <-> liveness split for this probe, so it would likely be added to the liveness probe too, which is undesirable.

I do think, however, that the SetKeepAlivesEnabled(false) call is useful. If the routing has been updated before the --iptables-min-sync-period elapses, it allows long-running connections to switch to the new routes early, so the Shutdown process won't take as long. This prevents users with a low terminationGracePeriodSeconds from failing to drain long-lasting connections because the grace period ends before Shutdown is called.

Please do copy. I just wrote it down as an experiment. I officially give you all the copyrights for that code 😄.

@dippynark
Author

dippynark commented Dec 3, 2023

@inteon Sounds good! I have changed the PR to disable keep-alives, so I think it is good, as long as you're happy with defaulting ShutdownDelay to 0? 10 seconds works for me too, I just couldn't really make up my mind (I have a very slight preference for 0 to maintain current behaviour and because the optimal value depends on the cluster configuration).

@alvaroaleman
Member

Another thing that I do not understand about this: the kube-apiserver doesn't actually use kube-proxy to reach webhooks and extension apiservers; it resolves the endpoints internally, which means the described problem shouldn't apply...?

@dippynark
Author

@alvaroaleman I thought this was the case too before working on this, but as far as I can tell the standard API Server -> webhook setup does rely on kube-proxy.

Definitely not 100% on this, but my understanding is that when the API server wants to connect to a webhook behind a Kubernetes Service, it typically proxies the request via the Konnectivity agent running on the cluster, which performs Service DNS resolution and connects to the Service VIP, making use of kube-proxy routing; this allows connectivity even when the control plane and webhook Pods are in disjoint networks.

I tried to trace through the API server code, which seems to confirm this. For example, here the API server sets up the webhook client using the Service hostname before delegating to the cluster dialer provided by the EgressSelector, instead of picking one of the endpoints itself.

Comment on lines 264 to 265
case <-time.After(s.Options.ShutdownDelay):
case <-ctx.Done():
Member
@inteon inteon Dec 3, 2023

This construction basically means that ShutdownDelay should not be longer than 1 minute, right?
Maybe that is something we can enforce & document in the comments linked to this field?

Author

Hmm, yeah, I don't think it's great being capped at 1 minute. Does it look better now? I've changed it to always wait for the full shutdown delay.

@inteon
Member

inteon commented Dec 11, 2023

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Dec 11, 2023
@dippynark
Author

/test pull-controller-runtime-test

// migrating clients to server instances that are not about to shutdown
srv.SetKeepAlivesEnabled(false)
// Wait before shutting down webhook server
<-time.After(s.Options.ShutdownDelay)
Member

Suggested change
<-time.After(s.Options.ShutdownDelay)
time.Sleep(s.Options.ShutdownDelay)

We don't need to create a channel, I think.

Author
@dippynark dippynark Dec 12, 2023

Oops, fixed

@@ -101,6 +101,10 @@ type Options struct {

// WebhookMux is the multiplexer that handles different webhooks.
WebhookMux *http.ServeMux

// ShutdownDelay delays server shutdown to wait for clients to stop opening
// new connections before closing server listeners. Defaults to 0.
Member

Open question:
Would it make sense to set a > 0 default, e.g. 15s, which is less than the 30-second default graceful shutdown period?

Author
@dippynark dippynark Dec 12, 2023

It's a tricky one for sure, but because it's so specific to the Kubernetes cluster you're running on, it doesn't feel right to align to any particular cluster or cloud provider (I'm not sure what value EKS and AKS use, for example).

IMO we should keep it at 0 for backwards compatibility and so that, by default, connection draining 'doesn't work' consistently across clusters (rather than working on some and not others). I think this is also more likely to encourage applications that care a lot about this (e.g. Gatekeeper) to expose a flag for it, making it easier to reconfigure for the cluster you're running on (i.e. pushing the decision onto users of controller-runtime, since an optimal default isn't clear).

No problem with changing to 15 seconds by default though if you were leaning towards that (that would be better for us anyway because we're using GKE 😄). We could alternatively align to the kube-proxy default of 1 second? But on GKE at least that won't help much.
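
For illustration, a sketch of the flag-based approach mentioned above, assuming this PR's ShutdownDelay field and a hypothetical --webhook-shutdown-delay flag (both the flag name and the wiring are illustrative):

package main

import (
	"flag"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/webhook"
)

func main() {
	// Hypothetical flag: lets operators tune the delay per cluster, e.g.
	// slightly above kube-proxy's --iptables-min-sync-period.
	var webhookShutdownDelay time.Duration
	flag.DurationVar(&webhookShutdownDelay, "webhook-shutdown-delay", 0,
		"how long to keep serving webhook traffic after a termination signal")
	flag.Parse()

	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		WebhookServer: webhook.NewServer(webhook.Options{
			ShutdownDelay: webhookShutdownDelay, // field proposed by this PR
		}),
	})
	if err != nil {
		panic(err)
	}

	// Register webhooks on mgr here, then run until a termination signal arrives.
	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}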

@inteon
Member

inteon commented Dec 12, 2023

> Another thing that I do not understand about this: the kube-apiserver doesn't actually use kube-proxy to reach webhooks and extension apiservers; it resolves the endpoints internally, which means the described problem shouldn't apply...?

@alvaroaleman There seems to be an --enable-aggregator-routing flag, maybe that is related? (https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/)

> Turns on aggregator routing requests to endpoints IP rather than cluster IP.

@sbueringer
Member

Q: Did I get this right that kube-proxy reacts to the deletionTimestamp on the Pod, and not only to readiness, when dropping a Pod from a Service?

@dippynark
Author

dippynark commented Dec 18, 2023

@sbueringer Yeah, my understanding is that the following actions are triggered simultaneously when a Pod's deletion timestamp is set:

  • The kubelet runs any preStop hooks and then sends a TERM signal to the container processes
  • EndpointSlices are updated to indicate that the Pod is terminating which triggers kube-proxy to remove corresponding iptables rules

There are more details here: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination

Note that there is currently no way to guarantee that all kube-proxy instances have reconciled EndpointSlice updates before shutting down (leading to issues like the one I described in this PR). The best mechanism I am aware of is to delay shutdown for slightly longer than the --iptables-min-sync-period kube-proxy flag: kubernetes/kubernetes#106476

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle stale
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 17, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all PRs.

This bot triages PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the PR is closed

You can:

  • Mark this PR as fresh with /remove-lifecycle rotten
  • Close this PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Apr 16, 2024