
Implement custom containerd behavior for Windows Agents, ensure supporting processes exit #5419

Merged

Conversation

@HarrisonWAffel (Contributor) commented Feb 9, 2024

Proposed Changes

Update the pebinary and staticpod executors to conform to the updated k3s executor interface, which enables greater control over how containerd and docker are managed.

The pebinary executor partially reimplements the logic for starting containerd, with one minor change: if the containerd process is killed for any reason, os.Exit will not be invoked; instead, containerd will be restarted after 5 seconds. This is in line with how we start and restart other core components within the Windows agent, such as the kubelet.
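
A minimal sketch of this restart behavior, assuming a supervisor function along these lines (the name runContainerd and the exact argument plumbing are illustrative, not the actual rke2 code):

```go
package pebinary

import (
	"context"
	"os"
	"os/exec"
	"time"

	"github.com/sirupsen/logrus"
)

// runContainerd restarts containerd after a short delay whenever it exits,
// instead of calling os.Exit, and only returns once the agent context is
// cancelled. Sketch only; names and argument plumbing are illustrative.
func runContainerd(ctx context.Context, binary string, args, env []string) {
	for {
		cmd := exec.CommandContext(ctx, binary, args...)
		cmd.Env = env
		cmd.Stdout = os.Stdout
		cmd.Stderr = os.Stderr

		logrus.Infof("Running containerd %v", args)
		if err := cmd.Run(); err != nil {
			logrus.Errorf("containerd exited: %v", err)
		}

		select {
		case <-ctx.Done():
			// Shutdown was requested; let the agent coordinate cleanup of
			// the remaining processes rather than exiting here.
			return
		case <-time.After(5 * time.Second):
			// containerd was killed for some other reason; restart it.
		}
	}
}
```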

This change ensures that the rke2 agent process and the Windows service do not prematurely exit alongside containerd, giving enough time for the signals.Context cancellation to fully propagate and stop all processes spawned from rke2/k3s.

In order to better coordinate how the RKE2 Windows service shuts down, additional logic has been added to monitor processes spawned by RKE2 for up to ten seconds. This ensures that the agent process does not exit until all other processes have been cleaned up or the time limit is reached.

The docker implementation for Windows agents has not been changed, and the pebinary executor simply invokes the k3s implementation for docker.

The Linux staticpod executor invokes the existing implementation within k3s for containerd and docker, resulting in no behavior changes for Linux agents.

Types of Changes

Updates to the pebinary executor, staticpod executor, and service_windows

Verification

This can be verified by joining an RKE2 Windows node to an existing cluster, manually stopping the rke2 service using PowerShell, and ensuring that all rke2 processes stop at the same time.

To verify this manually, I have done the following:

  • Provision a Linux server node and start rke2 with rke2 server --cni calico
  • Compile a Windows rke2 agent binary with the changes in this PR
  • Configure the Windows node to pull runtime images from a custom registry populated with the required images by adding the system-default-registry field within /etc/rancher/rke2/config.yaml
  • Add the rke2 Windows agent service and start it using Start-Service
  • Wait for the service to start and for the Windows agent to join the cluster successfully
  • Open Task Manager, run Stop-Service rke2, and ensure that all processes are stopped alongside rke2.exe. Optionally, watch the RKE2 service logs with Get-EventLog -LogName Application -Source rke2
  • Start the rke2 service again and ensure that no issues are encountered during startup

Testing

The existing end-to-end tests covering mixed-OS clusters (tests/e2e/mixedos/mixedos_test.go) already verify that the Windows agent starts and behaves as expected. I'm happy to help expand the e2e tests to cover this specific scenario if we feel it's necessary.

Linked Issues

#2204

User-Facing Change

NONE

Further Comments

Since this change uses the updated executor interface from k3s, I've bumped the k3s version to the latest commit on master (cfc3a124eed6) at the time of raising this PR.

@HarrisonWAffel HarrisonWAffel requested a review from a team as a code owner February 9, 2024 20:54
return true, nil
}

// MonitorProcessExit ensures that the kubelet, kube-proxy, calico-node, and containerd processes have stopped running.
Contributor

This is an interesting way to do this. I'm not a big fan of inline PowerShell and shelling out to do things that we should be able to do from native Go.

Can we not use a WaitGroup when starting these processes through the PEBinaryExecutor, and just wait on that when shutting down?

Contributor Author

A bit of a face-palm moment on my side - a WaitGroup is a much better idea. After looking into it a bit, I found apimachinery/pkg/util/wait, which offers wait groups that utilize contexts. I spent some time Friday and this morning testing the changes, and everything works as desired without any inlined PowerShell.

Let me know what you think. Also, I wanted to get your opinion on whether it would be a good idea to add a timeout for the WaitGroup.Wait call. I can't think of a situation where one of these commands would continue to run after the context cancellation, but I want to avoid a situation where we might wait forever for the WaitGroup to complete.

Contributor

No, I think this is good. If we're blocking on child process exit and want to be sure they get cleaned up, I think blocking shutdown indefinitely is appropriate.
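
For reference, a minimal sketch of the pattern that was settled on here, using the wait.Group type from k8s.io/apimachinery/pkg/util/wait shared by all spawned components (ProcessWaitGroup mirrors the diff further down; the helper names are placeholders):

```go
package win

import (
	"context"

	"k8s.io/apimachinery/pkg/util/wait"
)

// ProcessWaitGroup tracks every long-running process the Windows agent spawns
// (kubelet, kube-proxy, calico-node, containerd, ...).
var ProcessWaitGroup wait.Group

// startComponent launches a component under the shared context; run is
// expected to return once ctx is cancelled and the child process has exited.
func startComponent(ctx context.Context, run func(context.Context)) {
	ProcessWaitGroup.StartWithContext(ctx, run)
}

// waitForChildren blocks shutdown (indefinitely, per the discussion above)
// until every tracked goroutine has returned, so the Windows service is not
// reported as stopped while child processes are still running.
func waitForChildren() {
	ProcessWaitGroup.Wait()
}
```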

@HarrisonWAffel HarrisonWAffel force-pushed the windows-agent-containerd-behavior branch from bab1b5a to 7264dea Compare February 12, 2024 17:41
@dereknola (Contributor)

I would love to see the E2E test expanded, since you already have a manual verification method.

@HarrisonWAffel (Contributor Author)

@dereknola Is this something that could be done as a follow-up PR? My understanding is that the code freeze for the Feb patches is EOD, and so far I've had trouble setting up the e2e test suite on an Ubuntu VM.

@dereknola (Contributor) commented Feb 12, 2024

It can happen in another PR. Two things for that:

  1. Open an issue to track adding the E2E test
  2. Expand the current Verification section of this PR with more info/commands on how you verified the changes.

If you give me enough info on it, I can easily take over the testing changes.

@@ -614,6 +616,16 @@ func (s *StaticPodConfig) ETCD(ctx context.Context, args executor.ETCDConfig, ex
return staticpod.Run(s.ManifestsDir, spa)
}

// Containerd starts the k3s implementation of containerd
Contributor

staticpod is the executor implementation for rke2-linux, hence I think the comment is wrong

Contributor Author

The intent here is to call out that the rke2 static pod executor invokes the k3s implementation for containerd and docker. Current versions of rke2 already rely on the implementation within k3s; however, now that the executor interface has been expanded, I wanted to make it clear that the static pod executor invokes the k3s implementation.

return containerd.Run(ctx, config)
}

// Docker starts the k3s implementation of cridockerd
Contributor

same

Contributor

I think this is correct; the comment is just calling out that all this does is wrap the k3s implementation.

@@ -160,9 +165,9 @@ func (p *PEBinaryConfig) Kubelet(ctx context.Context, args []string) error {
cleanArgs = append(cleanArgs, arg)
}

logrus.Infof("Running RKE2 kubelet %v", cleanArgs)
go func() {
win.ProcessWaitGroup.StartWithContext(ctx, func(ctx context.Context) {
Contributor

Now that we use a context here, do you think it makes sense to still have a separate one for the CNI (cniCtx)?

Contributor Author

The initial implementation of pebinary had access to this context as well. IIUC, using a separate context ensures that the CNI plugin is restarted each time the kubelet is restarted, which may happen before the global context is canceled (for example, if someone accidentally runs Stop-Process on the kubelet). Since this PR does not focus on how the kubelet starts and stops the CNI plugins, I opted to leave this unchanged.
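
For illustration, a rough sketch of that pattern, with hypothetical helpers runKubeletOnce and startCNI standing in for the real functions:

```go
package pebinary

import "context"

// superviseKubelet runs the CNI plugin under a context derived from each
// kubelet invocation, so if the kubelet process dies and is relaunched before
// the global agent context is cancelled, the CNI plugin is torn down and
// restarted along with it. Sketch only; not the actual rke2 code.
func superviseKubelet(ctx context.Context, runKubeletOnce func(context.Context) error, startCNI func(context.Context)) {
	for ctx.Err() == nil {
		cniCtx, cancel := context.WithCancel(ctx)
		go startCNI(cniCtx)
		_ = runKubeletOnce(ctx) // blocks until the kubelet process exits
		cancel()                // stop the CNI plugin so both restart together
	}
}
```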

Comment on lines +256 to +257
stdOut := io.Writer(os.Stdout)
stdErr := io.Writer(os.Stderr)
Contributor

Did you manage to get containerd logs in Windows when using this? I had a lot of trouble and was unable to do it, so by default I decided to dump logs to files (kube-proxy, etc.).

Contributor Author

Yes, I was able to find the logs in C:\var\lib\rancher\rke2\agent\containerd\containerd.log

pair[0] = strings.TrimPrefix(pair[0], "CONTAINERD_")
cenv = append(cenv, strings.Join(pair, "="))
default:
env = append(env, strings.Join(pair, "="))
Contributor

Shouldn't it only check for CONTAINERD_ envs?

Contributor Author

This mirrors the k3s implementation; I'm not sure what the implications would be if we start ignoring specific environment variables.

Contributor

Yeah, this is all existing logic copy-pasted from k3s. It strips CONTAINERD_ from vars with that prefix and passes the rest through as-is. The intent is not to drop the unprefixed ones.
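
For clarity, a small sketch of that splitting behavior (it mirrors the snippet above; not the exact k3s/rke2 code):

```go
package pebinary

import (
	"os"
	"strings"
)

// splitContainerdEnv strips the CONTAINERD_ prefix from matching variables and
// collects them separately, while every other variable is passed through
// unchanged rather than dropped.
func splitContainerdEnv() (env, cenv []string) {
	for _, v := range os.Environ() {
		pair := strings.SplitN(v, "=", 2)
		if len(pair) != 2 {
			continue
		}
		if strings.HasPrefix(pair[0], "CONTAINERD_") {
			pair[0] = strings.TrimPrefix(pair[0], "CONTAINERD_")
			cenv = append(cenv, strings.Join(pair, "="))
		} else {
			env = append(env, strings.Join(pair, "="))
		}
	}
	return env, cenv
}
```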

}

for {
logrus.Infof("Running containerd %s", config.ArgString(args[1:]))
Contributor

It might be worth logging the env variables too; that's what we do in Calico to help with debugging. If we are picking up all of them, it probably makes more sense to log them only in Debug=true scenarios.

Contributor Author

I believe I have addressed this

@brandond (Contributor) Feb 12, 2024

I would avoid logging env vars. Using env vars to pass through secrets or credentials is a pattern that we want to continue to enable, since they don't get logged or show up in ps output.

Contributor Author

Ah, that's a good point. I'll update this.

@HarrisonWAffel HarrisonWAffel force-pushed the windows-agent-containerd-behavior branch from 7264dea to 0ea7135 Compare February 12, 2024 18:52
…ticpod executor

Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
Signed-off-by: Harrison Affel <harrisonaffel@gmail.com>
@HarrisonWAffel HarrisonWAffel force-pushed the windows-agent-containerd-behavior branch from 0ea7135 to 0cdd703 Compare February 12, 2024 19:55
@HarrisonWAffel HarrisonWAffel merged commit 3f6c90b into rancher:master Feb 12, 2024
2 checks passed