
uninstall: when --wait is specified, use foreground deletion. #2344

Merged · 2 commits · Feb 28, 2024

Conversation

tommyp1ckles
Contributor

By default, the Helm libraries use background cascading deletion, which means the `helm uninstall` call returns as soon as the Deployment objects are removed, without waiting for their dependent Pods to terminate.

This means that running workloads, such as hubble-relay, may still be in the Terminating state after `cilium uninstall --wait` exits.

CI E2E depends on uninstall cleaning up completely so that clusters can be reused to test Cilium in different configurations.

In flakes such as cilium/cilium#30993, old Hubble Pods appear to bleed into the "fresh" install. These should be harmless, but they trigger failures of the [no-error-logs] assertion in subsequent connectivity tests.

This change provides a more thorough uninstall procedure in this case.

Contributor

@michi-covalent michi-covalent left a comment


looks innocent enough


Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
The last commit switched uninstall with --wait to foreground cascading deletion.
However, other issues can still occur when reusing clusters after uninstall:

* Old endpoint state written to disk may be restored upon reinstall.
* CNI DEL commands can be queued to disk while the Cilium agent CNI is down, producing error logs when Cilium is reinstalled and the queued CNI DEL commands are replayed.

When uninstalling with --wait, disabling Hubble is now a separate uninstall step, which blocks until no Hubble Pods are running.
This ensures that Hubble Pods can fully terminate via Cilium before the above situations can arise.
Because disabling Hubble goes through `helm upgrade`, we cannot rely on foreground cascading deletion, so we simply poll Kubernetes until all Hubble Pods are gone.

Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
Contributor

@derailed derailed left a comment


@tommyp1ckles Nice work!
