Ubuntu 22.04 fails to bootstrap on Azure #386

Closed
embik opened this issue May 3, 2024 · 0 comments · Fixed by #388
Labels
priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management.

embik commented May 3, 2024

For the past few days (2-3?), Ubuntu 22.04 has been failing to bootstrap on Azure. This was first detected in kubermatic/machine-controller#1790.

It turns out that, for some unknown reason, cloud-init init no longer works during the bootstrap phase. Its failure mode is opaque:

root@djlrqb6kl4-worker-5czq9q-ddff4d849-8wpc6:~# cloud-init --debug init --file /etc/cloud/cloud.cfg.d/djlrqb6kl4-worker-5czq9q-kube-system-provisioning-config.cfg
[...]
2024-05-03 10:26:15,157 - util.py[WARNING]: No instance datasource found! Likely bad things to come!
2024-05-03 10:26:16,627 - activators.py[WARNING]: Running ['netplan', 'apply'] resulted in stderr output: WARNING:root:Cannot call Open vSwitch: ovsdb-server.service is not running.

root@djlrqb6kl4-worker-5czq9q-ddff4d849-8wpc6:~# echo $?
1

No clear error message is presented (only warnings), but the command still exits with code 1. The code responsible for the last log line does not look like it should produce an error, so it is unclear what is causing the failure.

https://github.com/canonical/cloud-init/blob/4ffde902befbd7eacedaba98d485540147e7bae0/cloudinit/net/activators.py#L33-L40
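The observation above (a non-zero exit code with only warnings in the log) can be made explicit with a small wrapper. This is a minimal sketch, not part of OSM or cloud-init: summarize_run and fake_cloud_init are hypothetical helper names, and fake_cloud_init is only a stand-in that imitates the behaviour seen in the log so the snippet runs without a real Azure machine.

```shell
#!/usr/bin/env bash

# Hypothetical helper: run a command, capture stdout and stderr together,
# and print the exit code next to the number of WARNING and ERROR lines.
summarize_run() {
    local output rc warnings errors
    output="$("$@" 2>&1)" && rc=0 || rc=$?
    # grep -c exits non-zero when the count is 0, so guard it with "|| true".
    warnings="$(printf '%s\n' "$output" | grep -c 'WARNING' || true)"
    errors="$(printf '%s\n' "$output" | grep -c 'ERROR' || true)"
    echo "exit=$rc warnings=$warnings errors=$errors"
}

# Hypothetical stand-in for the failing command: two warnings, exit code 1.
fake_cloud_init() {
    echo "util.py[WARNING]: No instance datasource found! Likely bad things to come!"
    echo "activators.py[WARNING]: Running ['netplan', 'apply'] resulted in stderr output"
    return 1
}

summarize_run fake_cloud_init
# On an affected machine, the real command would be wrapped the same way:
#   summarize_run cloud-init --debug init --file /etc/cloud/cloud.cfg.d/<config>.cfg
```

With the stand-in, this prints exit=1 warnings=2 errors=0, which mirrors the situation described here: the exit code signals failure while the log contains no error line.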

Workaround

It seems possible to create a custom OSP from the default ubuntu-osp and run cloud-init init --local before the real command. This is only a workaround and not intended for final inclusion in OSM, because we do not yet understand why it works.

diff --git a/deploy/osps/default/osp-ubuntu.yaml b/deploy/osps/default/osp-ubuntu.yaml
index 90e680d..c58805b 100644
--- a/deploy/osps/default/osp-ubuntu.yaml
+++ b/deploy/osps/default/osp-ubuntu.yaml
@@ -111,6 +111,7 @@ spec:
               # Compare the semver values of cloud-init versions to determine the correct command to run.
               # This is required because the command line arguments for cloud-init changed in version 24.1, for details: https://github.com/canonical/cloud-init/releases/tag/24.1.
               if [[ $(echo -e "24.0.0\n$CLOUD_INIT_VERSION" | sort -V | head -n1) = "24.0.0" ]]; then
+                  cloud-init init --local --file /etc/cloud/cloud.cfg.d/{{ .SecretName }}.cfg
                   cloud-init init --file /etc/cloud/cloud.cfg.d/{{ .SecretName }}.cfg
               else
                   cloud-init --file /etc/cloud/cloud.cfg.d/{{ .SecretName }}.cfg init
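The version check in the diff relies on sort -V to compare version strings. The following is a minimal sketch of that comparison in isolation, assuming GNU coreutils (for sort -V); version_at_least is a hypothetical helper name, and the sample versions are illustrative only.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when version $1 is greater than or equal
# to version $2, using the same sort -V trick as the OSP template.
version_at_least() {
    local candidate="$1" minimum="$2"
    # sort -V orders version strings; if the minimum sorts first (or both
    # are equal), the candidate is at least the minimum.
    [ "$(printf '%s\n%s\n' "$minimum" "$candidate" | sort -V | head -n1)" = "$minimum" ]
}

# Illustrative versions only, not taken from the issue.
for v in 23.4.4 24.0.0 24.1.3; do
    if version_at_least "$v" 24.0.0; then
        echo "$v: if-branch (cloud-init init --file ...)"
    else
        echo "$v: else-branch (cloud-init --file ... init)"
    fi
done
```

In this sketch, 23.4.4 takes the else-branch, while 24.0.0 and 24.1.3 take the if-branch, which is the branch where the workaround adds the extra cloud-init init --local call.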