no control plane node rolling update is triggered on a change of preRKE2Commands #307

Closed
tmmorin opened this issue Apr 24, 2024 · 1 comment · Fixed by #325
Labels
kind/bug Something isn't working needs-priority Indicates an issue or PR needs a priority assigning to it needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one.

tmmorin commented Apr 24, 2024

I observed the following after adding a simple test command (echo 42 > /tmp/test) to preRKE2Commands in both my RKE2ControlPlane resource and the RKE2ConfigTemplate resource used for a MachineDeployment.

$ k get rke2controlplane management-cluster-control-plane -o yaml | yq '.spec.preRKE2Commands[-1]'
echo 42 > /tmp/test
$ k get rke2configtemplate management-cluster-md0-4a6c705d7e -o yaml | yq '.spec.template.spec.preRKE2Commands[-1]'
echo 42 > /tmp/test

As expected, for the MachineDeployment a node rolling update was triggered.

All the RKE2Config resources for my MD have this test command:

$ k get rke2config -o yaml -l cluster.x-k8s.io/deployment-name=management-cluster-md0 | yq '.items[] | {"name":.metadata.name,"test":(.spec.preRKE2Commands[-1] | test("42"))}'
name: management-cluster-md0-4a6c705d7e-7dfxr
test: true
name: management-cluster-md0-4a6c705d7e-jdgcg
test: true
name: management-cluster-md0-4a6c705d7e-qk5xd
test: true

However, for the control plane, no rolling update was triggered.

The RKE2Config resources for the control plane don't have the test command:

$ k get rke2config -o yaml -l cluster.x-k8s.io/control-plane | yq '.items[] | {"name":.metadata.name,"test":(.spec.preRKE2Commands[-1] | test("42"))}'
name: management-cluster-control-plane-2qsws
test: false
name: management-cluster-control-plane-r5czr
test: false
name: management-cluster-control-plane-v48rj
test: false
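
The same check can be scripted without yq. The following is a minimal sketch, assuming the output of `kubectl get rke2config -o json` has already been parsed into a Python dict. The helper name `stale_configs` and the sample data are hypothetical, and unlike the yq expression above (which regex-matches only the last command) it checks whether the expected command appears anywhere in the list.

```python
from typing import Any


def stale_configs(config_list: dict[str, Any], expected_command: str) -> list[str]:
    """Return names of RKE2Config items whose preRKE2Commands lack expected_command."""
    stale = []
    for item in config_list.get("items", []):
        # A missing or null preRKE2Commands field is treated as an empty list.
        commands = item.get("spec", {}).get("preRKE2Commands") or []
        if expected_command not in commands:
            stale.append(item["metadata"]["name"])
    return stale


# Hypothetical sample shaped like `kubectl get rke2config -o json` output.
sample = {
    "items": [
        {"metadata": {"name": "cp-a"}, "spec": {"preRKE2Commands": ["echo hello"]}},
        {
            "metadata": {"name": "md-a"},
            "spec": {"preRKE2Commands": ["echo hello", "echo 42 > /tmp/test"]},
        },
        {"metadata": {"name": "cp-b"}, "spec": {}},
    ]
}

print(stale_configs(sample, "echo 42 > /tmp/test"))  # ['cp-a', 'cp-b']
```

An empty result would mean every listed RKE2Config carries the new command; here the two hypothetical control plane entries are reported as stale.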

The status of the RKE2ControlPlane is fully ready though, showing no sign of any rolling update being in progress:

$ k get rke2controlplane management-cluster-control-plane -o yaml | yq .status
availableServerIPs:
  - 172.20.129.32
conditions:
  - lastTransitionTime: "2024-04-24T14:47:18Z"
    status: "True"
    type: Ready
  - lastTransitionTime: "2024-04-15T10:16:05Z"
    status: "True"
    type: Available
  - lastTransitionTime: "2024-04-15T10:16:05Z"
    status: "True"
    type: CertificatesAvailable
  - lastTransitionTime: "2024-04-24T14:51:15Z"
    status: "True"
    type: ControlPlaneComponentsHealthy
  - lastTransitionTime: "2024-04-24T14:47:18Z"
    status: "True"
    type: MachinesReady
  - lastTransitionTime: "2024-04-24T14:46:29Z"
    status: "True"
    type: MachinesSpecUpToDate
  - lastTransitionTime: "2024-04-24T14:47:18Z"
    status: "True"
    type: Resized
initialized: true
observedGeneration: 11
ready: true
readyReplicas: 3
replicas: 3
updatedReplicas: 3
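
To spell out why this status shows no rollout in progress, here is a small sketch that reads the two fields that look most relevant. Treating `MachinesSpecUpToDate` and `updatedReplicas` as rollout indicators is an assumption based on their names, not something confirmed by the provider's documentation here, and the helper `rollout_in_progress` is hypothetical.

```python
def rollout_in_progress(status: dict) -> bool:
    """Heuristic: True if the status suggests machines are still being replaced."""
    conditions = {c["type"]: c["status"] for c in status.get("conditions", [])}
    # Assumption: this condition turns False while machines are out of date.
    spec_up_to_date = conditions.get("MachinesSpecUpToDate") == "True"
    # Assumption: updatedReplicas counts machines that match the desired spec.
    all_updated = status.get("updatedReplicas") == status.get("replicas")
    return not (spec_up_to_date and all_updated)


# Relevant subset of the status shown above.
status = {
    "conditions": [
        {"type": "Ready", "status": "True"},
        {"type": "MachinesSpecUpToDate", "status": "True"},
    ],
    "replicas": 3,
    "updatedReplicas": 3,
}

print(rollout_in_progress(status))  # False
```

Under these assumptions the controller considers all three machines up to date, even though their RKE2Config resources lack the new command.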

The expected behavior is that a rolling update is triggered.

Note that a rolling update is properly triggered on a change of, for instance, spec.agentConfig.kubelet.extraArgs.

(The title of this issue is about "a change of preRKE2Commands", because I didn't try to be exhaustive in this bug report, but we observed the issue on other fields and it's likely not specific to preRKE2Commands)
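
The observed symptom (a change to agentConfig rolls the nodes, a change to preRKE2Commands does not) can be illustrated with a purely hypothetical sketch. Nothing in this report establishes how the controller actually compares specs; the functions below are invented for illustration and only contrast a check restricted to some fields with one that covers every field.

```python
def changed_fields(desired: dict, current: dict) -> list[str]:
    """List the top-level spec fields whose values differ."""
    keys = sorted(set(desired) | set(current))
    return [k for k in keys if desired.get(k) != current.get(k)]


def needs_rollout_full(desired: dict, current: dict) -> bool:
    """Any differing field makes the machine out of date."""
    return bool(changed_fields(desired, current))


def needs_rollout_partial(desired: dict, current: dict, compared=("agentConfig",)) -> bool:
    """Hypothetical check that only looks at a fixed subset of fields."""
    return any(k in compared for k in changed_fields(desired, current))


# Hypothetical specs: only preRKE2Commands differs.
desired = {
    "agentConfig": {"kubelet": {"extraArgs": []}},
    "preRKE2Commands": ["echo 42 > /tmp/test"],
}
current = {
    "agentConfig": {"kubelet": {"extraArgs": []}},
    "preRKE2Commands": [],
}

print(changed_fields(desired, current))         # ['preRKE2Commands']
print(needs_rollout_full(desired, current))     # True
print(needs_rollout_partial(desired, current))  # False
```

A comparison that covers every user-settable field would give the expected behavior for preRKE2Commands and for the other affected fields mentioned above.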

@tmmorin tmmorin added kind/bug Something isn't working needs-priority Indicates an issue or PR needs a priority assigning to it needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Apr 24, 2024

tmmorin commented Apr 25, 2024

hello @belgaied2 @richardcase @Danil-Grigorev -- fyi ^

we can work around this limitation by triggering the rolling update via an arbitrary change to some benign spec.agentConfig.kubelet.extraArgs entry, but this really isn't great: for an unaware user, an intended change will still silently fail to be applied
