✨ v1.28: Prepare quickstart, capd and tests for the new release including kind bump #9160

Merged

Conversation

chrischdi
Member

@chrischdi chrischdi commented Aug 10, 2023

What this PR does / why we need it:

According to #8708 : Modify quickstart and CAPD to use the new Kubernetes release:

  • Bump the Kubernetes version in:

    • test/*: search for occurrences of the previous Kubernetes version
    • Tiltfile
  • Ensure the latest available kind version is used (including the latest images for this kind release)

  • Verify the quickstart manually

  • Prior art: ⚠️ Use Kubernetes 1.25 in Quick Start docs and CAPD. #7156

  • Bump InitWithKubernetesVersion and WorkloadKubernetesVersion in clusterctl_upgrade_test.go (see the sketch below)
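
For illustration, a minimal Go sketch of the shape of that bump (the real test wires these values through the e2e framework's ClusterctlUpgradeSpecInput; the surrounding scaffolding and the exact previous versions are omitted, so treat names and values as illustrative):

package e2e

// Versions for the clusterctl upgrade test: the management cluster is
// initialized at one Kubernetes version, then the workload cluster is
// upgraded to the new release.
var (
	initWithKubernetesVersion = "v1.28.0" // previously a v1.27.x release
	workloadKubernetesVersion = "v1.28.0"
)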

Open TODOs:

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Relates to #8708

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 10, 2023
@chrischdi chrischdi marked this pull request as draft August 10, 2023 05:53
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Aug 10, 2023
Resolved review threads (outdated): hack/ensure-kind.sh, test/go.mod, test/e2e/clusterctl_upgrade_test.go, test/e2e/config/docker.yaml
Contributor

@killianmuldoon killianmuldoon left a comment

One early nit: let's change the breaking-change symbol in the header to a ✨. @sbueringer mentioned this when we were drafting the most recent CAPI release notes.

@sbueringer
Member

So far so good. I'll take another look once 1.28 and (probably) kind are out, etc.

Resolved review threads (outdated): docs/book/src/user/quick-start.md, docs/book/src/developer/tilt.md
@chrischdi chrischdi changed the title [WIP] ⚠️ v1.28: Prepare quickstart, capd and tests for the new release including kind bump [WIP] ✨ v1.28: Prepare quickstart, capd and tests for the new release including kind bump Aug 14, 2023
@chrischdi chrischdi marked this pull request as ready for review August 15, 2023 18:23
@chrischdi
Member Author

/test help

@k8s-ci-robot
Contributor

@chrischdi: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-dualstack-and-ipv6-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-mink8s-main
  • /test pull-cluster-api-e2e-scale-main-experimental
  • /test pull-cluster-api-e2e-workload-upgrade-1-27-latest-main
  • /test pull-cluster-api-test-mink8s-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-verify-main

In response to this:

/test help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@chrischdi
Member Author

/test pull-cluster-api-e2e-full-main

@chrischdi
Member Author

/retitle ✨ v1.28: Prepare quickstart, capd and tests for the new release including kind bump

@k8s-ci-robot k8s-ci-robot changed the title [WIP] ✨ v1.28: Prepare quickstart, capd and tests for the new release including kind bump ✨ v1.28: Prepare quickstart, capd and tests for the new release including kind bump Aug 16, 2023
@sbueringer
Member

sbueringer commented Aug 16, 2023

My current guess: the built image shows v1.28.0 as the kubelet version. The commit it was resolved to is the one right before the v1.28.0 tag 🤔

This shouldn't be it, because the MachinePool should use the same image as KCP/MD (but it looks like it doesn't).

For example: https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api/9160/pull-cluster-api-e2e-workload-upgrade-1-27-latest-main/1691763868927266816/artifacts/clusters/bootstrap/resources/k8s-upgrade-and-conformance-ow8pqg/Machine/k8s-upgrade-and-conformance-rjzpyt-md-0-7htck-5db77756xqd9b6pfz.yaml

EDIT: Ah sorry, Kubelet vs. Machine version..

@sbueringer
Member

sbueringer commented Aug 16, 2023

@chrischdi WDYT about extending our collection stuff to also collect Nodes from the workload cluster (and not only kube-system Pods)?

In any case, I would have expected MachinePools to use our self-built kind image, which actually has v1.28.0-rc.1.9+3fb5377b25ec51 as its kubelet version.
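
For illustration, a minimal sketch of what collecting Nodes from a workload cluster could look like, assuming plain client-go and a hypothetical kubeconfig path (the real e2e framework would go through its own ClusterProxy plumbing instead):

package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Hypothetical kubeconfig for the workload cluster under test.
	cfg, err := clientcmd.BuildConfigFromFlags("", "workload.kubeconfig")
	if err != nil {
		panic(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	nodes, err := cs.CoreV1().Nodes().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	for _, n := range nodes.Items {
		// KubeletVersion is where a v1.28.0 vs v1.28.0-rc.1.9+... mismatch shows up.
		fmt.Printf("%s\t%s\n", n.Name, n.Status.NodeInfo.KubeletVersion)
	}
}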

@chrischdi
Member Author

@chrischdi WDYT about extending our collection stuff to also collect Nodes from the workload cluster (and not only kube-system Pods)?

Jep that sounds great.

My current guess: the built image shows v1.28.0 as the kubelet version. The commit it was resolved to is the one right before the v1.28.0 tag 🤔

This shouldn't be it, because the MachinePool should use the same image as KCP/MD (but it looks like it doesn't).

For example: storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api/9160/pull-cluster-api-e2e-workload-upgrade-1-27-latest-main/1691763868927266816/artifacts/clusters/bootstrap/resources/k8s-upgrade-and-conformance-ow8pqg/Machine/k8s-upgrade-and-conformance-rjzpyt-md-0-7htck-5db77756xqd9b6pfz.yaml

EDIT: Ah sorry, Kubelet vs. Machine version..

You're totally right.

The MachinePool Machines seem to have run the v1.28.0 image:

curl -s --output - https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api/9160/pull-cluster-api-e2e-workload-upgrade-1-27-latest-main/1691763868927266816/artifacts/clusters/k8s-upgrade-and-conformance-rjzpyt/machine-pools/k8s-upgrade-and-conformance-rjzpyt-mp-0/k8s-upgrade-and-conformance-rjzpyt-worker-16nsza/kubelet-version.txt
Kubernetes v1.28.0

It's also visible that the kindest/node:v1.28.0 image got pulled or built:

❯ curl -s https://storage.googleapis.com/kubernetes-jenkins/pr-logs/pull/kubernetes-sigs_cluster-api/9160/pull-cluster-api-e2e-workload-upgrade-1-27-latest-main/1691763868927266816/artifacts/localhost/docker-images.txt | grep 'kindest/node'
kindest/node                                                            v1.27.4                                          7aadb4f55cb2   13 minutes ago   1.05GB
kindest/node                                                            v1.28.0-rc.1.9_3fb5377b25ec51                    6f55849c1353   22 minutes ago   1.07GB
kindest/node                                                            v1.28.0                                          ad70201dab13   14 hours ago     950MB

The MachinePool and DockerMachinePool look correct though...

I'll take a look at it tomorrow. I also started the same test on a different branch/PR (#9209) to see if it is even caused by this PR.

@sbueringer
Member

sbueringer commented Aug 16, 2023

Sounds good! Probably the quickest way to figure this one out is to just run it locally.

I think it should be this PR, as we have a corresponding periodic which is green. But it doesn't hurt to try.
(It should be the same as https://testgrid.k8s.io/sig-cluster-lifecycle-cluster-api#capi-e2e-main-1-27-latest; otherwise it's a ProwJob config error.)

@sbueringer
Member

My bet is on the kind mapper btw!

@chrischdi
Member Author

chrischdi commented Aug 17, 2023

It's reproducible locally with tilt up plus the following Cluster and MachinePool manifests (taken from the failed test above):

apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: quick-start
  namespace: default
spec:
  controlPlane:
    machineHealthCheck:
      maxUnhealthy: 100%
      unhealthyConditions:
      - status: "False"
        timeout: 20s
        type: e2e.remediation.condition
    machineInfrastructure:
      ref:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: quick-start-control-plane
    metadata:
      annotations:
        ClusterClass.controlPlane.annotation: ClusterClass.controlPlane.annotationValue
      labels:
        ClusterClass.controlPlane.label: ClusterClass.controlPlane.labelValue
    ref:
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: quick-start-control-plane
  infrastructure:
    ref:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      name: quick-start-cluster
  patches:
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/loadBalancer
        valueFrom:
          template: |
            imageRepository: {{ .lbImageRepository }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerClusterTemplate
        matchResources:
          infrastructureCluster: true
    name: lbImageRepository
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/etcd
        valueFrom:
          template: |
            local:
              imageTag: {{ .etcdImageTag }}
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets tag to use for the etcd image in the KubeadmControlPlane.
    name: etcdImageTag
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/dns
        valueFrom:
          template: |
            imageTag: {{ .coreDNSImageTag }}
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets tag to use for the coreDNS image in the KubeadmControlPlane.
    name: coreDNSImageTag
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/customImage
        valueFrom:
          template: |
            kindest/node:{{ .builtin.machineDeployment.version | replace "+" "_" }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          machineDeploymentClass:
            names:
            - default-worker
    - jsonPatches:
      - op: add
        path: /spec/template/spec/customImage
        valueFrom:
          template: |
            kindest/node:{{ .builtin.controlPlane.version | replace "+" "_" }}
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          controlPlane: true
    description: Sets the container image that is used for running dockerMachines
      for the controlPlane and default-worker machineDeployments.
    name: customImage
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/preLoadImages
        valueFrom:
          variable: preLoadImages
      selector:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        matchResources:
          controlPlane: true
          machineDeploymentClass:
            names:
            - default-worker
    description: |
      Sets the container images to preload to the node that is used for running dockerMachines.
      This is especially required for self-hosted e2e tests to ensure the required controller images to be available
      and reduce load to public registries.
    name: preloadImages
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/rolloutStrategy/rollingUpdate/maxSurge
        valueFrom:
          template: '{{ .kubeadmControlPlaneMaxSurge }}'
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Sets the maxSurge value used for rolloutStrategy in the KubeadmControlPlane.
    enabledIf: '{{ ne .kubeadmControlPlaneMaxSurge "" }}'
    name: kubeadmControlPlaneMaxSurge
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/initConfiguration/nodeRegistration/taints
        value: []
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/joinConfiguration/nodeRegistration/taints
        value: []
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    enabledIf: '{{ not .controlPlaneTaint }}'
    name: controlPlaneTaint
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/joinConfiguration/nodeRegistration/kubeletExtraArgs
        value:
          cloud-provider: external
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/initConfiguration/nodeRegistration/kubeletExtraArgs
        value:
          cloud-provider: external
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Configures kubelet to run with an external cloud provider for control
      plane nodes.
    enabledIf: '{{ .externalCloudProvider }}'
    name: controlPlaneExternalCloudProvider
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/joinConfiguration/nodeRegistration/kubeletExtraArgs
        value:
          cloud-provider: external
      selector:
        apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
        kind: KubeadmConfigTemplate
        matchResources:
          machineDeploymentClass:
            names:
            - '*-worker'
    description: Configures kubelet to run with an external cloud provider for machineDeployment
      nodes.
    enabledIf: '{{ .externalCloudProvider }}'
    name: machineDeploymentExternalCloudProvider
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/initConfiguration/localAPIEndpoint
        value:
          advertiseAddress: '::'
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Configures KCP to use IPv6 for its localAPIEndpoint.
    enabledIf: '{{ .ipv6Primary }}'
    name: localEndpointIPv6
  - definitions:
    - jsonPatches:
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/apiServer/extraArgs
        value:
          admission-control-config-file: /etc/kubernetes/kube-apiserver-admission-pss.yaml
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/clusterConfiguration/apiServer/extraVolumes
        value:
        - hostPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          mountPath: /etc/kubernetes/kube-apiserver-admission-pss.yaml
          name: admission-pss
          pathType: File
          readOnly: true
      - op: add
        path: /spec/template/spec/kubeadmConfigSpec/files
        valueFrom:
          template: |
            - content: |
                apiVersion: apiserver.config.k8s.io/v1
                kind: AdmissionConfiguration
                plugins:
                - name: PodSecurity
                  configuration:
                    apiVersion: pod-security.admission.config.k8s.io/v1{{ if semverCompare "< v1.25" .builtin.controlPlane.version }}beta1{{ end }}
                    kind: PodSecurityConfiguration
                    defaults:
                      enforce: "baseline"
                      enforce-version: "latest"
                      audit: "baseline"
                      audit-version: "latest"
                      warn: "baseline"
                      warn-version: "latest"
                    exemptions:
                      usernames: []
                      runtimeClasses: []
                      namespaces: [kube-system]
              path: /etc/kubernetes/kube-apiserver-admission-pss.yaml
      selector:
        apiVersion: controlplane.cluster.x-k8s.io/v1beta1
        kind: KubeadmControlPlaneTemplate
        matchResources:
          controlPlane: true
    description: Adds an admission configuration for PodSecurity to the kube-apiserver.
    enabledIf: '{{ semverCompare ">= v1.24" .builtin.controlPlane.version }}'
    name: podSecurityStandard
  variables:
  - name: lbImageRepository
    required: true
    schema:
      openAPIV3Schema:
        default: kindest
        type: string
  - name: etcdImageTag
    required: true
    schema:
      openAPIV3Schema:
        default: ""
        description: etcdImageTag sets the tag for the etcd image.
        example: 3.5.3-0
        type: string
  - name: coreDNSImageTag
    required: true
    schema:
      openAPIV3Schema:
        default: ""
        description: coreDNSImageTag sets the tag for the coreDNS image.
        example: v1.8.5
        type: string
  - name: kubeadmControlPlaneMaxSurge
    required: false
    schema:
      openAPIV3Schema:
        default: ""
        description: kubeadmControlPlaneMaxSurge is the maximum number of control
          planes that can be scheduled above or under the desired number of control
          plane machines.
        example: "0"
        type: string
  - name: preLoadImages
    required: false
    schema:
      openAPIV3Schema:
        default: []
        description: preLoadImages sets the images for the docker machines to preload.
        items:
          type: string
        type: array
  - name: controlPlaneTaint
    required: false
    schema:
      openAPIV3Schema:
        default: true
        type: boolean
  - name: externalCloudProvider
    required: false
    schema:
      openAPIV3Schema:
        default: false
        type: boolean
  - name: ipv6Primary
    required: false
    schema:
      openAPIV3Schema:
        default: false
        type: boolean
  workers:
    machineDeployments:
    - class: default-worker
      machineHealthCheck:
        maxUnhealthy: 100%
        unhealthyConditions:
        - status: "False"
          timeout: 20s
          type: e2e.remediation.condition
      template:
        bootstrap:
          ref:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            name: quick-start-default-worker-bootstraptemplate
        infrastructure:
          ref:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: DockerMachineTemplate
            name: quick-start-default-worker-machinetemplate
        metadata:
          annotations:
            ClusterClass.machineDeployment.annotation: ClusterClass.machineDeployment.annotationValue
          labels:
            ClusterClass.machineDeployment.label: ClusterClass.machineDeployment.labelValue
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerClusterTemplate
metadata:
  annotations:
    InfrastructureClusterTemplate.annotation: InfrastructureClusterTemplate.annotationValue
  labels:
    InfrastructureClusterTemplate.label: InfrastructureClusterTemplate.labelValue
  name: quick-start-cluster
  namespace: default
spec:
  template:
    metadata:
      annotations:
        InfrastructureClusterTemplate.template.annotation: InfrastructureClusterTemplate.template.annotationValue
      labels:
        InfrastructureClusterTemplate.template.label: InfrastructureClusterTemplate.template.labelValue
    spec:
      failureDomains:
        fd1:
          controlPlane: true
        fd2:
          controlPlane: true
        fd3:
          controlPlane: true
        fd4:
          controlPlane: false
        fd5:
          controlPlane: false
        fd6:
          controlPlane: false
        fd7:
          controlPlane: false
        fd8:
          controlPlane: false
---
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlaneTemplate
metadata:
  annotations:
    ControlPlaneTemplate.annotation: ControlPlaneTemplate.annotationValue
  labels:
    ControlPlaneTemplate.label: ControlPlaneTemplate.labelValue
  name: quick-start-control-plane
  namespace: default
spec:
  template:
    metadata:
      annotations:
        ControlPlaneTemplate.template.annotation: ControlPlaneTemplate.template.annotationValue
      labels:
        ControlPlaneTemplate.template.label: ControlPlaneTemplate.template.labelValue
    spec:
      kubeadmConfigSpec:
        clusterConfiguration:
          apiServer:
            certSANs:
            - localhost
            - host.docker.internal
            - '::'
            - ::1
            - 127.0.0.1
            - 0.0.0.0
          controllerManager:
            extraArgs:
              enable-hostpath-provisioner: "true"
        initConfiguration:
          nodeRegistration:
            kubeletExtraArgs:
              eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
        joinConfiguration:
          nodeRegistration:
            kubeletExtraArgs:
              eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
      machineTemplate:
        metadata:
          annotations:
            ControlPlaneTemplate.machineTemplate.annotation: ControlPlaneTemplate.machineTemplate.annotationValue
          labels:
            ControlPlaneTemplate.machineTemplate.label: ControlPlaneTemplate.machineTemplate.labelValue
        nodeDrainTimeout: 1s
      rolloutBefore:
        certificatesExpiryDays: 21
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  annotations:
    InfraMachineTemplate.controlPlane.annotation: InfraMachineTemplate.controlPlane.annotationValue
  labels:
    InfraMachineTemplate.controlPlane.label: InfraMachineTemplate.controlPlane.labelValue
  name: quick-start-control-plane
  namespace: default
spec:
  template:
    metadata:
      annotations:
        InfraMachineTemplate.controlPlane.template.annotation: InfraMachineTemplate.controlPlane.template.annotationValue
      labels:
        InfraMachineTemplate.controlPlane.template.label: InfraMachineTemplate.controlPlane.template.labelValue
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachineTemplate
metadata:
  annotations:
    InfraMachineTemplate.machineDeployment.annotation: InfraMachineTemplate.machineDeployment.annotationValue
  labels:
    InfraMachineTemplate.machineDeployment.label: InfraMachineTemplate.machineDeployment.labelValue
  name: quick-start-default-worker-machinetemplate
  namespace: default
spec:
  template:
    metadata:
      annotations:
        InfraMachineTemplate.machineDeployment.template.annotation: InfraMachineTemplate.machineDeployment.template.annotationValue
      labels:
        InfraMachineTemplate.machineDeployment.template.label: InfraMachineTemplate.machineDeployment.template.labelValue
    spec:
      extraMounts:
      - containerPath: /var/run/docker.sock
        hostPath: /var/run/docker.sock
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfigTemplate
metadata:
  annotations:
    BootstrapConfigTemplate.machineDeployment.annotation: BootstrapConfigTemplate.machineDeployment.annotationValue
  labels:
    BootstrapConfigTemplate.machineDeployment.label: BootstrapConfigTemplate.machineDeployment.labelValue
  name: quick-start-default-worker-bootstraptemplate
  namespace: default
spec:
  template:
    metadata:
      annotations:
        BootstrapConfigTemplate.machineDeployment.template.annotation: BootstrapConfigTemplate.machineDeployment.template.annotationValue
      labels:
        BootstrapConfigTemplate.machineDeployment.template.label: BootstrapConfigTemplate.machineDeployment.template.labelValue
    spec:
      joinConfiguration:
        nodeRegistration:
          kubeletExtraArgs:
            eviction-hard: nodefs.available<0%,nodefs.inodesFree<0%,imagefs.available<0%
---
apiVersion: v1
binaryData: null
data:
  resources: |
    # kindnetd networking manifest
    ---
    kind: ClusterRole
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: kindnet
    rules:
      - apiGroups:
          - ""
        resources:
          - nodes
        verbs:
          - list
          - watch
          - patch
      - apiGroups:
          - ""
        resources:
          - configmaps
        verbs:
          - get
    ---
    kind: ClusterRoleBinding
    apiVersion: rbac.authorization.k8s.io/v1
    metadata:
      name: kindnet
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: kindnet
    subjects:
      - kind: ServiceAccount
        name: kindnet
        namespace: kube-system
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: kindnet
      namespace: kube-system
    ---
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: kindnet
      namespace: kube-system
      labels:
        tier: node
        app: kindnet
        k8s-app: kindnet
    spec:
      selector:
        matchLabels:
          app: kindnet
      template:
        metadata:
          labels:
            tier: node
            app: kindnet
            k8s-app: kindnet
        spec:
          hostNetwork: true
          tolerations:
            - operator: Exists
              effect: NoSchedule
          serviceAccountName: kindnet
          containers:
            - name: kindnet-cni
              image: kindest/kindnetd:v20230511-dc714da8
              env:
                - name: HOST_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.hostIP
                - name: POD_IP
                  valueFrom:
                    fieldRef:
                      fieldPath: status.podIP
                # We're using the dualstack CIDRs here. The order doesn't matter for kindnet as the loops are run concurrently.
                # REF: https://github.com/kubernetes-sigs/kind/blob/3dbeb894e3092a336ab4278d3823e73a1d66aff7/images/kindnetd/cmd/kindnetd/main.go#L149-L175
                - name: POD_SUBNET
                  value: '192.168.0.0/16,fd00:100:96::/48'
              volumeMounts:
                - name: cni-cfg
                  mountPath: /etc/cni/net.d
                - name: xtables-lock
                  mountPath: /run/xtables.lock
                  readOnly: false
                - name: lib-modules
                  mountPath: /lib/modules
                  readOnly: true
              resources:
                requests:
                  cpu: "100m"
                  memory: "50Mi"
                limits:
                  cpu: "100m"
                  memory: "50Mi"
              securityContext:
                privileged: false
                capabilities:
                  add: ["NET_RAW", "NET_ADMIN"]
          volumes:
            - name: cni-bin
              hostPath:
                path: /opt/cni/bin
                type: DirectoryOrCreate
            - name: cni-cfg
              hostPath:
                path: /etc/cni/net.d
                type: DirectoryOrCreate
            - name: xtables-lock
              hostPath:
                path: /run/xtables.lock
                type: FileOrCreate
            - name: lib-modules
              hostPath:
                path: /lib/modules
kind: ConfigMap
metadata:
  name: cni-k8s-upgrade-and-conformance-rjzpyt-crs-0
  namespace: default
---
apiVersion: addons.cluster.x-k8s.io/v1beta1
kind: ClusterResourceSet
metadata:
  name: k8s-upgrade-and-conformance-rjzpyt-crs-0
  namespace: default
spec:
  clusterSelector:
    matchLabels:
      cni: k8s-upgrade-and-conformance-rjzpyt-crs-0
  resources:
  - kind: ConfigMap
    name: cni-k8s-upgrade-and-conformance-rjzpyt-crs-0
  strategy: ApplyOnce
---
apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
kind: KubeadmConfig
metadata:
  name: k8s-upgrade-and-conformance-rjzpyt-mp-0-config
  namespace: default
spec:
  joinConfiguration:
    nodeRegistration: {}
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  labels:
    cni: k8s-upgrade-and-conformance-rjzpyt-crs-0
  name: k8s-upgrade-and-conformance-rjzpyt
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    serviceDomain: cluster.local
    services:
      cidrBlocks:
      - 10.128.0.0/12
  topology:
    class: quick-start
    controlPlane:
      metadata:
        annotations:
          Cluster.topology.controlPlane.annotation: Cluster.topology.controlPlane.annotationValue
        labels:
          Cluster.topology.controlPlane.label: Cluster.topology.controlPlane.labelValue
          Cluster.topology.controlPlane.label.node.cluster.x-k8s.io: Cluster.topology.controlPlane.nodeLabelValue
      nodeDeletionTimeout: 30s
      nodeVolumeDetachTimeout: 5m
      replicas: 1
    variables:
    - name: etcdImageTag
      value: ""
    - name: coreDNSImageTag
      value: ""
    - name: preLoadImages
      value: []
    version: v1.28.0-rc.1.9+3fb5377b25ec51
    workers:
      machineDeployments:
      - class: default-worker
        failureDomain: fd4
        metadata:
          annotations:
            Cluster.topology.machineDeployment.annotation: Cluster.topology.machineDeployment.annotationValue
          labels:
            Cluster.topology.machineDeployment.label: Cluster.topology.machineDeployment.labelValue
            Cluster.topology.machineDeployment.label.node.cluster.x-k8s.io: Cluster.topology.machineDeployment.nodeLabelValue
        minReadySeconds: 5
        name: md-0
        nodeDeletionTimeout: 30s
        nodeVolumeDetachTimeout: 5m
        replicas: 2
        strategy:
          rollingUpdate:
            maxSurge: 20%
            maxUnavailable: 0
          type: RollingUpdate
---
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachinePool
metadata:
  name: k8s-upgrade-and-conformance-rjzpyt-mp-0
  namespace: default
spec:
  clusterName: k8s-upgrade-and-conformance-rjzpyt
  failureDomains:
  - fd4
  - fd5
  - fd6
  - fd7
  - fd8
  replicas: 2
  template:
    spec:
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfig
          name: k8s-upgrade-and-conformance-rjzpyt-mp-0-config
      clusterName: k8s-upgrade-and-conformance-rjzpyt
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachinePool
        name: k8s-upgrade-and-conformance-rjzpyt-dmp-0
      version: v1.28.0-rc.1.9+3fb5377b25ec51
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: DockerMachinePool
metadata:
  name: k8s-upgrade-and-conformance-rjzpyt-dmp-0
  namespace: default
spec:
  template:
    preLoadImages: []

What I am able to observe: for the normal DockerMachines, we set .spec.template.spec.customImage to kindest/node:v1.28.0-rc.1.9_3fb5377b25ec51. This is done via a ClusterClass patch (see the YAML above). We don't do the same for the DockerMachinePool.

For the DockerMachinePool, the mapper instead returns a "BestGuess", which gets called from the dockermachinepool_controller. In this case that resolves to v1.28.0: the mapper only matches on major.minor.patch and ignores the pre-release part (rc.1.9).
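
For illustration, a minimal Go sketch (not the actual kind mapper code; the map contents are hypothetical) of why a lookup keyed on major.minor.patch loses the pre-release information:

package main

import (
	"fmt"
	"strings"
)

// knownImages maps major.minor.patch versions to published kindest/node tags.
// This PR effectively adds the v1.28.0 entry to the real mapper.
var knownImages = map[string]string{
	"v1.27.4": "kindest/node:v1.27.4",
	"v1.28.0": "kindest/node:v1.28.0",
}

// bestGuessImage mimics the "best guess": strip any pre-release/build suffix
// before the lookup, so v1.28.0-rc.1.9+3fb5377b25ec51 resolves to the GA image.
func bestGuessImage(version string) (string, bool) {
	core := version
	if i := strings.IndexAny(core, "-+"); i >= 0 {
		core = core[:i]
	}
	img, ok := knownImages[core]
	return img, ok
}

func main() {
	img, _ := bestGuessImage("v1.28.0-rc.1.9+3fb5377b25ec51")
	fmt.Println(img) // kindest/node:v1.28.0, not the self-built rc image
}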

@sbueringer
Member

sbueringer commented Aug 17, 2023

Hm. This will be brought in sync soon with Willie's CC MP PR. What options do you see in the meantime?

I'm also wondering why our periodic works.

@chrischdi
Member Author

chrischdi commented Aug 17, 2023

Hm. This will be brought in sync soon with Willie's CC MP PR. What options do you see in the meantime?

Also wondering why our periodic works

Periodic works because the mapper does not have an entry for v1.28.0 yet (that entry gets introduced in this PR).

Or in other words: this PR introduces the mapper entry for v1.28.0. With that, the mapper returns v1.28.0 as the best guess for any semver version whose major.minor.patch is 1.28.0; it ignores the pre-release part.

Current options I see:

  • Ignore the issue for now and fix it via ClusterClass support for DockerMachinePools. Ignoring would be possible by merging kubernetes/test-infra#30347 ("CAPI: bump jobs for v1.28"); this way we would no longer test upgrading to v1.28.0-rc... at all.
  • Do the same for DockerMachinePools as we do for DockerMachines: set the customImage field on the DockerMachinePool, by adding custom handling for DockerMachinePools to this code.
    • Once we have DockerMachinePools in ClusterClass we could remove this again.
    • Note: this will get pretty ugly, because we either need to rotate the DockerMachinePool referenced by the MachinePool, or we have to use the pause annotation to avoid multiple rollouts.

@chrischdi
Member Author

Taking a step back: a different fix would be to adjust https://github.com/kubernetes-sigs/cluster-api/blob/5350e793ce279aaa048ba51d87e309bbb2ee89d4/scripts/ci-e2e-lib.sh: change the handling of ci/latest-1.28 to use stable-1.28 once a stable version exists.
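
For illustration, that resolution expressed as a Go sketch (the real change would be bash in scripts/ci-e2e-lib.sh; the dl.k8s.io files are the official version markers, everything else here is an assumption of this sketch):

package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// fetchMarker downloads a dl.k8s.io version marker file, e.g. "v1.28.0".
func fetchMarker(url string) (string, bool) {
	resp, err := http.Get(url)
	if err != nil || resp.StatusCode != http.StatusOK {
		return "", false
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", false
	}
	return strings.TrimSpace(string(b)), true
}

// resolveVersion prefers the stable marker for a release series and only
// falls back to the CI marker while no stable release exists yet.
func resolveVersion(marker string) (string, bool) {
	series := strings.TrimPrefix(marker, "ci/latest-") // e.g. "1.28"
	if v, ok := fetchMarker("https://dl.k8s.io/release/stable-" + series + ".txt"); ok {
		return v, true
	}
	return fetchMarker("https://dl.k8s.io/ci/latest-" + series + ".txt")
}

func main() {
	v, _ := resolveVersion("ci/latest-1.28")
	fmt.Println(v) // v1.28.0 once the GA release exists
}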

@chrischdi
Member Author

/test pull-cluster-api-e2e-workload-upgrade-1-27-1-28-main

@k8s-ci-robot
Contributor

@chrischdi: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test pull-cluster-api-build-main
  • /test pull-cluster-api-e2e-main
  • /test pull-cluster-api-test-main
  • /test pull-cluster-api-verify-main

The following commands are available to trigger optional jobs:

  • /test pull-cluster-api-apidiff-main
  • /test pull-cluster-api-e2e-full-dualstack-and-ipv6-main
  • /test pull-cluster-api-e2e-full-main
  • /test pull-cluster-api-e2e-informing-main
  • /test pull-cluster-api-e2e-mink8s-main
  • /test pull-cluster-api-e2e-scale-main-experimental
  • /test pull-cluster-api-e2e-workload-upgrade-1-28-latest-main
  • /test pull-cluster-api-test-mink8s-main

Use /test all to run the following jobs that were automatically triggered:

  • pull-cluster-api-apidiff-main
  • pull-cluster-api-build-main
  • pull-cluster-api-e2e-informing-main
  • pull-cluster-api-e2e-main
  • pull-cluster-api-test-main
  • pull-cluster-api-verify-main

In response to this:

/test pull-cluster-api-e2e-workload-upgrade-1-27-1-28-main

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@chrischdi
Member Author

pull-cluster-api-e2e-workload-upgrade-1-28-latest-main

@chrischdi
Member Author

/test pull-cluster-api-e2e-workload-upgrade-1-28-latest-main

Member

@furkatgofurov7 furkatgofurov7 left a comment

/lgtm

@furkatgofurov7
Member

/area testing
/area documentation

@k8s-ci-robot k8s-ci-robot added area/testing Issues or PRs related to testing area/documentation Issues or PRs related to documentation labels Aug 17, 2023
@chrischdi
Member Author

Note: we decided to merge as-is for now, to get a working upgrade-1-28-latest job.

The issue should be fixed for the next bump via #9016.

@chrischdi
Member Author

job succeeded @sbueringer :-)

@sbueringer
Member

Nice!!

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 17, 2023
@k8s-ci-robot k8s-ci-robot merged commit 8679825 into kubernetes-sigs:main Aug 17, 2023
42 of 43 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.6 milestone Aug 17, 2023
@chrischdi chrischdi deleted the pr-1-28-quickstart-capd branch August 17, 2023 11:14
@sbueringer sbueringer mentioned this pull request Jan 19, 2024