Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Daemonset wait_for_rollout does not appear to have any impact. #2092

Open
pidydx opened this issue May 7, 2023 · 7 comments · May be fixed by #2419
Open

Daemonset wait_for_rollout does not appear to have any impact. #2092

pidydx opened this issue May 7, 2023 · 7 comments · May be fixed by #2419

Comments

@pidydx
Copy link

pidydx commented May 7, 2023

Terraform Version, Provider Version and Kubernetes Version

Terraform version: Terraform v1.4.5
Kubernetes provider version: 2.20.0
Kubernetes version: 1.24.11

Affected Resource(s)

  • kubernetes_daemonset

Steps to Reproduce

  1. terraform apply

Expected Behavior

What should have happened?
A kubernetes_daemonset should wait on creation/update until the pods are ready before allowing terraform to procede

Actual Behavior

What actually happened?
Create/Update is instant and terraform moves on to any dependent resources without waiting.

Important Factoids

References

These issues:
#919
#1053

And this documentation:
https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs/resources/daemonset#wait_for_rollout

All indicate that kubernetes_daemonset wait_for_rollout should do something.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@pidydx pidydx added the bug label May 7, 2023
@github-actions github-actions bot removed the bug label May 7, 2023
@matthi-g
Copy link

matthi-g commented May 9, 2023

I noticed the same problem. My current workaround is to use a manifest instead and define the wait block with
wait { rollout = true }

The problem with that approach is that you need permissions to list custom resource definitions. I hope either the necessity for CRDs in the manifest or the bug with the wait_for_rollout get fixed soon.

For reference of the manaifest CRD permissions:
#1665

@BBBmau
Copy link
Contributor

BBBmau commented May 10, 2023

Hello, thank you for opening this issue @matthi-g. Could you provide us with a tfconfig that reproduces this issue?

@matthi-g
Copy link

Hi @BBBmau,
the goal of my daemonset is to pre pull images and keep them at least until the daemon set is deleted. The first (buggy) version of my implementation uses the daemon_set_v1 resource:
daemon_set_v1.txt

The second version does the same thing but instead makes use of the manifest resource:
manifest.txt

Creation of the daemonset_v1 finishes after 0s but observing the state of the created pods on the k8s node indicates that the pods are not up and in running state. The creation of manifest finishes after 28 s (depends on the used images and bandwidth) and this seems to line up with the state of the pods.

@pidydx
Copy link
Author

pidydx commented May 22, 2023

@BBBmau I don't have an example on hand, but as noted above daemonsets complete instantly and don't wait for the nodes to be ready. Without diving into the code to see how the check is supposed to be happening, my guess is that the Daemonset check is not validating that the number pod pods running is equal to the number of current nodes rather than checking if the cluster is simply ready to deploy the pods to any node that comes online.

@sbocinec
Copy link
Contributor

sbocinec commented Feb 9, 2024

@BBBmau I confirm, the issue is valid and still present. Here is a short reproducer:

terraform {
  required_version = "~> 1.7"
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = "2.25.2"
    }
  }
}

# This should be configured individually depending in your setup
# the locals used here are omitted in the reproducer
provider "kubernetes" {
  host                   = local.cluster_config.cluster_endpoint
  cluster_ca_certificate = base64decode(local.cluster_config.cluster_ca)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    args        = ["eks", "get-token", "--cluster-name", local.cluster_config.cluster_name]
    command     = "awscli2"
  }
}

resource "kubernetes_daemon_set_v1" "i-am-not-waiting-for-rollout" {
  metadata {
    name      = "i-am-not-waiting-for-rollout"
    namespace = "default"
  }

  spec {
    selector {
      match_labels = {
        name = "i-am-not-waiting-for-rollout"
      }
    }
    template {
      metadata {
        labels = {
          name = "i-am-not-waiting-for-rollout"
        }
      }
      spec {
        container {
          name              = "pause"
          image             = "public.ecr.aws/eks-distro/kubernetes/pause:3.9@sha256:4668ea92d0ce9b5e5d8e84a4d3875d99aea7892e136a873425095f60bc22c49a"
          image_pull_policy = "IfNotPresent"
        }
        host_pid = true
        init_container {
          name              = "tuner"
          args = [
            "sysctl fs.inotify.max_user_instances=518"
          ]
          command = [
            "/bin/chroot",
            "/host",
            "/bin/bash",
            "-c",
            "--",
          ]
          image             = "public.ecr.aws/docker/library/busybox:1.36-glibc@sha256:e046063223f7eaafbfbc026aa3954d9a31b9f1053ba5db04a4f1fdc97abd8963"
          image_pull_policy = "IfNotPresent"
          security_context {
            capabilities {
              add = [
                "SYS_CHROOT",
              ]
              drop = [
                "all",
              ]
            }
            privileged = true
          }
          volume_mount {
            mount_path        = "/host"
            mount_propagation = "Bidirectional"
            name              = "hostfs"
          }
        }
        volume {
          host_path {
            path = "/"
          }
          name = "hostfs"
        }
      }
    }
  }
  wait_for_rollout = true
}

Initial apply is instant:

  Enter a value: yes                                                                                                           
                                                                                                                               
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Creating...                                                             
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Creation complete after 1s [id=default/i-am-not-waiting-for-rollout]
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.                                                                    

Any further change is applied instantly, without waiting for the DS to rollout new pods.

  Enter a value: yes                                                                                                           
                                                                                                                               
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Modifying... [id=default/i-am-not-waiting-for-rollout]                  
kubernetes_daemon_set_v1.i-am-not-waiting-for-rollout: Modifications complete after 2s [id=default/i-am-not-waiting-for-rollout]                                                                                                                              
                                                                                                                               
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.  

⚠️ As others mentioned, workaround is to use kubernetes_manifest resource, but if you define a DaemonSet and trigger any change, it also fails as there is a bug with computed fields not being respected #1591 . Currently, there is no way to manage DaemonSets with wait_for_rollout = true in this provider.

@sbocinec
Copy link
Contributor

sbocinec commented Feb 9, 2024

I have identified the issue and trying to prepare a fix #2419 - need to test the fix yet.

sbocinec added a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Feb 12, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
sbocinec added a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Feb 19, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
sbocinec added a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Feb 19, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
sbocinec added a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Feb 19, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
sbocinec added a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Feb 19, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
sbocinec added a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Feb 19, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
@sbocinec
Copy link
Contributor

I have meanwhile tested the PR, updated acceptance tests and the PR is from my POV ready. Could you please have a look @BBBmau?

Though, the fix is going to break many existing TF configs as the ineffective wait_for_rollout is set to true by default... I have wrote more in the PR description.

alexsomesan pushed a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Apr 4, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
alexsomesan pushed a commit to sbocinec/terraform-provider-kubernetes that referenced this issue Apr 5, 2024
This fix ensures the resource correctly waits for the DaemonSet rollout
by using similar logic than the kubernetes_deployment. Additionally,
waitForDaemonSetReplicasFunc is renamed to waitForDaemonSetPodsFunc to
correctly describe the operation (there are no replicas in a DaemonSet).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants