
kubectl provider prevents upgrading CRDs by validating the existing manifest before applying change #2450

msillence opened this issue Mar 21, 2024 · 2 comments

@msillence

There is no way to use Terraform when updating CRDs and making a breaking schema change.

The kubectl provider validates the existing deployed manifest before trying to apply the change.

It will also validate the result, preventing you from changing the manifest to the new structure before the CRD upgrade.

Terraform Version, Provider Version and Kubernetes Version

Terraform version: v1.7.5
Kubernetes provider version: v2.27.0
Kubernetes version: v1.26.11-gke.1055000

Affected Resource(s)

kubernetes_manifest

Terraform Configuration Files

resource "kubernetes_manifest" "kafka_broker_pool" {
  manifest = {
    "apiVersion" = "kafka.strimzi.io/v1beta2"
    "kind"       = "KafkaNodePool"
    "metadata" = {
      "labels" = {
        "strimzi.io/cluster" = var.strimzi_cluster_name
        # make pool numbering predictable
      }
      annotations = {
          "strimzi.io/next-node-ids" = "[100-200]"
      }
      "name"      = "broker"
      "namespace" = var.namespace
    }
    "spec" = {
      "replicas" = var.broker_replicas
        "resources" = {
          "requests" = {
            "memory" = "2Gi"
            "cpu"    = "0.5"
          }
          "limits" = {
            "memory" = "5Gi"
          }
        }
        "jvmOptions" = {
          "-XX" = {
            "MaxRAMPercentage" = "50"
            "UseG1GC" : "true"
            "MaxGCPauseMillis"               = "20"
            "InitiatingHeapOccupancyPercent" = "35"
            "ExplicitGCInvokesConcurrent"    = "true"
          }
        }
      "roles" = [
        "broker",
      ]
      "storage" = {
        "type" = "jbod"
        "volumes" = [
          {
            "deleteClaim" = false
            "id"          = 0
            "size"        = var.kafka_disk_size
            "type"        = "persistent-claim"
            "class"       = "kafka-storage"
          },
        ]
      }
      template = {
        pod = {
          metadata = {
            annotations = {
              "telegraf.influxdata.com/class"                = "timescale",
              "telegraf.influxdata.com/inputs"               = "${local.broker_annotations}"
              "telegraf.influxdata.com/env-fieldref-HOST_IP" = "status.hostIP"
              "telegraf.influxdata.com/env-fieldref-APP" = "metadata.labels['app']"
            }
          }
          affinity = {
            nodeAffinity = {
              requiredDuringSchedulingIgnoredDuringExecution = {
                nodeSelectorTerms = [
                  {
                    matchExpressions = [
                      {
                        key      = "topology.kubernetes.io/zone"
                        operator = "In"
                        values   = ["europe-west2-a", "europe-west2-b", "europe-west2-c"]
                      }
                    ]
                  }
                ]
              }
            }
          }
        }
      }
    }
  }
}

Steps to Reproduce

Install the Strimzi operator 0.39.
Install a node pool with unquoted jvmOptions values:

       "jvmOptions" = {
          "-XX" = {
            "MaxRAMPercentage" = 50

Upgrade the Strimzi operator to 0.40.

Upgrade the CRDs manually by downloading them from the release page and then applying them with kubectl (see the sketch after these steps).

Try to update the Terraform to the new schema with quoted values:

        "jvmOptions" = {
          "-XX" = {
            "MaxRAMPercentage" = "50"

Expected Behavior

The new Terraform should apply.
The resulting manifest should be validated against the new schema (CRDs).

Actual Behavior

The current manifest is validated against the current schema, and the plan/apply fails with a type error:

│ [AttributeName("spec").AttributeName("jvmOptions").AttributeName("-XX").ElementKeyString("ExplicitGCInvokesConcurrent")]
│ cannot convert payload from "bool" to "tftypes.String"': err

Even removing the jvmOptions block from the Terraform configuration still fails, reporting the same error.

The only way to resolve this seems to be to manually correct the current manifest: download it with kubectl, fix it by hand, and apply it.
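i.e. something like the following, using the node pool from the config above (the local file name is arbitrary):

    kubectl get kafkanodepool broker -n <namespace> -o yaml > broker.yaml
    # hand-edit broker.yaml to match the new schema (e.g. quote the jvmOptions values)
    kubectl apply -f broker.yaml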

After that, the Terraform can be planned and applied.

You can't apply the new change before the CRD update, as the new values would not be valid under the old CRD.

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@sheneska (Contributor)

Hi @msillence, thank you for opening this issue.
Please try doing two separate applies: first apply the CRD changes, then do everything else in the second apply.
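For example, with a targeted apply (the resource address here is hypothetical):

    # first apply: only the CRD changes
    terraform apply -target=kubernetes_manifest.strimzi_crds

    # second apply: everything else
    terraform apply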
Let us know if you have any questions.

@msillence (Author)

msillence commented Mar 28, 2024

Hi @sheneska, thanks for your reply.

The problem seems to be that the old existing manifest is validated against the current schema (CRD) and then the new manifest is validated against the current schema.

If the schema changes in a breaking way, there is no way Terraform can validate both the deployed manifest and the new manifest against a single schema; one of them won't be valid, regardless of the order in which you perform the actions.

e.g.
foo is an int before and a string after.

If the manifest has foo = 1 and you are validating both the before and after states:

If you try to change it to foo = "1" before changing the schema, then obviously it fails when the result is validated against the old schema.

If you apply the schema change first in a separate Terraform step, and then in a new Terraform run try to apply the change to foo = "1", it fails saying that foo is not valid; it seems that the existing state is being validated.
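In CLI terms (a sketch; the failure notes are paraphrased, not actual output):

    # order A: change foo = "1" in the manifest first, CRD still old
    terraform apply                 # fails: the result is not valid against the old schema

    # order B: upgrade the CRD first, then change foo
    kubectl apply -f new-crd.yaml   # new-crd.yaml is a stand-in for the updated CRD
    terraform apply                 # still fails: the existing state is not valid against the new schema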

I'd suggest that this validation, though potentially useful for confirming the existing state is good before your change, is actually a problem when making breaking schema changes. It forces the use of tools other than Terraform to resolve something that should be possible with Terraform. I would suggest that the pre-validation should produce a warning rather than a hard error.

@iBrandyJackson added the acknowledged and progressive apply labels Mar 28, 2024