<name> failed to fetch resource from kubernetes: the server could not find the requested resource #270

Open
filipvh-sentia opened this issue Jun 9, 2023 · 47 comments

@filipvh-sentia

filipvh-sentia commented Jun 9, 2023

The issue

I'm running into an error with Karpenter YAML templates and I'm unsure what the cause is. I've used kubectl_manifest in the past and it worked fine on consecutive applies, but for some reason it's not working with these custom resources.

EKS version: 1.27
gavinbunney kubectl version: 1.14
terraform: 1.4.6

Sample YAML file:

apiVersion: karpenter.k8s.aws/v1alpha1
kind: AWSNodeTemplate
metadata:
  name: system-infra
spec:
  instanceProfile: my_cluster-worker-role
  securityGroupSelector:
    kubernetes.io/cluster/my_cluster: owned
  subnetSelector:
    karpenter.sh/discovery: eks-private
  tags:
    Name: node
    managedBy: Terraform
    project: Sandbox
    repository: ..........
    stack: stack/sandbox
    workspace: default

Terraform plans it just fine, but when I try to apply it, I get the following error:

╷
│ Error: permissionsets failed to fetch resource from kubernetes: the server could not find the requested resource
│ 
│   with module.core_system.kubectl_manifest.permissions-sets[0],
│   on ......./rbac-addon.tf line 123, in resource "kubectl_manifest" "permissions-sets":
│  123: resource "kubectl_manifest" "permissions-sets" {
│ 
╵

I don't know where to start looking to resolve this. The items never show up in the statefile, but the objects are created inside the cluster. It looks like it created them and tries to find them, but then doesn't seem to find them?

I'll keep digging, as this provider is my only way of applying YAML to the cluster through Terraform without using null_resource (a dirty last-resort hack, IMO) or kubernetes_manifest (which doesn't work if the CRDs don't exist).

I'm not sure if this is related to the fact these are custom resources ( through CRDs ).

@filipvh-sentia
Author

First thought - could this be related to the fact that the resource is not a namespaced resource, but rather a cluster resource ( like a cluster role )?

@filipvh-sentia
Author

I'm testing with lower kubernetes versions. Possibly related to the helm charts being incompatible with those versions.

@filipvh-sentia
Author

Problem appears resolved on EKS v1.26.
If other people run into this error, this might be your solution.
I'll leave the ticket as I'm unsure if this issue is related to the charts being incompatible ( which I'm guessing ), or if it's related to the kubectl provider. I'll leave it up to the maintainer.

@koertkuipers

koertkuipers commented Jun 11, 2023

I am seeing the same issue on EKS Kubernetes 1.27 with Karpenter 0.27.5 and kubectl provider 1.14.0.

@MioOgbeni

MioOgbeni commented Jun 14, 2023

I am also seeing the same behavior. EKS 1.27, kubectl provider 1.14.0. I'm just trying to apply an ENIConfig after EKS creation with multiple subnets.

Downgrade to EKS 1.26 resolved it.

@idontlikej

My EKS 1.27 also has the issue related to the provider and karpenter:

│ Error: default failed to fetch resource from kubernetes: the server could not find the requested resource
│
│   with module.karpenter.kubectl_manifest.karpenter_provisioner,
│   on ../../../../tf_modules/terraform-aws-eks-karpenter/main.tf line 69, in resource "kubectl_manifest" "karpenter_provisioner":
│   69: resource "kubectl_manifest" "karpenter_provisioner" {

I found in the EKS logs that a request was made to the Kubernetes API with an incorrect path, resulting in a 404 error response. It appears that the provider is making the request with an extra ApiVersion, causing it to be duplicated in the path.

  "requestURI": "/apis/karpenter.k8s.aws/v1alpha1/v1alpha1/awsnodetemplates/default",

Full log entry:

{
  "kind": "Event",
  "apiVersion": "audit.k8s.io/v1",
  "level": "Metadata",
  "stage": "ResponseComplete",
  "requestURI": "/apis/karpenter.k8s.aws/v1alpha1/v1alpha1/awsnodetemplates/default",
  "verb": "get",
  "user": {
    "username": "kubernetes-admin",
    "groups": [
      "system:masters",
      "system:authenticated"
    ],
    "extra": {
...
      ]
    }
  },
  "sourceIPs": [
  ],
  "userAgent": "HashiCorp/1.0 Terraform/1.5.0",
  "objectRef": {
    "resource": "v1alpha1",
    "name": "awsnodetemplates",
    "apiGroup": "karpenter.k8s.aws",
    "apiVersion": "v1alpha1",
    "subresource": "default"
  },
  "responseStatus": {
    "metadata": {},
    "code": 404
  },
  "requestReceivedTimestamp": "2023-06-14T12:07:55.982587Z",
  "stageTimestamp": "2023-06-14T12:07:55.982853Z",
  "annotations": {
    "authorization.k8s.io/decision": "allow",
    "authorization.k8s.io/reason": ""
  }
}
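
The duplicated version segment in that requestURI can be checked mechanically. The following is a minimal sketch, assuming the usual Kubernetes REST layout for a cluster-scoped resource in a named API group (/apis/<group>/<version>/<plural>/<name>). The helper names are hypothetical and this is not the provider's code.

```python
def cluster_resource_path(api_version: str, plural: str, name: str) -> str:
    """Build the REST path for a cluster-scoped resource in a named API group."""
    group, version = api_version.split("/", 1)
    return f"/apis/{group}/{version}/{plural}/{name}"


def has_duplicated_segment(request_uri: str) -> bool:
    """Return True if two consecutive path segments are identical."""
    segments = [s for s in request_uri.split("/") if s]
    return any(a == b for a, b in zip(segments, segments[1:]))


# Values taken from the audit log entry above.
expected = cluster_resource_path(
    "karpenter.k8s.aws/v1alpha1", "awsnodetemplates", "default"
)
logged = "/apis/karpenter.k8s.aws/v1alpha1/v1alpha1/awsnodetemplates/default"

print(expected)
print(has_duplicated_segment(logged), has_duplicated_segment(expected))
```

Running a check like this over the audit log's requestURI values is one way to tell this bug apart from a genuinely missing CRD.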

@raffraffraff

raffraffraff commented Jun 14, 2023

Also getting this in EKS 1.27, but in my case, I'm creating a ClusterSecretStore for external-secrets. Edit: going to try helm_release with the https://charts.itscontained.io/ "raw" chart. If it works I'll report back.

OK, it worked. However, I had to download the chart and store it with my module because it's no longer hosted above. But it's a temporary workaround for my issue - the raw chart applies my ClusterSecretStore.

@dfroberg

Yup, same here: EKS 1.27, cluster-wide ENIConfig.

@tejisin

tejisin commented Jun 14, 2023

Same here, happens with multiple resources - node template and provisioner for karpenter or horizontalrunnerautoscaler for actions-runner-controller

$ k version --short
Client Version: v1.27.1
Kustomize Version: v5.0.1
Server Version: v1.27.2-eks-c12679a

@pat-s

pat-s commented Jun 15, 2023

Seems to be a generic issue for 1.27 - @gavinbunney any chance you could look into this? Downgrading is not an option for many clusters...

@tejisin

tejisin commented Jun 16, 2023

More debugging ensued. Switching to the kubernetes_manifest resource showed that there is a delta between what is applied from the YAML and what the Kubernetes control plane returns for the object. For example:

│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to kubernetes_manifest.karpenter_provisioner_amd64, provider "provider[\"registry.terraform.io/hashicorp/kubernetes\"]" produced an unexpected new value: .object.spec.requirements: new element 6 has appeared.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to kubernetes_manifest.karpenter_provisioner_arm64, provider "provider[\"registry.terraform.io/hashicorp/kubernetes\"]" produced an unexpected new value: .object.spec.requirements: new element 6 has appeared.
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.

Checking the resource YAML on the cluster and reconciling it back into the code fixed the issue with the kubernetes_manifest resource.

Another example

╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to kubernetes_manifest.runners-amd64, provider "provider[\"registry.terraform.io/hashicorp/kubernetes\"]" produced an unexpected new value: .object.spec.template.spec.resources.requests["cpu"]: was cty.StringVal("1.6"), but now cty.StringVal("1600m").
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.
╵
╷
│ Error: Provider produced inconsistent result after apply
│ 
│ When applying changes to kubernetes_manifest.runners-amd64, provider "provider[\"registry.terraform.io/hashicorp/kubernetes\"]" produced an unexpected new value: .object.spec.template.spec.resources.limits["cpu"]: was cty.StringVal("4.0"), but now cty.StringVal("4").
│ 
│ This is a bug in the provider, which should be reported in the provider's own issue tracker.

I suggest trying the same with your resources and analyzing the differences, or you could switch to the kubernetes_manifest resource.
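
To spot this kind of delta before applying, it can help to compare quantities by value rather than by string. A minimal sketch, assuming only plain decimal CPU values and the "m" (millicore) suffix; the function names are hypothetical and other Kubernetes quantity suffixes are not handled.

```python
from decimal import Decimal


def cpu_millicores(quantity: str) -> int:
    """Convert a CPU quantity such as "1.6" or "1600m" to millicores."""
    if quantity.endswith("m"):
        return int(Decimal(quantity[:-1]))
    return int(Decimal(quantity) * 1000)


def canonical_cpu(quantity: str) -> str:
    """Render a CPU quantity the way the errors above show it coming back."""
    millis = cpu_millicores(quantity)
    if millis % 1000 == 0:
        return str(millis // 1000)  # whole cores, e.g. "4"
    return f"{millis}m"             # fractional cores, e.g. "1600m"


# The two pairs reported in the errors above.
for written in ("1.6", "4.0"):
    print(written, "->", canonical_cpu(written))
```

Writing the canonical form in the manifest (here "1600m" and "4") means the value the API server returns matches what was applied.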

@duclm2609

I have the same issue trying to apply an ENIConfig resource on an EKS cluster running version 1.27. It worked on the first apply but failed after that.

@itspngu

itspngu commented Jun 28, 2023

Since everyone in this issue seems to be talking about EKS, I'd like to add that we've run into the same issue with kubeadm based clusters on GCP. So it's definitely a 1.27 thing and not platform specific.

@9numbernine9

9numbernine9 commented Jun 29, 2023

I'll also add that I've started seeing this since upgrading a K3S cluster to 1.27.3+k3s1, so that's one more point in the "this is a 1.27 issue" column!

@seungmun

First of all, is there currently a way to bypass this issue?

@karmajunkie

@seungmun per @tejisin's comment I switched all my kubectl_manifest resources to kubernetes_manifest which works fine. Some of my resources needed to be redeployed IIRC, but in my case that wasn't an issue.

@seungmun

@karmajunkie Thanks for the super fast feedback. Are you sure you meant to use the resources in hashicorp/terraform-provider-kubernetes?

@karmajunkie

@seungmun yep, that's the one. I'm not sure if that's going to be universally true but my needs are pretty straightforward.

@itspngu

itspngu commented Jun 30, 2023

The problem with hashicorp/kubernetes is that it needs CRDs to be present in the plan phase if you are creating custom resources, so in that case you'd need 2 separate Terraform configurations with 2 separate states. See hashicorp/terraform-provider-kubernetes#1367

The other option is to use a null_resource or similar with a provisioner block that runs kubectl to apply manifests as a workaround until this issue is fixed.

@Oliniusz

I believe you also can't create e.g. the EKS cluster from scratch and apply your manifests in the same terraform step with kubernetes_manifest - that was the initial reason I switched to kubectl_manifest.

Let's be realistic though - this project has been abandoned. It would be awesome if @gavinbunney had time once more to update the code or add some other guys as maintainers but that might be unlikely.

@itspngu

itspngu commented Jun 30, 2023

For me it's the mentioned problem with CRDs and CRs. I'm fairly sure the problem is related to the versions of the libraries used (prime suspect being the ancient Kubernetes client libraries) but I've never built a Terraform provider.

@pat-s

pat-s commented Jul 4, 2023

I've tried to create a fork and update the Go modules. While I eventually got binaries to build after some minor code changes, the subsequent terraform apply unfortunately errored with issues I also get from the hashicorp/kubernetes provider (like "kind" unknown, "group" issues, etc.).

I am not versed enough to continue from here onwards but it might be that a simple update of the underlying modules/libraries is not enough to reinstate the previous behavior of this provider here, given all the upstream changes in the k8s API.

@fbozic

fbozic commented Jul 5, 2023

I'm experiencing similar issues with the kubectl provider on an EKS cluster running v1.27. Sometimes the provider just drops resources from the tf state because it cannot find them. When I try to import the resources back with tf import ... I get an error: failed to fetch resource from kubernetes: ....
Here are debug logs from tf plan during which resources are removed from the state (just interesting parts):

2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1b) not found, removing from state
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1c) not found, removing from state
2023-07-04T18:42:27.207+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1a) not found, removing from state
2023-07-04T18:42:27.207+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"] during refresh.
2023-07-04T18:42:27.207+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"] during refresh.
2023-07-04T18:42:27.207+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an unexpected new value for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"] during refresh.
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1b Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1b] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-073fc7974ff152aa8]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1c Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1c] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0200bdef6e567c74f]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1b Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1b] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-073fc7974ff152aa8]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1c Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1c] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0200bdef6e567c74f]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1a Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1a] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0c3c1b5745dbdad8c]]
2023-07-04T18:42:27.209+0200 [DEBUG] provider.terraform-provider-kubectl_v1.14.0: 2023/07/04 18:42:27 [DEBUG] eu-central-1a Unstructed YAML: map[apiVersion:crd.k8s.amazonaws.com/v1alpha1 kind:ENIConfig metadata:map[name:eu-central-1a] spec:map[securityGroups:[sg-0329cc8efeae8b0cf sg-0efb5545ec1b733ef] subnet:subnet-0c3c1b5745dbdad8c]]
2023-07-04T18:42:27.210+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.210+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.210+0200 [WARN]  Provider "registry.terraform.io/gavinbunney/kubectl" produced an invalid plan for module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"], but we are tolerating it because it is using the legacy plugin SDK.
2023-07-04T18:42:27.211+0200 [DEBUG] provider: plugin process exited: path=.terraform/providers/registry.terraform.io/gavinbunney/kubectl/1.14.0/darwin_arm64/terraform-provider-kubectl_v1.14.0 pid=16928
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1c\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1c"]
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1a\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1a"]
2023-07-04T18:42:27.834+0200 [DEBUG] Resource state not found for node "module.eks_euc1.kubectl_manifest.eni_config[\"eu-central-1b\"]", instance module.eks_euc1.kubectl_manifest.eni_config["eu-central-1b"]
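
For longer debug logs, the dropped resources can be pulled out with a small script. A minimal sketch, assuming the warning text is exactly the "not found, removing from state" message shown above; the function name and the abbreviated sample lines are illustrative only.

```python
import re

# Pattern copied from the provider warning as it appears in the log above.
NOT_FOUND = re.compile(
    r"kubernetes resource \((?P<path>[^)]+)\) not found, removing from state"
)


def dropped_resources(log_text: str) -> list:
    """Return the sorted, de-duplicated API paths the provider could not find."""
    return sorted({m.group("path") for m in NOT_FOUND.finditer(log_text)})


# Abbreviated sample: timestamps and provider prefixes removed for readability.
sample_log = "\n".join([
    "[WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1b) not found, removing from state",
    "[WARN] kubernetes resource (/apis/crd.k8s.amazonaws.com/v1alpha1/eniconfigs/eu-central-1a) not found, removing from state",
    "[DEBUG] provider: plugin process exited",
])

for path in dropped_resources(sample_log):
    print(path)
```

The resulting paths can then be compared against what kubectl reports on the cluster to confirm the objects still exist.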

I've managed to find a workaround that works for my use case. I've switched to the helm_release resource from the official hashicorp/helm provider, and I'm using the dysnix/raw chart, which is basically an empty chart that takes raw K8s YAML resources as input. Original comment which suggests this approach: hashicorp/terraform-provider-kubernetes#1380 (comment)

I've tested this by creating a new cluster from scratch.
Old:

resource "kubectl_manifest" "eni_config" {
  for_each = zipmap(var.azs, var.pods_subnet_ids)

  yaml_body = yamlencode({
    apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
    kind       = "ENIConfig"
    metadata = {
      name = each.key
    }
    spec = {
      securityGroups = [
        module.eks.cluster_primary_security_group_id,
        module.eks.node_security_group_id,
      ]
      subnet = each.value
    }
  })
}

New:

resource "helm_release" "eni_configs" {
  name       = "eni-configs"
  repository = "https://dysnix.github.io/charts"
  chart      = "raw"
  version    = "v0.3.2"
  values = [
    yamlencode({
      resources = [
        for az, subnet_id in zipmap(var.azs, var.pods_subnet_ids) :
        {
          apiVersion = "crd.k8s.amazonaws.com/v1alpha1"
          kind       = "ENIConfig"
          metadata = {
            name = az
          }
          spec = {
            securityGroups = [
              module.eks.cluster_primary_security_group_id,
              module.eks.node_security_group_id,
            ]
            subnet = subnet_id
          }
        }
      ]
    })
  ]
}

@Oliniusz

Oliniusz commented Jul 7, 2023

For anybody stuck, blocked, or losing hope of progress on this - if you need to deploy your CRs before the CRDs are available, then I would suggest doing what I've done for now, e.g. before:

resource "kubectl_manifest" "envoyfilter-proxy_protocol-internal" {
  yaml_body = <<-EOF
    apiVersion: networking.istio.io/v1alpha3
    kind: EnvoyFilter
    metadata:
      name: proxy-protocol-internal
      namespace: istio-system
    spec:
      configPatches:
      - applyTo: LISTENER
        patch:
          operation: MERGE
          value:
            listener_filters:
            - name: envoy.filters.listener.proxy_protocol
            - name: envoy.filters.listener.tls_inspector
      workloadSelector:
        labels:
          istio: ingressgateway-internal
    EOF
  depends_on = [
    helm_release.istio-istiod
  ]
}

after:

resource "helm_release" "envoyfilter-proxy-protocol-internal" {
  name       = "envoyfilter-proxy-protocol-internal"
  namespace  = kubernetes_namespace.istio-system.metadata[0].name
  repository = "https://bedag.github.io/helm-charts/"
  chart      = "raw"
  version    = "2.0.0"
  values = [
    <<-EOF
    resources:
      - apiVersion: networking.istio.io/v1alpha3
        kind: EnvoyFilter
        metadata:
          name: proxy-protocol-internal
          namespace: istio-system
        spec:
          configPatches:
          - applyTo: LISTENER
            patch:
              operation: MERGE
              value:
                listener_filters:
                - name: envoy.filters.listener.proxy_protocol
                - name: envoy.filters.listener.tls_inspector
          workloadSelector:
            labels:
              istio: ingressgateway-internal
    EOF
  ]
  depends_on = [
    helm_release.istio-istiod,
    kubernetes_namespace.istio-ingress
  ]
}

@vainkop

vainkop commented Jul 7, 2023

A dirty workaround that works

locals {
  karpenter_aws_node_template_yaml = <<EOF
    apiVersion: karpenter.k8s.aws/v1alpha1
    kind: AWSNodeTemplate
    metadata:
      name: ${local.karpenter_provisioner_name}
    spec:
      blockDeviceMappings:
      - deviceName: /dev/xvda
        ebs:
          deleteOnTermination: ${var.karpenter_provisioner_ebs_delete_on_termination}
          volumeSize: ${var.karpenter_provisioner_ebs_size}
          volumeType: ${var.karpenter_provisioner_ebs_type}
      subnetSelector:
        karpenter.sh/discovery: ${module.eks.cluster_name}
      securityGroupSelector:
        karpenter.sh/discovery: ${module.eks.cluster_name}
      tags:
        karpenter.sh/discovery: ${module.eks.cluster_name}
  EOF
}

resource "null_resource" "install_karpenter_aws_node_template" {
  triggers = {
    timestamp = timestamp()
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    command     = "aws eks --region ${var.region} update-kubeconfig --name ${var.cluster_name} && kubectl apply -f - <<EOF\n${local.karpenter_aws_node_template_yaml}\nEOF"
  }
  depends_on = [helm_release.karpenter]
}

@pat-s

pat-s commented Jul 8, 2023

#270 (comment) works great, thanks a ton!

@filipvh-sentia
Author

#270 (comment)

This looks like a good alternative! I'll give that a shot thanks!

@jodem

jodem commented Sep 28, 2023

@alekc thanks a lot.

For the record, I also had to replace the provider in the state:

terraform state replace-provider registry.terraform.io/gavinbunney/kubectl registry.terraform.io/alekc/kubectl

@jodem

jodem commented Oct 2, 2023

Also, as a side note, I would advise production users not to use the plugin anymore since it does not look maintained. One option I will take is to use Terraform to template deployments (e.g. to S3) and use Flux to take care of the deployment. This reduces the coupling between Terraform and Kubernetes while still letting us inject values from Terraform into Kubernetes automatically and use Terraform's templating tools.

@alekc
Contributor

alekc commented Oct 2, 2023

@jodem but you do need a running Flux instance, don't you? So you are still out of luck for tasks like bootstrapping. In theory you could make use of the https://artifacthub.io/packages/helm/kiwigrid/any-resource helm chart, but it's still cranky (and I've had my share of issues with the helm provider, so I tend to avoid it).

P.s. this provider is not actively maintained anymore, the one on my fork is ;)

@jodem

jodem commented Oct 2, 2023

I use the helm provider to install Flux, and I will probably use the official kubernetes_manifest resource to bootstrap it (2 manifests: one for the Flux S3 source, one to give the S3 path to watch and synchronize). Or I would use your fork only for these 2 elements, which limits the impact (today I have 250+ manifests using the plugin, which puts my project in danger).

@valorl

valorl commented Oct 2, 2023

@jodem I use the helm provider to install Flux and still use the kubectl provider to apply Flux CRs. The official kubernetes provider is terrible at CRDs, unfortunately. It requires cluster access to plan, making a single-pass cluster creation+bootstrap impossible.

@alekc
Contributor

alekc commented Oct 2, 2023

Yup, pretty much my setup.
Kubectl provider for initial karpenter manifests -> Helm release for ArgoCD -> argocd app manifests with kubectl for everything else.

@jodem

jodem commented Oct 2, 2023

OK, I'll stick with your fork, Alekc, and in case of problems it's "just" 2 manifests my ops will have to handle manually. They will survive :)

@GiuseppeChiesa-TomTom

Hello, we are experiencing the same issue. Since a solution is already available, I wonder whether it's worth opening a PR on this repo to include the fix here as well.

@Oliniusz

This repository has been abandoned and no PR will get approved by the maintainer.

@tiagocborg

tiagocborg commented Dec 26, 2023

Quoting @vainkop's "A dirty workaround that works" comment above (the null_resource with a local-exec provisioner that runs kubectl apply):

I had to do something similar. I tried kubectl_manifest, kubernetes_manifest, helm_release with the raw chart, and other approaches, but everything was flaky: sometimes it worked, sometimes not.

The only way I found to make this work every time I apply my stack was manifest files and local-exec:

locals {
  aws_cli_add_cluster = "aws eks update-kubeconfig --name ${data.terraform_remote_state.eks.outputs.cluster_name} --region ${data.aws_region.current.name} --alias ${local.project_env}"
  k_use_ctx = "kubectl config use-context ${local.project_env}"
  k_rm_ctx = "kubectl config delete-context ${local.project_env}"
  k_apply_node_class = "env=${local.env} envsubst < ${path.module}/manifests/karpenter_nodeclass.yaml | kubectl apply -f -"
  k_apply_node_pool = "env=${local.env} envsubst < ${path.module}/manifests/karpenter_nodepool.yaml | kubectl apply -f -"
}

resource "null_resource" "karpenter_node_class" {
  triggers = {
    timestamp = timestamp()
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    command = "${local.aws_cli_add_cluster} && ${local.k_use_ctx} &&  ${local.k_apply_node_class} && ${local.k_rm_ctx}"
  }
  depends_on = [helm_release.karpenter]
}

resource "null_resource" "karpenter_node_pool" {
  triggers = {
    timestamp = timestamp()
  }

  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    command = "${local.aws_cli_add_cluster} && ${local.k_use_ctx} &&  ${local.k_apply_node_pool} && ${local.k_rm_ctx}"
  }
  depends_on = [null_resource.karpenter_node_class]
}

@Balamurugannagappan

I was able to resolve this by replacing kubectl_manifest with kubernetes_manifest:

resource "kubernetes_manifest" "servicemonitor" {
  manifest = yamldecode(file("./manifests/service-monitor.yaml"))
}

@Oliniusz

Oliniusz commented Mar 6, 2024

This issue is mostly related to applying custom resource manifests - kubernetes_manifest will not work if the CRD for ServiceMonitor doesn't exist yet.

@mk01

mk01 commented Mar 25, 2024

For me this happened after the aws provider was upgraded from 5.41.0 to 5.42.0. Downgrading solved the issue.

@vara-bonthu

vara-bonthu commented Apr 4, 2024

We encountered comparable challenges using the kubectl_manifest resource within Terraform, leading us to develop a local Helm chart as part of the EKS data add-ons Terraform module specifically for Karpenter resources. The Helm provider has proven to be quite reliable in conjunction with Terraform.

Here is the module link along with the code. https://github.com/aws-ia/terraform-aws-eks-data-addons/tree/main/helm-charts/karpenter-resources

You can check out this example of how to consume this Helm chart. https://github.com/awslabs/data-on-eks/blob/d0ae18bc8c85ec7e313a9146ec0b1c2a3b8a1550/analytics/terraform/spark-k8s-operator/addons.tf#L174

@andrewbelu

andrewbelu commented Apr 5, 2024

If it helps anyone, in my case I noticed this was happening when I set spec: {} in my manifest. When I removed the spec entirely whenever it was blank, the provider successfully provisioned my resource.

@Pursu1tOfHapp1ness

Quoting @mk01: "for me this happened after provider aws got upgraded from 5.41.0 to 5.42.0. downgrade solved the issue."

That solved the problem only for a short time. After 1 or 2 plan/apply cycles, everything was back as it was.
As the most convenient and fastest solution, I advise switching from the gavinbunney/kubectl provider to alekc/kubectl, especially if you are working with Karpenter.
