
Karpenter won't use fallback nodePool on insufficient instance capacity #6168

Closed
raychinov opened this issue May 8, 2024 · 3 comments
@raychinov

Description

Observed Behavior:
We have a default nodePool in one availability zone (a) with a higher weight and a second, fallback nodePool in another availability zone (b) with a lower weight. The scenario we test uses AWS Fault Injection Simulator: all instances in the a AZ are terminated and new instance launches there are paused.
What we observe is new nodeClaims being created but staying in a Non-Ready state, and nodes failing to launch with an error message like:

creating instance, getting launch template configs, getting launch templates, no instance types satisfy requirements of amis ami-02f420afc14289ede

The concerning thing here is that Karpenter won't try to launch instances from the fallback nodePool even after several minutes of waiting.
The messages in the log are mostly:

{"level":"DEBUG","time":"2024-05-08T11:26:25.025Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"2c8f2a5"}
{"level":"DEBUG","time":"2024-05-08T11:26:26.026Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"2c8f2a5"}
{"level":"DEBUG","time":"2024-05-08T11:26:27.027Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"2c8f2a5"}

If we increase the fallback nodePool weight and bump the workload so new nodeClaims are created, they start in the b AZ and get provisioned successfully, but the a AZ nodeClaims stay pending, as do the pods meant to run on them.

Expected Behavior:
Unsuccessful nodeClaims should time out after a couple of seconds, and new ones should be created from an alternative nodePool. More verbose log messages would also be appreciated.

Reproduction Steps (Please include YAML):

NodePools
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
  annotations:
    kubernetes.io/description: "Default NodePool"
spec:
  weight: 100
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["c7g.large", "c7g.xlarge"]
        - key: topology.kubernetes.io/zone
          operator: In
          values:
            - "eu-west-1a"
      nodeClassRef:
        name: default
      taints:
        - key: karpenter
          value: "true"
          effect: NoSchedule
      kubelet:
        maxPods: 125
---
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: fallback
  annotations:
    kubernetes.io/description: "Fallback NodePool"
spec:
  weight: 10
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["c7g.large", "c7g.xlarge"]
        - key: topology.kubernetes.io/zone
          operator: In
          values:
            - "eu-west-1b"
            - "eu-west-1c"
      nodeClassRef:
        name: default
      taints:
        - key: karpenter
          value: "true"
          effect: NoSchedule
      kubelet:
        maxPods: 125
Workload
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: test
  template:
    metadata:
      labels:
        app: test
    spec:
      securityContext:
        runAsUser: 65534
        fsGroup: 65534
      containers:
        - image: public.ecr.aws/eks-distro/kubernetes/pause:3.2
          name: test
          resources:
            requests:
              cpu: 50m
              memory: 64M
      nodeSelector:
        karpenter.k8s.aws/instance-hypervisor: nitro
      tolerations:
        - key: karpenter
          operator: Equal
          value: "true"
          effect: NoSchedule
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - test
              topologyKey: kubernetes.io/hostname
FIS template
{
    "description": "AZ fail",
    "targets": {
        "Karp-EC2-Instances": {
            "resourceType": "aws:ec2:instance",
            "resourceTags": {
                "karpenter.k8s.aws/ec2nodeclass": "default"
            },
            "filters": [
                {
                    "path": "State.Name",
                    "values": [
                        "running"
                    ]
                },
                {
                    "path": "Placement.AvailabilityZone",
                    "values": [
                        "eu-west-1a"
                    ]
                }
            ],
            "selectionMode": "ALL"
        },
        "Karp-IAM-roles": {
            "resourceType": "aws:iam:role",
            "resourceArns": [
                "arn:aws:iam::111111111111:role/karpenter-controller-role",
                "arn:aws:iam::111111111111:role/karpenter-node-role"
            ],
            "selectionMode": "ALL"
        }
    },
    "actions": {
        "Pause-Instance-Launches": {
            "actionId": "aws:ec2:api-insufficient-instance-capacity-error",
            "parameters": {
                "availabilityZoneIdentifiers": "eu-west-1a",
                "duration": "PT10M",
                "percentage": "100"
            },
            "targets": {
                "Roles": "Karp-IAM-roles"
            }
        },
        "Terminate-Instances": {
            "actionId": "aws:ec2:terminate-instances",
            "parameters": {},
            "targets": {
                "Instances": "Karp-EC2-Instances"
            }
        }
    },
    "stopConditions": [
        {
            "source": "none"
        }
    ],
    "roleArn": "arn:aws:iam::111111111111:role/service-role/AWSFISIAMRole-1714174839723",
    "tags": {
        "Name": "AZ Availability: insufficient-instance-capacity-error"
    },
    "experimentOptions": {
        "accountTargeting": "single-account",
        "emptyTargetResolutionMode": "skip"
    }
}
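
For reference, assuming the template above is saved locally as fis-template.json (the file name and the template ID below are placeholders), the experiment can be registered and started with the AWS CLI along these lines:

aws fis create-experiment-template --cli-input-json file://fis-template.json
aws fis start-experiment --experiment-template-id <experiment-template-id>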

Versions:

  • Chart Version: 0.35.0
  • Kubernetes Version (kubectl version): 1.22.17
raychinov added the bug and needs-triage labels on May 8, 2024
@jonathan-innis
Contributor

jonathan-innis commented May 9, 2024

Can you share the NodePool and EC2NodeClass that you are using here? Can you also share the entire set of Karpenter controller logs from the FIS simulation? Can you also share what exactly you are doing/executing during the FIS simulation (it seems like you are just ICE-ing all instance types across the single AZ)?

jonathan-innis self-assigned this on May 9, 2024
@raychinov
Author

Hey Jonathan, thank you for looking into this. I've shared the NodePool definitions in the issue description, and here are the requested EC2NodeClass and controller logs. And yes, in the FIS simulation we are terminating the Karpenter-managed nodes and ICE-ing all instance types across the eu-west-1a AZ. Also, we use a custom AMI in the EC2NodeClass, but I don't think that is what causes the issue.

Looking into the logs again, it seems to me that what happens is:

  • During the FIS simulation, we terminate instances in eu-west-1a and tell the EC2 API to start responding with api-insufficient-instance-capacity-error for eu-west-1a requests for some time
  • Karpenter asks the API for new nodes in the eu-west-1a AZ to host the pending pods, but the requests get rejected and Karpenter removes the default NodePool instance types (just c7g.xlarge in this case; c7g.large doesn't seem to fit the requirements) from the offerings:
{"level":"DEBUG","time":"2024-05-10T09:17:20.363Z","logger":"controller.nodeclaim.lifecycle","message":"removing offering from offerings","commit":"2c8f2a5","nodeclaim":"default-g6c6x","reason":"InsufficientInstanceCapacity","instance-type":"c7g.xlarge","zone":"eu-west-1a","capacity-type":"spot","ttl":"3m0s"}
{"level":"ERROR","time":"2024-05-10T09:17:20.363Z","logger":"controller.nodeclaim.lifecycle","message":"creating instance, insufficient capacity, with fleet error(s), InsufficientInstanceCapacity: We currently do not have sufficient capacity in the Availability Zone you requested.","commit":"2c8f2a5","nodeclaim":"default-g6c6x"}
  • The problem is that the offering seems to be removed not just for the eu-west-1a Availability Zone but for the whole region, resulting in no nodes being launched from the fallback NodePool/AZ (see the sketch at the end of this comment):
{"level":"ERROR","time":"2024-05-10T09:17:29.334Z","logger":"controller","message":"Reconciler error","commit":"2c8f2a5","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":{"name":"default-6t977"},"namespace":"","name":"default-6t977","reconcileID":"ea21ad18-3206-49f1-a98b-2b19726ae3d5","error":"launching nodeclaim, creating instance, getting launch template configs, getting launch templates, no instance types satisfy requirements of amis ami-02fc209fc12289dde"}
{"level":"DEBUG","time":"2024-05-10T09:17:29.682Z","logger":"controller.disruption","message":"waiting on cluster sync","commit":"2c8f2a5"}
{"level":"DEBUG","time":"2024-05-10T09:17:29.788Z","logger":"controller.provisioner","message":"waiting on cluster sync","commit":"2c8f2a5"}

When we run the same test with a fallback NodePool containing a different set of instance types, the fallback does work and new instances in eu-west-1b and eu-west-1c are provisioned successfully.
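
For illustration, here is a minimal Go sketch (not Karpenter's actual code; all types and names are hypothetical) of the zone-scoped tracking of unavailable offerings we expected, where an ICE error for c7g.xlarge in eu-west-1a leaves the same instance type usable in eu-west-1b and eu-west-1c:

package main

import (
	"fmt"
	"time"
)

// offeringKey identifies a single offering: one capacity type of one
// instance type in one availability zone.
type offeringKey struct {
	capacityType string
	instanceType string
	zone         string
}

// unavailableOfferings caches ICE'd offerings until their back-off expires.
type unavailableOfferings struct {
	entries map[offeringKey]time.Time // offering -> expiry of the back-off
	ttl     time.Duration
}

func newUnavailableOfferings(ttl time.Duration) *unavailableOfferings {
	return &unavailableOfferings{entries: map[offeringKey]time.Time{}, ttl: ttl}
}

// MarkUnavailable records an InsufficientInstanceCapacity error for a single
// (capacity type, instance type, zone) combination only.
func (u *unavailableOfferings) MarkUnavailable(capacityType, instanceType, zone string) {
	u.entries[offeringKey{capacityType, instanceType, zone}] = time.Now().Add(u.ttl)
}

// IsUnavailable is true only for the exact zone that was ICE'd, so the same
// instance type remains a valid offering in every other zone.
func (u *unavailableOfferings) IsUnavailable(capacityType, instanceType, zone string) bool {
	expiry, ok := u.entries[offeringKey{capacityType, instanceType, zone}]
	return ok && time.Now().Before(expiry)
}

func main() {
	cache := newUnavailableOfferings(3 * time.Minute) // matches the 3m0s ttl seen in the log
	cache.MarkUnavailable("spot", "c7g.xlarge", "eu-west-1a")

	for _, zone := range []string{"eu-west-1a", "eu-west-1b", "eu-west-1c"} {
		fmt.Printf("spot/c7g.xlarge/%s unavailable: %v\n",
			zone, cache.IsUnavailable("spot", "c7g.xlarge", zone))
	}
}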

jonathan-innis removed the bug and needs-triage labels on May 13, 2024
@github-actions

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Jun 11, 2024