Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with kustomize and Karpenter Helm Charts When using kustomize with Karpenter's Helm charts for deployment #6142

Open
praseedasathaye opened this issue May 3, 2024 · 3 comments
Assignees
Labels
lifecycle/stale question Further information is requested

Comments

@praseedasathaye
Copy link
Contributor

Description

There is an issue with kustomize and Karpenter Helm Charts When using kustomize with Karpenter's Helm charts for deployment, customer ran into some undocumented issues/bugs related to resources (https://github.com/aws/karpenter-provider-aws/blob/main/charts/karpenter/templates/rolebinding.yaml#L45) in Unexpected Namespaces. Specifically, Karpenter deploys some resources in the "kube-node-lease" namespace instead of its own namespace, which caused issues that were not documented.

More details below:
The main Karpenter Helm chart includes the custom resource definitions (CRDs) directly, which breaks when using customized installations/rendering with tools like Kustomize. This is considered a bad practice as CRDs should be packaged separately.
The documentation instructs to just install the main Karpenter chart, but doesn't mention installing the separate CRD chart first, leading to confusion.
Within the main Karpenter chart, some resources like Roles and RoleBindings are unexpectedly deployed in different namespaces like "kube-node-lease" instead of the deployment namespace "kube-system".
This cross-namespace deployment of resources is undocumented and violates the principle of keeping all resources contained within the defined namespace for an application.
When using customized deployments that enforce setting the top-level namespace, it overrides the namespaces for these cross-namespace resources, leading to issues like the pod not having permissions to list "kube-node-lease" .
Customer had to manually edit role bindings in the live cluster to fix these permission issues stemming from the non-standard namespace layouts.

@praseedasathaye praseedasathaye added documentation Improvements or additions to documentation needs-triage Issues that need to be triaged labels May 3, 2024
@engedaam
Copy link
Contributor

engedaam commented May 6, 2024

Upstream Kubernetes recently had an issue where node leases were being leaked due to terminating nodes before the underlaying instance had been full terminated link. Clusters that used Karpenter were particular impacted due to the fact that Karpenter does not wait for underlaying instance to be terminated before deleting a node. More information on the bug can be found here: #4363. Karpenter implemented a garbage collection on node lease, which required us to provide these permissions in the kube-node-lease namespace kubernetes-sigs/karpenter#471. The team has all the cluster permission documented in our treat model provided here: https://karpenter.sh/docs/reference/threat-model/#karpenter-controller. Changing the namespace will not allow Karpenter to delete leaked node leases.

Karpenter does have and maintain a separate repository for the CRDs, however it is mentioned to customer as a means to upgrade their CRDs: https://karpenter.sh/docs/upgrading/upgrade-guide/#crd-upgrades. Do you see a benefit of adding that as part of the getting started guide?

@jonathan-innis
Copy link
Contributor

This is considered a bad practice as CRDs should be packaged separately

I'd also disagree with this statement. Adding CRDs to a helm chart isn't seen as bad practice. This is actually the straight-forward, recommended way to install CRDs by the helm project itself. Now, does it allow you to upgrade the CRDs? No. Which is why we created the other chart that you can use to manage that upgrade.

We're having to strike a balance with our Getting Started guide being simple enough for most users to just onboard to without thinking about it and what's the right technical path for the majority of users. I could see an argument for recommending installing the helm chart for CRDs as a separate step in our docs, but it will definitely add an additional step in the process.

@engedaam engedaam added question Further information is requested and removed documentation Improvements or additions to documentation needs-triage Issues that need to be triaged labels May 14, 2024
Copy link
Contributor

This issue has been inactive for 14 days. StaleBot will close this stale issue after 14 more days of inactivity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lifecycle/stale question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants