
chore: Display CPU and Memory on nodepools, with -o wide you also get limits! #1055

Closed
wants to merge 1 commit

Conversation


@ROunofF ROunofF commented Feb 29, 2024

Fixes #1054

Description: Added columns for information that is already available on the NodePool so it's visible in kubectl get.

Output:

[karpenter]$ k get nodepools.karpenter.sh
NAME            NODECLASS       CPU   MEMORY
default         default         10    19846992Ki
karpenter-pub   karpenter-pub
karpenter-wl1   karpenter-wl1

[karpenter]$ k get nodepools.karpenter.sh -o wide
NAME            NODECLASS       CPU   CPU_LIMIT   MEMORY       MEM_LIMIT   WEIGHT
default         default         10    500         19846992Ki   20Ti
karpenter-pub   karpenter-pub         1k
karpenter-wl1   karpenter-wl1         400
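
For reference, this works through additionalPrinterColumns on the NodePool CRD (the generated karpenter.sh_nodepools.yaml applied below). A sketch of roughly what the generated entries look like; the exact column names and JSONPaths here are illustrative, inferred from the output above:

additionalPrinterColumns:
- name: NodeClass
  type: string
  jsonPath: .spec.template.spec.nodeClassRef.name
- name: CPU
  type: string
  jsonPath: .status.resources.cpu
- name: Memory
  type: string
  jsonPath: .status.resources.memory
- name: CPU_Limit
  type: string
  priority: 1          # priority >= 1 columns only appear with -o wide
  jsonPath: .spec.limits.cpu
- name: Mem_Limit
  type: string
  priority: 1
  jsonPath: .spec.limits.memory
- name: Weight
  type: integer
  priority: 1
  jsonPath: .spec.weight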

How was this change tested?: make build && make test && kubectl apply -f karpenter.sh_nodepools.yaml

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ROunofF
Once this PR has been reviewed and has the lgtm label, please assign mwielgus for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 29, 2024
@k8s-ci-robot (Contributor)

Welcome @ROunofF!

It looks like this is your first PR to kubernetes-sigs/karpenter 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/karpenter has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Feb 29, 2024
@k8s-ci-robot (Contributor)

Hi @ROunofF. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Feb 29, 2024
@jonathan-innis (Member) commented Mar 1, 2024

This is a cool idea and something that we are actively thinking about for v1 as we are going through our API design! What's the reason that you think that this would be generally useful to all users? I'm curious how you are using this. I saw that you mentioned in #1054 that you felt limited by the additionalPrinterColumns, so I'm curious in your ideal world what you would like the printer columns to look like for kubectl get nodeclaims, kubectl get nodepools and kubectl get <insert-cloudprovider-type>nodeclasses.

Realistically, if you want an easy out to get this printed on your side right now, you can try using kubectl custom columns.
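
For example (a hedged sketch; the JSONPaths assume the v1beta1 NodePool schema, with provisioned totals under status.resources and limits under spec.limits):

[karpenter]$ kubectl get nodepools.karpenter.sh -o custom-columns=\
'NAME:.metadata.name,NODECLASS:.spec.template.spec.nodeClassRef.name,CPU:.status.resources.cpu,CPU_LIMIT:.spec.limits.cpu,MEMORY:.status.resources.memory,MEM_LIMIT:.spec.limits.memory,WEIGHT:.spec.weight'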

@jonathan-innis (Member)

@ROunofF If you're interested in proposing this as another change that we should do at v1, we'd definitely welcome you proposing it in #758, describing your use-case for adding it and why you think this could be really useful for all users.

@ROunofF (Author) commented Mar 1, 2024

> This is a cool idea and something that we are actively thinking about for v1 as we are going through our API design! What's the reason that you think that this would be generally useful to all users? I'm curious how you are using this. I saw that you mentioned in #1054 that you felt limited by the additionalPrinterColumns, so I'm curious in your ideal world what you would like the printer columns to look like for kubectl get nodeclaims, kubectl get nodepools and kubectl get <insert-cloudprovider-type>nodeclasses.
>
> Realistically, if you want an easy out to get this printed on your side right now, you can try using kubectl custom columns.

Nodeclaims: I'm happy with the current output; I haven't needed anything that isn't included yet.

Nodepools: As the PR did, it's interesting to see the currently provisioned "total" of CPU and memory, as well as the weight, so that I can view the cluster's current provisioned state vs. the nodepools I have configured. I'm using an on-demand pool and bursting into a spot pool, plus specific pools for some applications (think GPU/custom resources). For the limit, I was more interested in the % usage, but additionalPrinterColumns doesn't allow calculation. The other improvement I would have made is to print values with a unit, so the number is never longer than 3 (or 4?) digits, i.e. anything from 5.xy to 500, plus the unit (19846992Ki would be printed as 18.9Gi).

My initial sketch was something like this:

[karpenter]$ k get nodepools.karpenter.sh -o wide
NAME            NODECLASS       CPU / LIMIT   %CPU   MEM / LIMIT     %MEM   WEIGHT
default         default         10 / 500      2%     18.9Gi / 20Ti   0.1%   10
karpenter-pub   karpenter-pub   - / 1k        0%     -               0%     10
karpenter-wl1   karpenter-wl1   - / 400       0%     -               0%     20
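
Since additionalPrinterColumns can't do the math, the percentage view would have to be computed client-side today. A rough jq sketch (assuming plain integer CPU quantities; suffixed quantities like 1k or 19846992Ki would need extra parsing):

[karpenter]$ kubectl get nodepools.karpenter.sh -o json | jq -r '
  .items[]
  | [ .metadata.name,
      (.status.resources.cpu // "-"),
      (.spec.limits.cpu // "-"),
      (if .status.resources.cpu and .spec.limits.cpu
       then "\((.status.resources.cpu | tonumber) * 100 / (.spec.limits.cpu | tonumber))%"
       else "-" end) ]
  | @tsv'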

awsnodetemplates: This one is pretty static for me, so I rarely look at it. Looking at the possibilities here, the list of resolved security groups, subnets, and AMI IDs could be of interest, but the full details would be hard to represent.

@ROunofF ROunofF mentioned this pull request Mar 1, 2024
@nerzhul commented Mar 1, 2024

> Nodepools: As the PR did, it's interesting to see the currently provisioned "total" of CPU and memory, as well as the weight [...]
>
> My initial sketch was something like this: [...]

Clearly a very nice output. We have a custom node controller at work, and as we move pieces over to Karpenter and evaluate it, people like the % usage output in kubectl very much; if we can have the same on nodepools it would be great.

@jonathan-innis (Member)

> people like the % usage output in kubectl very much; if we can have the same on nodepools it would be great

This is super interesting. If we didn't have the percentage output, would that be enough? I'm trying to avoid adding a status field only for the purpose of having that value output in printer columns. Another thing: do we need the same for ephemeral storage, or do we think that everyone largely just cares about CPU and memory? What if we printed the number of nodes that were tied to the NodePool? Would you want that printed as part of the printer columns? What if we had details on the number of disruptions that are currently allowed for the NodePool through the budgets? Would you want this printed in the printer columns? What about the current number of disrupted nodes?

There's going to be a lot of discussion on this once the v1 API RFC comes out, but I'm glad that we're starting the discussion on this now.

github-actions bot commented Mar 17, 2024

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 17, 2024
@ROunofF (Author) commented Mar 18, 2024

> > people like the % usage output in kubectl very much; if we can have the same on nodepools it would be great
>
> This is super interesting. If we didn't have the percentage output, would that be enough?

Yes, the percentage was just my initial idea, and it's easier to use in some kind of monitoring tool. But as humans, we can still figure out headroom from the real counts...

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 19, 2024

github-actions bot commented Apr 2, 2024

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 2, 2024
@jonathan-innis (Member)

/assign @jonathan-innis

@github-actions github-actions bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 17, 2024
@igor-nikiforov commented Apr 26, 2024

> If we didn't have the percentage output, would that be enough?

I personally think that both the values and the percentage are important; combined, they immediately give the overall picture, like when you use kubectl get node.

> What if we printed the number of nodes that were tied to the NodePool?

This is definitely what is missing here (especially when combined with #1160)

> What if we had details on the number of disruptions that are currently allowed for the NodePool through the budgets? Would you want this printed in the printer columns? What about the current number of disrupted nodes?

This could be part of -o wide, but most of the time users care only about used and available resources.

My ideal sketch is something like this:

NAME              NODECLASS         CPU                MEMORY                     NODES      WEIGHT
default           default           46 / -  (0 %)      16 Gi / 128 Gi (12.5%)     7 / -      10
karpenter-pub     karpenter-pub     10 / 500 (2 %)     72 Gi / - (0 %)            2 / 10     10

github-actions bot commented May 11, 2024

This PR has been inactive for 14 days. StaleBot will close this stale PR after 14 more days of inactivity.

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 11, 2024
@github-actions github-actions bot closed this May 26, 2024
Development

Successfully merging this pull request may close these issues.

Add CPU/Memory visibility on nodepools CRD for kubectl get (through additionalPrinterColumns)