[DOC] adding prometheus-operator-cli design proposal #6435
Just for reference, I have a draft CLI working in this repository:
I think we changed our approach for new proposals and we're putting everything under `Documentation/proposals` instead of `Documentation/designs`. We still have the agent design over there, but I think we agreed on moving it some months ago 🤔 Could we also follow the design template? 🙈
But anyway, thanks for starting the proposal! The idea sounds awesome
### Why is my Prometheus resource not being created?

People were struggling to troubleshoot why their Prometheus resources were not being created. They were not sure whether the issue was with the Prometheus Operator or with the Prometheus resource itself.
After a long investigation, they found that the Prometheus Operator was only watching the namespace where it was deployed, while the Prometheus resource was being created in a different namespace.
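The namespace mismatch described above is the kind of check a CLI could automate. The following is a minimal sketch, not the proposed implementation: the function names are hypothetical, and it assumes that an empty list of watched namespaces means the operator watches every namespace.

```python
from typing import Sequence


def is_namespace_watched(resource_namespace: str,
                         watched_namespaces: Sequence[str]) -> bool:
    """Return True if the operator would see resources in this namespace."""
    # Assumption: an empty list means "watch all namespaces".
    if not watched_namespaces:
        return True
    return resource_namespace in watched_namespaces


def explain(resource_namespace: str, watched_namespaces: Sequence[str]) -> str:
    """Produce a short, human-readable diagnosis for the user."""
    if is_namespace_watched(resource_namespace, watched_namespaces):
        return f"namespace {resource_namespace!r} is watched by the operator"
    watched = ", ".join(watched_namespaces)
    return (f"namespace {resource_namespace!r} is not watched by the operator "
            f"(watched: {watched})")


if __name__ == "__main__":
    print(explain("apps", ["monitoring"]))
```

How the list of watched namespaces is obtained (for example, from the operator's deployment) is left out of the sketch on purpose.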
If I remember correctly, the problem was not about creating Prometheus resources but about configuring them correctly. Creating a single "Prometheus" resource is fairly simple, but setting up the right permissions, attaching service accounts, and troubleshooting target discovery were not straightforward.
These are just some common issues that users reported. Users face many other issues when managing Prometheus Operator resources, and we believe that a CLI tool can help them manage these resources more easily and efficiently.

## Proposal
It's not entirely clear how the CLI (as a kubectl plugin) could help with CI validations 🤔
For CI we don't need the Kubernetes cluster context, so I wonder if a kubectl plugin is the best choice here.
I'm impatient to experiment with this tool :)
I know we don't have an "official proposal template", but all proposals we have today follow the same structure (documented by the Thanos folks[1]). The template is organized to make sure that both the problem and the solution are clear, that we know who the interested parties are, etc. A well-written proposal also attracts more people to contribute in case the current owners eventually fall short on capacity :) Could we follow the proposal template?
Thanks! It's enough for me to get started with the new repository and experiment 👍
Apologies if I'm being too picky 😅
I still think the proposal is confusing, but it's definitely progressing!
# Pitfalls of the current solution

At the moment, people struggle to manage Prometheus Operator resources: they have to manually create, update, and delete them using `kubectl` or other tools, and troubleshooting and debugging these resources is challenging.
I think the issues you pointed out above can be explored and explained in this section instead of having subsections under `Why`.
Troubleshooting Alertmanager, Prometheus, and Thanos Ruler pod creation is very manual and requires domain knowledge about which RBAC permissions are needed.
Target discovery also requires domain knowledge: understanding label selectors, RBAC permissions, etc.
And we could add links to existing support issues here on GitHub, StackOverflow, or Slack to show that we have data to back this up.
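To illustrate the label-selector knowledge mentioned above, here is a small sketch of how a Kubernetes-style label selector (`matchLabels` plus `matchExpressions`) is evaluated against an object's labels. The function name `selector_matches` is hypothetical, and the sketch only mirrors the standard selector semantics; it is not code from the operator.

```python
from typing import Any, Mapping


def selector_matches(selector: Mapping[str, Any],
                     labels: Mapping[str, str]) -> bool:
    """Evaluate a Kubernetes-style label selector against a label set."""
    # Every matchLabels pair must be present with the same value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False

    # Every matchExpressions entry must hold as well (logical AND).
    for expr in selector.get("matchExpressions", []):
        key = expr["key"]
        operator = expr["operator"]
        values = expr.get("values", [])
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            # Objects without the key also satisfy NotIn.
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unknown operator: {operator!r}")
        if not ok:
            return False

    # An empty selector has no requirements and matches everything.
    return True


if __name__ == "__main__":
    service_labels = {"app": "api", "tier": "web"}
    print(selector_matches({"matchLabels": {"app": "api"}}, service_labels))
```

A troubleshooting command could run a check like this between a ServiceMonitor's selector and the labels of the Services in scope, and report which Services are skipped and why.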
And what do you think about not adding a subsection for each of the problems mentioned? I feel like it's overkill to add a very big title for just one paragraph 🤔
@ArthurSens could you review?
### Linting and Validation

Allow users to validate Prometheus Operator resources before creating them. The CLI should check whether the resource manifest is valid.
What do you mean by valid? Kubernetes already does validation when applying objects 🤔
The first use case I can see is the validation of the expressions used in rules.
You can do that on apply as well, no?
It's not something that the Kube API can do on its own. You need an admission webhook service: https://prometheus-operator.dev/docs/user-guides/webhook/#prometheusrule
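As a rough illustration of what client-side validation could look like without a cluster or an admission webhook, the sketch below runs a structural check over an already-parsed `PrometheusRule` manifest. It is an assumption-laden example: `lint_prometheus_rule` is a hypothetical name, and the check only verifies the shape of each rule; it does not parse or validate the PromQL expression itself.

```python
from typing import Any, List, Mapping


def lint_prometheus_rule(manifest: Mapping[str, Any]) -> List[str]:
    """Return a list of structural problems found in a PrometheusRule."""
    if manifest.get("kind") != "PrometheusRule":
        return [f"unexpected kind: {manifest.get('kind')!r}"]

    problems: List[str] = []
    groups = manifest.get("spec", {}).get("groups", [])
    for g_idx, group in enumerate(groups):
        if not group.get("name"):
            problems.append(f"groups[{g_idx}]: missing name")
        for r_idx, rule in enumerate(group.get("rules", [])):
            where = f"groups[{g_idx}].rules[{r_idx}]"
            # A rule is either an alerting rule or a recording rule.
            if ("alert" in rule) == ("record" in rule):
                problems.append(f"{where}: set exactly one of 'alert' or 'record'")
            # Structural check only: the expression is not parsed as PromQL.
            if rule.get("expr") in (None, ""):
                problems.append(f"{where}: empty expr")
    return problems


if __name__ == "__main__":
    manifest = {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "spec": {"groups": [{"name": "example", "rules": [{"expr": ""}]}]},
    }
    for problem in lint_prometheus_rule(manifest):
        print(problem)
```

Because the function works on plain data, it could run in CI against files in a repository, which is the case where no Kubernetes cluster context is available.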
I agree with this part!!! Let's create the repository and play a little bit, but I think we can improve the proposal before merging it.
Quick review, nothing blocking. Exciting proposal 🎉
For example, the CLI should allow the creation of a Prometheus object and the related Kubernetes objects such as service account, RBAC, service, pod disruption budget, and other related objects.
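The idea of creating a Prometheus object together with its related objects can be sketched as a function that returns a bundle of manifests. This is only an illustration under stated assumptions: `prometheus_bundle` is a hypothetical name, the RBAC rules are a reduced, illustrative set rather than the permissions the project recommends, and the Service and pod disruption budget are omitted for brevity.

```python
from typing import Any, Dict, List

Manifest = Dict[str, Any]


def prometheus_bundle(name: str, namespace: str) -> List[Manifest]:
    """Return a Prometheus object plus the objects it needs to run."""
    service_account: Manifest = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": name, "namespace": namespace},
    }
    cluster_role: Manifest = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        # Illustrative read-only rules for target discovery (assumption).
        "rules": [
            {
                "apiGroups": [""],
                "resources": ["services", "endpoints", "pods"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }
    cluster_role_binding: Manifest = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": name,
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": name, "namespace": namespace}
        ],
    }
    prometheus: Manifest = {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "Prometheus",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Ties the Prometheus pods to the service account created above.
            "serviceAccountName": name,
            "replicas": 1,
            "serviceMonitorSelector": {},
        },
    }
    return [service_account, cluster_role, cluster_role_binding, prometheus]


if __name__ == "__main__":
    for obj in prometheus_bundle("demo", "monitoring"):
        print(obj["kind"], obj["metadata"]["name"])
```

Generating the objects together keeps the names consistent, which is where manual setups tend to go wrong: the binding's subject, the service account, and `spec.serviceAccountName` must all agree.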
### Linting and Validation
Is one of the goals to replace `po-lint`?
That's a fair point!
I think we could review it.
wdyt @simonpasquier @ArthurSens @slashpai ?
Funnily enough, #6474 just popped up 😄
To the best of my knowledge, `po-lint` is mostly unmaintained and we don't ship it on the releases page or in container images.
I think there was an idea to use kubeconform instead of `po-lint`.
Just a question: in Contribfest, did participants report issues similar to GMP's? Either way, do you know how they solve this problem? Do they use an approach similar to this one?
Signed-off-by: Nicolas Takashi <nicolas.tcs@hotmail.com>
Co-authored-by: Simon Pasquier <spasquie@redhat.com>
Co-authored-by: Arthur Silva Sens <arthursens2005@gmail.com>
Some small nits, but good enough to get started :)
@simonpasquier @slashpai @xiu @ArthurSens could you review it? I think I've addressed the comments already.
Let's get it started :)
Description
Document proposal for `prometheus-operator-cli`.
Related to: #6423