docs: design doc for supporting sa authencation for RGW with vault #10319
base: master
There are different ways RGW can authenticate with the help of the [vault agent](https://docs.ceph.com/en/latest/radosgw/vault/#vault-agent), but only service account authentication is supported here.

## Proposal details
The service account details will be specified in `ConnectionDetails` of `KeyManagementServiceSpec`. Using that information plus the other details, a config map named `<ceph-object-store>-vault-agent-cm` is populated for the vault agent sidecar container. The sidecar container details will be added to the RGW pod spec so that the sidecar starts with the RGW pod. The RGW will be configured with vault-agent-specific options.
- What are the contents of the configmap? How about an example yaml of the configmap contents?
- Why do we need a configmap? For example, could Rook just generate the contents of the configmap in a file in an init container? Not sure if that makes sense in this case, but I'm curious to explore other approaches.
The configmap contains the configuration used to start the vault agent. I will add a sample here.
### Risks and Mitigation
Users will be able to modify the `vault-agent-cm`, which is not preferable.
Who owns the content of the CM? Does the operator just generate it?
The Rook operator generates it based on values from the connection details in the KMS config. Please refer here.
This risk doesn't seem necessary. Only admins are expected to have access to the rook namespace, so normal users cannot modify the configmap. If a user has access to the rook namespace, they could destroy everything, so no need to call this risk out.
a5e8b68 to 6ccfe07
There are different ways RGW can authenticate with the help of the [vault agent](https://docs.ceph.com/en/latest/radosgw/vault/#vault-agent), but only service account authentication is supported here.

## Proposal details
The service account details will be specified in `ConnectionDetails` of `KeyManagementServiceSpec`. Using that information plus the other details, a config map named `<ceph-object-store>-vault-agent-cm` is populated for the vault agent sidecar container. The sidecar container details will be added to the RGW pod spec so that the sidecar starts with the RGW pod. The RGW will be configured with vault-agent-specific options. For the following details in `ConnectionDetails`:
Is this sidecar required? It is non-ideal for Rook to hard-code support for Vault, especially in today's world where the vault injector is available. Why don't we instruct users to use the injector instead?
If the RGW can connect directly to Vault, why should we also implement this? I think that would have the same effect as this, no?
What does this do for users beyond add confusion due to a second way of configuring the same feature? Why should we go through the development and maintenance effort for this if there is a simpler alternative that accomplishes the same thing?
> Is this sidecar required? It is non-ideal for Rook to hard-code support for Vault, especially in today's world where the vault injector is available. Why don't we instruct users to use the injector instead?

The Vault injector's job is different: it does not authenticate on behalf of applications. It just injects Vault secrets directly into the application pod as files, acting as a kind of webhook.
> If the RGW can connect directly to Vault, why should we also implement this? I think that would have the same effect as this, no?
> What does this do for users beyond add confusion due to a second way of configuring the same feature? Why should we go through the development and maintenance effort for this if there is a simpler alternative that accomplishes the same thing?

RGW can authenticate with Vault directly using a token, but that is considered the primitive method. The vault agent provides different flavours of authentication and hence is preferred over token authentication. Since the workload runs in k8s environments, users expect a way to authenticate with the service account. Hence it was added for OSD encryption as well, though a vault agent was not used there. For RGW, however, this authentication requires a vault agent. Another approach is to make RGW authenticate with Vault directly using the service account, but upstream developers are not keen on that, since Ceph does not know about k8s, service accounts, etc.
I decided to spend my morning looking into Vault and SA-based Auth. I think the KMS and Kubernetes integration landscape has grown compared to when Vault support was initially added. Of note, if users set up the kubernetes auth method to allow Vault to give KMS secrets to apps via service account, I don't see anything that suggests a Vault sidecar container is required.

I think it will be best if we don't have to bind Vault-awareness into Rook more than we have to, and I think we should try to avoid creating a Vault pod (sidecar) if possible. While starting a vault agent on nodes may be the best strategy for deploying RGW on bare metal, I suspect there are more K-native methods available to us for Rook. We shouldn't mirror Ceph's complexity in Rook unless it's critical.

Vault's agent injector exists, and I think this is a way we can add the vault sidecar without having to code the pod details into Rook.

Or, I think there may be another option, which seems from my research today that it won't require a sidecar or the agent injector. From what I can tell, if the "Kubernetes" auth method is set up for Vault, and the Kubernetes version is 1.21+, a key will be added to

These are the chief resources I found that break down the "Kubernetes" auth method:
For service account authentication, a vault sidecar is not a must. The issue here is that RGW cannot directly authenticate with Vault using the service account, but it can do so with the help of a vault agent. Hence the vault agent is added as a sidecar to the RGW pod, so that RGW authenticates with Vault via the vault agent using the service account.
Yes, that's correct, and I even tried to add that support in the RGW codebase (ceph/ceph#37868), which makes RGW understand k8s service account tokens. From a design point of view, I felt the right approach was the vault agent. In the end, the service account token is a JWT, which was already supported by RGW with the vault agent.
  use_auto_auth_token = true
}
listener "tcp" {
  address = "127.0.0.1:8100"
Does a different port ever need to be configured? For example, I see port 8200 used above on line 25.
?
- What image would the sidecar use?
- What are recommended resource requests/limits?
- How about adding a link to the vault agent docs for this?
Users will be able to modify the `vault-agent-cm`, which is not preferable.

## Config Commands
Users can bring up the `vault agent` themselves as a separate deployment configured with the service account authentication details, and then set the following for RGW via the toolbox pod:
If the vault agent can be started as a separate deployment, is only one agent needed for the whole cluster? If so, should we anyway create a vault agent deployment instead of an rgw sidecar? Or why should it be a sidecar for rgw?
Good question!
Also a potential consideration: can we expect users to run the agent themselves so they can have as much control over its configuration as they want/need?
I have done a fair bit more research on this. I believe that Vault's model that the vault agent runs as a sidecar is merely a requirement because it assumes the agent injects secrets into a shared pod volume. If RGW is talking to vault agent directly, there is no need to run the agent as a sidecar, because there is no need to have the agent inject anything into pod shared directories.
This means we can have (and capture in the design doc) the decision to either use a sidecar or to use a standalone vault agent that is shared by all RGWs.
Using sidecars will naturally increase our resource footprint, a negative. However, the RGW can contact the sidecar on `localhost` within the Pod.
Using a standalone agent will decrease resources, especially for large numbers of RGWs. However, RGWs will likely be run on many nodes, and a shared agent will add additional host-to-host latency to each S3 call IIUC.
Regarding the agent injector, while it is intended to be used to inject specific secrets into Pods, I believe it has the flexibility to configure the agent as a listener as well by modifying the ConfigMap as documented here: https://www.vaultproject.io/docs/platform/k8s/injector/examples#configmap-example
We can set the listener config in the ConfigMap to have the agent sidecar automatically injected and listening on a port expected by the RGW.
In my opinion, a better design may be to allow users the flexibility to choose whether they want to run a standalone vault agent or whether they want sidecars. It would be easy to add an option to the CephObjectStore `security` spec that defines the address of the vault agent: `spec.security.vaultAgentHostname`.
If they want sidecars, then they can...
- set `spec.security.vaultAgentHostname` to `localhost:<port>`
- configure the agent `listener` to `localhost:<port>` in the configmap
- add the below annotations on the CephObjectStore.

    annotations:
      vault.hashicorp.com/agent-inject: 'true'
      vault.hashicorp.com/agent-configmap: 'my-configmap'

If they want to use a standalone vault agent, then they merely need to set `spec.security.vaultAgentHostname` to the Service that provides the vault agent.
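As a rough illustration of the sidecar option described above, a CephObjectStore might look something like the following sketch. The `vaultAgentHostname` field is only the option proposed in this comment and does not exist in the Rook API; the store name, port `6543`, and the placement of the annotations under `spec.gateway.annotations` (so that they reach the RGW pods) are likewise assumptions made for illustration.

```yaml
# Sketch only: illustrates the proposed sidecar setup, not a tested manifest.
# Pool settings and KMS connection details are omitted for brevity.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store              # hypothetical name
  namespace: rook-ceph
spec:
  gateway:
    port: 80
    instances: 1
    annotations:
      # Ask the Vault agent injector to add an agent sidecar to the RGW pods.
      vault.hashicorp.com/agent-inject: 'true'
      # ConfigMap holding the agent configuration, including the listener.
      vault.hashicorp.com/agent-configmap: 'my-configmap'
  security:
    # Hypothetical field proposed in this thread; not part of the Rook API.
    vaultAgentHostname: localhost:6543
```

For the standalone variant, the annotations would presumably be dropped and `vaultAgentHostname` pointed at the Service in front of the shared agent instead.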
@BlaineEXE @travisn: A standalone vault agent is also an option. The user needs to set it up independently, as mentioned, but I am not sure about any performance impact; for Ceph, it is always local to the RGW server. We can add an option either in the `CephObjectStore` or in the connection details string map. For RGW it is just an endpoint: it can be the vault server directly, a local vault agent, or a separate deployment.
After checking the vault injector examples, I still can't figure out the listener mode mentioned above. The vault injector can mount the `vault-agent` config map into the pod; apart from that, I don't see much of a use case for the vault injector. Am I missing something?
From the vault docs, `listener` specifies the address on which the vault agent listens. I believe this means we can (using the configmap) tell the injected agent to listen for RGW requests in the pod at something like `localhost:6543`. We can then configure `vaultAgentHostname` to be `localhost:6543`.
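A minimal sketch of what such a ConfigMap could contain, assuming the injector reads the agent configuration from a `config.hcl` key as in the Vault injector ConfigMap example. The Vault address, the `rook-ceph` role, the auth mount path, the sink path, and port `6543` are placeholders rather than values taken from this PR.

```yaml
# Sketch only: a Vault agent config exposing a local listener for the RGW.
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-configmap
  namespace: rook-ceph
data:
  config.hcl: |
    # Address of the real Vault server (placeholder).
    vault {
      address = "https://vault.example.com:8200"
    }
    # Log in with the pod's service account token via the Kubernetes auth method.
    auto_auth {
      method "kubernetes" {
        mount_path = "auth/kubernetes"
        config = {
          role = "rook-ceph"
        }
      }
      sink "file" {
        config = {
          path = "/home/vault/.token"
        }
      }
    }
    # Attach the auto-auth token to requests proxied through the agent.
    cache {
      use_auto_auth_token = true
    }
    # The RGW talks to this address instead of the Vault server.
    listener "tcp" {
      address     = "127.0.0.1:6543"
      tls_disable = true
    }
```

The injector example also includes a `config-init.hcl` key for the init container; whether it is needed for a listener-only setup is not settled in this thread.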
@BlaineEXE: Yes, I understand `listener` with the vault agent; sorry, I was confused about how it relates to the vault injector.
PR #9872 has a similar configuration or approach.
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.
@thotz What's the status of this PR?
Closing this PR since I am not planning to work on it and there is very little interest in the feature.
I think we can either merge whatever we have now or open an issue tracking this so anyone else can pick it up.
If someone is interested in picking it up, we already have the design doc, proposed PR, tracker issue, etc. @travisn do you have a way to track features like this even if the PR is closed, like a new label or something?
A GitHub issue should be opened if we still need to track the feature request.
6ccfe07 to 69d80b4
@BlaineEXE @travisn please review the latest version of this PR.
@thotz Can you provide more context on reopening this old design doc? Was there a request again for this feature?
@@ -0,0 +1,143 @@
---
Service authentication with vault for RGW
target-version: release-1.14
You're planning on implementing it immediately for the release soon?
Yes, that's the plan. The code changes for this implementation are pretty minimal. I will push the implementation PR along with the design PR.
69d80b4 to c0e4582
VAULT_AUTH_KUBERNETES_ROLE: rook-ceph

---
kind: ConfigMap
Does the user create this configmap, or does Rook generate it? We need to be clear about what the admin needs to create, and what Rook will generate.
IMO, since there are a lot of options, I prefer that the user configure the cm rather than the Rook operator. We can provide a sample (minimal) configuration as above.
The design doc for supporting service account authentication for RGW while configuring with Vault. The OSD encryption already support it. Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>
c0e4582 to 8444479
Description of your changes:
The design doc for supporting service account authentication for RGW
while configuring with Vault. The OSD encryption already support it.
Signed-off-by: Jiffin Tony Thottan jthottan@redhat.com
Which issue is resolved by this Pull Request:
Resolves #