docs: design doc for supporting sa authentication for RGW with vault #10319

thotz · 2022-05-24T19:05:35Z

Description of your changes:
The design doc for supporting service account authentication for RGW
when it is configured with Vault. OSD encryption already supports it.

Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
Skip Tests for Docs: If this is only a documentation change, add the label skip-ci on the PR.
Reviewed the developer guide on Submitting a Pull Request
Pending release notes updated with breaking and/or notable changes for the next minor release.
Documentation has been updated, if necessary.
Unit tests have been added, if necessary.
Integration tests have been added, if necessary.

travisn · 2022-05-24T21:45:20Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+There are different ways RGW can authenticate with help of [vault agent](https://docs.ceph.com/en/latest/radosgw/vault/#vault-agent), but only service account authentication is supported here.
+
+## Proposal details
+The service account details will specified in `ConnectionDetails` of `KeyManagementServiceSpec`, using that info plus the other details a config map with <ceph-object-store>-vault-agent-cm populated for the vault agent sidecar container. The side container details will be added RGW pod spec and start sidecar container with RGW pod. The RGW will be configured with vault agent specific options. 


What are the contents of the configmap? How about an example yaml of the configmap contents?

Why do we need a configmap? For example, could Rook just generate the contents of the configmap in a file in an init container? Not sure if that makes sense in this case, but I'm curious to explore other approaches.

The configmap contains the configuration for starting the vault agent. I will add a sample here.
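
As a rough illustration of what such a configmap might contain, the sketch below assembles a minimal vault agent configuration from fragments quoted in this review (`use_auto_auth_token = true`, the `127.0.0.1:8100` listener, and the `rook-ceph` Kubernetes auth role). The configmap name, namespace, data key, auth mount path, and Vault address are assumptions made for illustration, not values taken from the design doc.

```yaml
# Hypothetical example: names and addresses below are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  # Assumed to follow the <ceph-object-store>-vault-agent-cm pattern.
  name: my-store-vault-agent-cm
  # Assumed namespace.
  namespace: rook-ceph
data:
  # Assumed key name; the value is vault agent configuration in HCL.
  config.hcl: |
    # Log in to Vault with the pod's service account token.
    auto_auth {
      method "kubernetes" {
        # Assumed mount path of the Kubernetes auth method.
        mount_path = "auth/kubernetes"
        config = {
          role = "rook-ceph"
        }
      }
    }
    # Attach the auto-auth token to requests proxied on behalf of RGW.
    cache {
      use_auto_auth_token = true
    }
    # RGW would talk to the agent on this local address.
    listener "tcp" {
      address     = "127.0.0.1:8100"
      tls_disable = true
    }
    # Upstream Vault server (placeholder address).
    vault {
      address = "http://vault.example.com:8200"
    }
```

With a layout like this, RGW only needs to know the listener address, while the Vault server address and the auth role stay inside the agent configuration.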

travisn · 2022-05-24T21:45:41Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+The service account details will specified in `ConnectionDetails` of `KeyManagementServiceSpec`, using that info plus the other details a config map with <ceph-object-store>-vault-agent-cm populated for the vault agent sidecar container. The side container details will be added RGW pod spec and start sidecar container with RGW pod. The RGW will be configured with vault agent specific options. 
+
+### Risks and Mitigation
+User will be able to modify the `vault-agent-cm` which is not preferable


Who owns the content of the CM? Does the operator just generate it?

The Rook operator generates it based on the values from the connection details in the KMS config. Please refer here.

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

BlaineEXE

Is there a way for users to manually configure this today? If so, the design should also include those steps so we have an idea of what the operator needs to do to implement this?

What is the relative priority of this work compared to #10316 and #10318?

BlaineEXE · 2022-05-25T18:10:47Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+There are different ways RGW can authenticate with help of [vault agent](https://docs.ceph.com/en/latest/radosgw/vault/#vault-agent), but only service account authentication is supported here.
+
+## Proposal details
+The service account details will specified in `ConnectionDetails` of `KeyManagementServiceSpec`, using that info plus the other details a config map with <ceph-object-store>-vault-agent-cm populated for the vault agent sidecar container. The side container details will be added RGW pod spec and start sidecar container with RGW pod. The RGW will be configured with vault agent specific options. For the following details in `ConnectionDetails`:


Is this sidecar required? It is non-ideal for Rook to hard-code support for Vault, especially in today's world where the vault injector is available. Why don't we instruct users to use the injector instead?

> Is this sidecar required? It is non-ideal for Rook to hard-code support for Vault, especially in today's world where the vault injector is available. Why don't we instruct users to use the injector instead?

The Vault injector's job is different: it does not handle authentication for applications. It just injects Vault secrets directly into the application pod as files, acting as a kind of webhook.

> If the RGW can connect directly to Vault, why should we also implement this? I think that would have the same effect as this, no?
>
> What does this do for users beyond add confusion due to a second way of configuring the same feature? Why should we go through the development and maintenance effort for this if there is a simpler alternative that accomplishes the same thing?

RGW can authenticate with Vault directly using a token, but that is considered the primitive method. The vault agent supports several authentication flavours and is therefore preferred over token authentication. Since the workload runs in k8s environments, users expect a way to authenticate with the service account. That is why it was added for OSD encryption as well, although no vault agent was used there. For RGW, however, this authentication requires a vault agent. Another approach would be to make RGW authenticate with Vault directly using the service account, but upstream developers are not keen on it since Ceph does not know about k8s, service accounts, etc.

BlaineEXE · 2022-05-25T18:35:18Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+There are different ways RGW can authenticate with help of [vault agent](https://docs.ceph.com/en/latest/radosgw/vault/#vault-agent), but only service account authentication is supported here.
+
+## Proposal details
+The service account details will specified in `ConnectionDetails` of `KeyManagementServiceSpec`, using that info plus the other details a config map with <ceph-object-store>-vault-agent-cm populated for the vault agent sidecar container. The side container details will be added RGW pod spec and start sidecar container with RGW pod. The RGW will be configured with vault agent specific options. For the following details in `ConnectionDetails`:


If the RGW can connect directly to Vault, why should we also implement this? I think that would have the same effect as this, no?

What does this do for users beyond add confusion due to a second way of configuring the same feature? Why should we go through the development and maintenance effort for this if there is a simpler alternative that accomplishes the same thing?

thotz · 2022-05-26T17:18:21Z

> Is there a way for users to manually configure this today? If so, the design should also include those steps so we have an idea of what the operator needs to do to implement this?
>
> What is the relative priority of this work compared to #10316 and #10318?

Please check comment

BlaineEXE

Requesting changes while this is waiting in priority line behind #10318 and #10323 (in that order).

BlaineEXE · 2022-06-10T19:19:04Z

I decided to spend my morning looking into Vault and SA-based Auth. I think the KMS and Kubernetes integration landscape has grown compared to when Vault support was initially added.

Of note, if users set up the kubernetes auth method to allow Vault to give KMS secrets to apps via service account, I don't see anything that suggests a Vault sidecar container is required. I think it will be best if we don't have to bind Vault-awareness into Rook more than we have to, and I think we should try to avoid creating a Vault pod (sidecar) if possible. While starting a vault agent on nodes may be the best strategy for deploying RGW on bare metal, I suspect there are more K-native methods available to us for Rook. We shouldn't mirror Ceph's complexity in Rook unless it's critical.

Vault's agent injector exists, and I think this is a way we can add the vault sidecar without having to code the pod details into Rook.

Or, I think there may be another option which, from my research today, won't require a sidecar or the agent injector. From what I can tell, if the "Kubernetes" auth method is set up for Vault, and the Kubernetes version is 1.21+, a key will be added to /var/run/secrets/kubernetes.io/serviceaccount automatically. What I hope this means is that we can direct the RGW to find the auth details in that directory to authenticate with Vault.

These are the chief resources I found that break down the "Kubernetes" auth method:

thotz · 2022-06-13T07:39:43Z

> I decided to spend my morning looking into Vault and SA-based Auth. I think the KMS and Kubernetes integration landscape has grown compared to when Vault support was initially added.
>
> Of note, if users set up the kubernetes auth method to allow Vault to give KMS secrets to apps via service account, I don't see anything that suggests a Vault sidecar container is required.

A vault sidecar is not a must for service account authentication. The issue here is that RGW cannot authenticate with Vault directly using the service account, but it can do so with the help of a vault agent. Hence the vault agent is added as a sidecar to the RGW pod, and RGW authenticates with Vault via the vault agent using the service account.

> I think it will be best if we don't have to bind Vault-awareness into Rook more than we have to, and I think we should try to avoid creating a Vault pod (sidecar) if possible. While starting a vault agent on nodes may be the best strategy for deploying RGW on bare metal, I suspect there are more K-native methods available to us for Rook. We shouldn't mirror Ceph's complexity in Rook unless it's critical.
>
> Vault's agent injector exists, and I think this is a way we can add the vault sidecar without having to code the pod details into Rook.
>
> Or, I think there may be another option which, from my research today, won't require a sidecar or the agent injector.
>
> From what I can tell, if the "Kubernetes" auth method is set up for Vault, and the Kubernetes version is 1.21+, a key will be added to /var/run/secrets/kubernetes.io/serviceaccount automatically. What I hope this means is that we can direct the RGW to find the auth details in that directory to authenticate with Vault.

Yes, that's correct. I even tried to add that support to the RGW codebase in ceph/ceph#37868, which makes RGW understand k8s service account tokens. From a design point of view, though, I felt the vault agent was the right approach. In the end, the service account token is a JWT, which RGW already supports via the vault agent.

> These are the chief resources I found that break down the "Kubernetes" auth method:
>
> * https://www.vaultproject.io/docs/auth/kubernetes#kubernetes-1-21
> * https://kubernetes.io/docs/reference/access-authn-authz/service-accounts-admin/#serviceaccount-admission-controller

travisn · 2022-06-17T22:54:33Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+        use_auto_auth_token = true
+   }
+   listener "tcp" {
+        address = "127.0.0.1:8100"


Does a different port ever need to be configured? For example, I see port 8200 used above on line 25.

travisn · 2022-06-17T22:55:46Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+The service account details will specified in `ConnectionDetails` of `KeyManagementServiceSpec`, using that info plus the other details a config map with <ceph-object-store>-vault-agent-cm populated for the vault agent sidecar container. The side container details will be added RGW pod spec and start sidecar container with RGW pod. The RGW will be configured with vault agent specific options. 
+
+### Risks and Mitigation
+User will be able to modify the `vault-agent-cm` which is not preferable


This risk doesn't seem necessary. Only admins are expected to have access to the rook namespace, so normal users cannot modify the configmap. If a user has access to the rook namespace, they could destroy everything, so no need to call this risk out.

travisn · 2022-06-17T22:59:35Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+There are different ways RGW can authenticate with help of [vault agent](https://docs.ceph.com/en/latest/radosgw/vault/#vault-agent), but only service account authentication is supported here.
+
+## Proposal details
+The service account details will specified in `ConnectionDetails` of `KeyManagementServiceSpec`, using that info plus the other details a config map with <ceph-object-store>-vault-agent-cm populated for the vault agent sidecar container. The side container details will be added RGW pod spec and start sidecar container with RGW pod. The RGW will be configured with vault agent specific options. For the following details in `ConnectionDetails`:


What image would the sidecar use?

What are recommended resource requests/limits?

How about adding a link to the vault agent docs for this?

travisn · 2022-06-17T23:02:28Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+User will be able to modify the `vault-agent-cm` which is not preferable
+
+## Config Commands
+The user can itself bring up `vault agent` as separate deployment configuring with service account authentication details. Then set following for RGW via toolbox pod:


If the vault agent can be started as a separate deployment, is only one agent needed for the whole cluster? If so, should we anyway create a vault agent deployment instead of an rgw sidecar? Or why should it be a sidecar for rgw?

Good question!

Also a potential consideration: can we expect users to run the agent themselves so they can have as much control over its configuration as they want/need?

I have done a fair bit more research on this. I believe that Vault's model that the vault agent runs as a sidecar is merely a requirement because it assumes the agent injects secrets into a shared pod volume. If RGW is talking to vault agent directly, there is no need to run the agent as a sidecar, because there is no need to have the agent inject anything into pod shared directories.

This means we can have (and capture in the design doc) the decision to either use a sidecar or to use a standalone vault agent that is shared by all RGWs.

Using sidecars will naturally increase our resource footprint, a negative. However, the RGW can contact the sidecar on localhost within the Pod.

Using a standalone agent will decrease resources, especially for large numbers of RGWs. However, RGWs will likely be run on many nodes, and a shared agent will add additional host-to-host latency to each S3 call IIUC.

Regarding the agent injector, while it is intended to be used to inject specific secrets into Pods, I believe it has the flexibility to configure the agent as a listener as well by modifying the ConfigMap as documented here: https://www.vaultproject.io/docs/platform/k8s/injector/examples#configmap-example

We can set the listener config in the ConfigMap to have the agent sidecar automatically injected and listening on a port expected by the RGW.

In my opinion, a better design may be to allow users the flexibility to choose whether they want to run a standalone vault agent or whether they want sidecars. It would be easy to add an option to the CephObjectStore security spec that defines the address of the vault agent: spec.security.vaultAgentHostname.

If they want sidecars, then they can...

* set spec.security.vaultAgentHostname to localhost:<port>
* configure the agent listener to localhost:<port> in the configmap
* add the below annotations on the CephObjectStore:
  * `vault.hashicorp.com/agent-inject: 'true'`
  * `vault.hashicorp.com/agent-configmap: 'my-configmap'`

If they want to use a standalone vault agent, then they merely need to set spec.security.vaultAgentHostname to the Service that provides the vault agent.
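
The two options could be sketched roughly as follows. Note that `spec.security.vaultAgentHostname` is only a field proposed in this comment, not an existing CephObjectStore setting. The object store names, namespace, port, and Service hostname are placeholders, and placing the annotations under `metadata` is an assumption, since the comment only says "on the CephObjectStore".

```yaml
# Option 1 (sidecar): the agent injector adds the vault agent to the RGW pod.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store-sidecar        # hypothetical name
  namespace: rook-ceph          # assumed namespace
  annotations:
    vault.hashicorp.com/agent-inject: 'true'
    vault.hashicorp.com/agent-configmap: 'my-configmap'
spec:
  security:
    # Proposed field; must match the listener address in the configmap.
    vaultAgentHostname: localhost:8100
---
# Option 2 (standalone): all RGWs share one vault agent behind a Service.
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store-standalone     # hypothetical name
  namespace: rook-ceph          # assumed namespace
spec:
  security:
    # Proposed field; the Service hostname below is a placeholder.
    vaultAgentHostname: vault-agent.rook-ceph.svc:8100
```

In both cases RGW sees only an address, which is what would let the same field cover the sidecar and the shared-agent deployments.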

@BlaineEXE @travisn: A standalone vault agent is also an option. The user needs to set it up independently, as mentioned, but I am not sure about the performance impact; for Ceph it is always local to the RGW server. We can add an option either in the CephObjectStore or in the connection details string map. For RGW it is just an endpoint: it can be the vault server directly, a local vault agent, or a separate deployment.
After checking the examples of the vault injector, I still can't figure out the listener mode mentioned above. The vault injector can mount the vault agent's config map into the pod; apart from that, I don't see much of a use case for the vault injector. Am I missing something?

From the vault docs, listener specifies the address on which the vault agent listens. I believe this means we can (using the configmap) tell the injected agent to listen for RGW requests in the pod at something like localhost:6543. We can then configure vaultAgentHostname to be localhost:6543.

@BlaineEXE: Yes, I understand the listener with the vault agent; sorry, I was confused about how it relates to the vault injector. PR #9872 has a similar configuration or approach.

github-actions · 2022-08-05T20:01:57Z

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

travisn · 2023-04-04T20:30:39Z

@thotz What's the status of this PR?

thotz · 2023-08-09T15:54:27Z

Closing this PR since I am not planning to work on it and there is very little interest in the feature.

parth-gr · 2023-08-09T16:35:53Z

I think we could merge whatever we have now, or open an issue to track this so anyone else can pick it up.

thotz · 2023-08-10T07:10:15Z

If someone is interested in picking it up, we already have the design doc, the proposed PR, the tracker issue, etc. @travisn do we have a way to track features like this even when the PR is closed, like a new label or something?

travisn · 2023-08-10T18:24:23Z

> If someone is interested in picking it up, we already have the design doc, the proposed PR, the tracker issue, etc. @travisn do we have a way to track features like this even when the PR is closed, like a new label or something?

A github issue should be opened if we still need to track the feature request.

thotz · 2024-03-12T14:54:20Z

@BlaineEXE @travisn please review the latest version of this PR

travisn

@thotz Can you provide more context on reopening this old design doc? Was there a request again for this feature?

travisn · 2024-03-12T18:21:08Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

@@ -0,0 +1,143 @@
+---
+Service authentication with vault for RGW
+target-version: release-1.14


You're planning on implementing it immediately for the release soon?

Yes, that's the plan. The code changes for this implementation are pretty minimal. I will push the implementation PR along with the design PR.

travisn · 2024-03-12T18:21:22Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+        use_auto_auth_token = true
+   }
+   listener "tcp" {
+        address = "127.0.0.1:8100"


travisn · 2024-03-14T20:28:35Z

design/ceph/object/ceph-rgw-k8s-sa-authentication.md

+          VAULT_AUTH_KUBERNETES_ROLE: rook-ceph
+
+---
+kind: ConfigMap


Does the user create this configmap, or does Rook generate it? We need to be clear about what the admin needs to create, and what Rook will generate.

IMO, since there are a lot of options, I prefer that the user configure the configmap rather than the Rook operator. We can provide a sample (minimal) configuration, as above.

The design doc for supporting service account authentication for RGW while configuring with Vault. The OSD encryption already support it. Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>

thotz added docs skip-ci labels May 24, 2022

thotz requested review from leseb, travisn and BlaineEXE May 24, 2022 19:05

thotz mentioned this pull request May 24, 2022

object: supporting service account authencation for RGW with vault #10320

Closed

travisn requested changes May 24, 2022

subhamkrai reviewed May 25, 2022

thotz force-pushed the design-rgw-k8s-sa-authentication branch from a5e8b68 to 6ccfe07 Compare May 25, 2022 18:09

thotz requested a review from travisn May 25, 2022 18:09

BlaineEXE requested changes May 25, 2022

thotz mentioned this pull request May 26, 2022

docs: design doc for supporting aws sse:s3 in RGW #10318

Merged

7 tasks

thotz requested a review from BlaineEXE May 26, 2022 17:18

BlaineEXE requested changes Jun 7, 2022

thotz requested a review from BlaineEXE June 13, 2022 07:39

travisn requested changes Jun 17, 2022

travisn added this to In progress in v1.10 via automation Jul 12, 2022

github-actions bot added the stale Labeled by the stale bot label Aug 5, 2022

thotz added keepalive and removed stale Labeled by the stale bot labels Aug 9, 2022

travisn removed this from In progress in v1.10 Oct 18, 2022

travisn added this to In progress in v1.11 via automation Oct 18, 2022

travisn removed this from In progress in v1.11 Oct 18, 2022

thotz closed this Aug 9, 2023

thotz reopened this Mar 12, 2024

thotz force-pushed the design-rgw-k8s-sa-authentication branch from 6ccfe07 to 69d80b4 Compare March 12, 2024 14:53

thotz requested review from travisn and subhamkrai March 12, 2024 14:53

travisn requested changes Mar 12, 2024

thotz force-pushed the design-rgw-k8s-sa-authentication branch from 69d80b4 to c0e4582 Compare March 13, 2024 16:42

thotz requested a review from travisn March 14, 2024 04:56

thotz mentioned this pull request Mar 14, 2024

object: add support for service account authentication in vault #13934

Open

6 tasks

travisn requested changes Mar 14, 2024

docs: design doc for supporting sa authencation for RGW with vault

8444479

The design doc for supporting service account authentication for RGW while configuring with Vault. The OSD encryption already support it. Signed-off-by: Jiffin Tony Thottan <thottanjiffin@gmail.com>

thotz force-pushed the design-rgw-k8s-sa-authentication branch from c0e4582 to 8444479 Compare March 15, 2024 16:42

thotz requested a review from travisn March 15, 2024 16:42



BlaineEXE Jul 1, 2022 • edited

BlaineEXE Jul 1, 2022 • edited

thotz Jul 5, 2022 • edited
