Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rgw: implement support for authentication using keystone for s3 and swift #13807

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

jklippel
Copy link
Contributor

This PR implements the authentication using keystone of swift and s3 as described in the already existing design document swift-and-keystone-integration (see #9444 for the design request and #9088 for the general issue).

The design document describes another feature namely using k8s custom resources to create swift sub-users which will be implemented in a another PR.

Resolves #9088

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next minor release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

@jklippel jklippel changed the title Feature/swift and keystone rgw: implement support for authentication using keystone for s3 and swift Feb 22, 2024
export TEST_HELM_PATH=/tmp/rook-tests-scripts-helm/linux-amd64/helm
export TEST_HELM_PATH=/tmp/rook-tests-scripts-helm/helm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious about this change. Was this fixing a CI bug, or is this related to a change here?
@subhamkrai do you have insights here?

Copy link

@Lykos153 Lykos153 Feb 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After running helm.sh up, I have:

$ ls /tmp/rook-tests-scripts-helm/
helm  helm.tar.gz  LICENSE  README.md

I suspect the layout in the Helm archive changed (again). Last time appears to have been in 2020, when --strip-components 1 was introduced in 27c7146. Maybe this needs to be reverted now instead of changing the path in the docs, but either way export TEST_HELM_PATH=/tmp/rook-tests-scripts-helm/linux-amd64/helm did not work for me.

Copy link
Member

@BlaineEXE BlaineEXE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm impressed how well this PR is put together. There are detailed docs, full test files, and good comments.

I've also really only started scratching the surface of this review. There is a lot here, and it will take time.

One question related to #13808 : Is one of these PRs needed as a basis for the other?

Comment on lines 12 to 13
!!! note
This sample requires *at least 3 bluestore OSDs*, with each OSD located on a *different node*.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this note makes the most sense immediately before the yaml below. I was confused about what "sample" meant here until I read and re-read this a couple times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines 17 to 18
!!! note
This example assumes that there already is an existing OpenStack Keystone instance ready to use for authentication.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to include this in the same note which will be moved below, to just above the yaml example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

title: Object Store with Keystone and Swift
---

You can configure the Swift API and the Keystone integration of Ceph RGW natively via the Object Store CRD, which allows native integration of the Rook operated Ceph RGW into [OpenStack](https://www.openstack.org/) clouds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a little pedantic, but when we had the Ceph foundation's doc expert review our docs, the biggest suggestion he gave was for us to avoid use of "you," "our," "we," etc.

He suggested that we use "imperative" language in our docs. So this doc line should be shifted around to read something like this:

"Ceph RGW can integrate natively with the Swift API and Keystone via the CephObjectStore CRD. This allows native integration of Rook-operated Ceph RGWs into OpenStack"

Please adjust the rest of the document as well. Doing a file search for "you" is usually sufficient to find which lines should be adjusted.

Copy link
Contributor Author

@jklippel jklippel Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c6495fc

This is not pedantic but good advice! Thanks. I rephrased the document accordingly.
I would suggest to add a section on the imperative language in the Contributing/documentation.md file.

Alternatively, if you have an existing Ceph cluster with Rados Gateways, see the
[external section](#connect-to-an-external-object-store) to consume it from Rook.
Most commonly, the object store will be configured in Kubernetes by Rook.
Alternatively, if you have an existing Ceph cluster with [Rados Gateways](https://docs.ceph.com/en/quincy/radosgw/index.html), see the [external section](#connect-to-an-external-object-store) to consume it from Rook.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please adjust based on "you" usage here.

Copy link
Contributor Author

@jklippel jklippel Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@BlaineEXE BlaineEXE Mar 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see that commit as part of the PR any more.

Also, as a preparation, we will ask you to squash all the commits before final merge. It's fine to keep each commit separate while we are working, however.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems as if this commit has already been squased already. Trying to cherry-pick it I ended up with an empty commit.
If you could please do a quick check whether the "you" based language is not gone.

@@ -1440,6 +1440,16 @@ type ObjectStoreSpec struct {
// +nullable
Gateway GatewaySpec `json:"gateway"`

// The protocol specification
// +optional
// +nullable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not nullable because not a pointer to a struct.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 556765c


// The authentication configuration
// +optional
// +nullable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not nullable because not a pointer to a struct.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 556765c

Comment on lines 1617 to 1642
Enabled *bool `json:"enabled,omitempty"`
AuthUseKeystone *bool `json:"authUseKeystone,omitempty"`
}

// SwiftSpec represents Ceph Object Store specification for the Swift API
type SwiftSpec struct {
AccountInUrl *bool `json:"accountInUrl,omitempty"`
UrlPrefix *string `json:"urlPrefix,omitempty"`
VersioningEnabled *bool `json:"versioningEnabled,omitempty"`
}

// AuthSpec represents the authentication protocol configuration of a Ceph Object Store Gateway
type AuthSpec struct {
// +optional
// +nullable
Keystone *KeystoneSpec `json:"keystone,omitempty"`
}

// KeystoneSpec represents the Keystone authentication configuration of a Ceph Object Store Gateway
type KeystoneSpec struct {
Url string `json:"url"`
ServiceUserSecretName string `json:"serviceUserSecretName"`
AcceptedRoles []string `json:"acceptedRoles"`
ImplicitTenants ImplicitTenantSetting `json:"implicitTenants,omitempty"`
TokenCacheSize *int `json:"tokenCacheSize,omitempty"`
RevocationInterval *int `json:"revocationInterval,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add godoc comments to all fields, and mark pointer types as nullable.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in 556765c

@@ -1763,6 +1826,7 @@ type ObjectUserQuotaSpec struct {
MaxObjects *int64 `json:"maxObjects,omitempty"`
}

// CephObjectRealm represents a Ceph Object Store Gateway Realm
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this line duplicated here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks like a rebase mess up. Will fix.

Copy link

@Lykos153 Lykos153 Mar 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// https://docs.ceph.com/docs/master/rados/configuration/ceph-conf/#monitor-configuration-database
// https://docs.ceph.com/en/latest/rados/configuration/ceph-conf/#monitor-configuration-database
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were these links broken?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, docs/master was broken when we tried to access them. I see it now redirects to en/latest. Shall we revert this change?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the old link is working, I think it's best to revert this. If it's still broken, I'm not too concerned.
But most of our links are in this format, and if they all break, we would want to fix them all at once I think.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pkg/operator/ceph/object/config.go Outdated Show resolved Hide resolved
Comment on lines 196 to 177
if ks := rgwConfig.Auth.Keystone; ks != nil {

logger.Info("Configuring authentication with keystone")

configOptions["rgw_keystone_url"] = ks.Url
configOptions["rgw_keystone_accepted_roles"] = strings.Join(ks.AcceptedRoles, ",")
if ks.ImplicitTenants != "" {

// only four values are valid here (swift, s3, true and false)
//
// https://docs.ceph.com/en/latest/radosgw/keystone/#integrating-with-openstack-keystone
if strings.ToLower(string(ks.ImplicitTenants)) != "true" &&
strings.ToLower(string(ks.ImplicitTenants)) != "false" &&
strings.ToLower(string(ks.ImplicitTenants)) != "swift" &&
strings.ToLower(string(ks.ImplicitTenants)) != "s3" {

errString := fmt.Sprintf("ImplicitTenantSetting can only be 'swift', 's3', 'true' or 'false', not %s", string(ks.ImplicitTenants))
logger.Errorf(errString)
return errors.New(errString)

}

configOptions["rgw_keystone_implicit_tenants"] = string(ks.ImplicitTenants)

}
if ks.TokenCacheSize != nil {
configOptions["rgw_keystone_token_cache_size"] = fmt.Sprintf("%d", *ks.TokenCacheSize)
}
if rgwConfig.KeystoneSecret == nil {
logger.Error("Cannot find keystone secret!")
return errors.New("Cannot find keystone secret")
}

var err error
configOptions, err = mapKeystoneSecretToConfig(configOptions, rgwConfig.KeystoneSecret)
if err != nil {
logger.Infof("error mapping keystone secret %s to config: %s", rgwConfig.KeystoneSecret.Name, err)
return err
}

} else {
logger.Info("Authentication with keystone is disabled")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this block seems as a good candidate for a proper function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// test that a second object store can be created (and deleted) while the first exists
s.T().Run("run a second object store", func(t *testing.T) {
otherStoreName := "other-" + storeName
// The lite e2e test is perfect, as it only creates a cluster, checks that it is healthy,
// and then deletes it.
deleteStore := true
runObjectE2ETestLite(t, helper, k8sh, installer, namespace, otherStoreName, 1, deleteStore, tlsEnable)
// TODO: Why ist swiftAndKeystone always set to false?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please, address the TODO

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0a9fdf9
(These TODOs were already dealt with and are now removed.)

tests/integration/ceph_object_test.go Outdated Show resolved Hide resolved
tests/integration/ceph_object_test.go Outdated Show resolved Hide resolved
Copy link

mergify bot commented Feb 28, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @jklippel please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

Copy link
Member

@BlaineEXE BlaineEXE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From an upstream maintainer perspective, I am really impressed by the implementation and thoroughness that I see in this PR. At the same time, I have some concerns about how this feature fits into our community, and I want to acknowledge and address those before we get too far along with reviewing and merging this code.

There do seem to be users that are interested in this feature, but that interest also seems pretty small. This is evidenced by the fact that the design doc for this feature is over 2 years old at this point, and the person who created it didn't see fit to continue to complete the work as designed.

I am concerned that this may be a feature that gets added but then doesn't have anyone sticking around to maintain and troubleshoot for other Rook users. As it is, none of the current Rook maintainers have much OpenStack experience that would allow us to (or motivate us to) keep this feature up-to-date.

While we could declare this feature "experimental," I am still concerned about the possibility of it becoming a long-term maintenance burden that could lead us to ultimately remove it down the road.

@jklippel and @Lykos153 could you tell us more about what your and your company's positions are on feature maintenance for Keystone/Swift integration? I'd like to hear more about what is driving you implement this functionality and what the time scale is for how long it will be used in production. It's also fine if you want to message me and @travisn directly on slack about this since I understand it could be slightly sensitive.

@jklippel
Copy link
Contributor Author

jklippel commented Mar 1, 2024

@jklippel and @Lykos153 could you tell us more about what your and your company's positions are on feature maintenance for Keystone/Swift integration? I'd like to hear more about what is driving you implement this functionality and what the time scale is for how long it will be used in production. It's also fine if you want to message me and @travisn directly on slack about this since I understand it could be slightly sensitive.

@BlaineEXE @Lykos153 and I (@jklippel) just sat together and discussed the points you brought up.
All valid points that we would like to address:

  1. Pretty small user interest / slow progress over the time of 2 years
    We agree that user interest as provided on the design document seems to be low.
    We see however that questions were raised occasionally as to when this feature will be implemented.
    We are both from different companies and joined forces to implement this feature as both of our companies independent of each other saw the need to have this feature implemented for our products and within the Yaook oecosystem.
    Unfortunately time and other factors slowed down and even stopped progress on this matter for a while, but both of our companies/products still see the need for this feature to integrate rook with keystone authentication in our OpenStack clouds. The person who created it is still involved in the company and still supports the implementation of this feature. He was also involved in bringing our companies together to form a joined effort.

  2. regarding your concerns on low interest but no one supporting the feature
    We and both of our companies are committed to support this feature as a contributor to this project with regards to occasional questions/problems in the issue tracker and with regards to adapting code to future changes if necessary.

  3. regarding the long time maintenance burden
    Both of our companies are committed to help you in this regards. Additionally the functional changes are not that big. The heavy lifting (if any) is done by Ceph. The OpenStack knowledge to support this feature is nearly limited to the integration tests and the tests only involve the OpenStack keystone component.
    That said, we want to support and use these features in our products and will be here to maintain them if necessary. Just mention us (@Lykos153 and me @jklippel) in any issue if you think we should have a more active involvement in an issue.

Does this suffice as commitment to the cause from our side or do you need more/something else entirely?

@jklippel
Copy link
Contributor Author

jklippel commented Mar 1, 2024

We created a list of TODO-issues in "our" forked repository to handle the mentioned issues. We linked back to this issue, so do not mind the long list of issues listed here.

@horazont
Copy link
Contributor

horazont commented Mar 1, 2024

Hi @travisn, @BlaineEXE,

I'm working at CLOUD&HEAT Technologies GmbH, as team lead of @Lykos153 and former rook contributor (#5483), and I'd like to add a few bits from our perspective.

We have been using Ceph RadosGW with OpenStack in production since 2018 (at least) and we intend to migrate the data from our main Ceph cluster (based on chef-ceph. Yes.) to rook. The lack of RadosGW/OpenStack integration has been a partial blocker (we are currently migrating block storage only because of this), so we are committed to getting this implemented. Likewise, we need this to work in the future, as our object store offering will depend on it.

With all data and information available to me to date, I expect that we offer OpenStack+Rook-based Object Store for the lifetime of the company, however long that may be ("the foreseeable future"). That also implies that we can invest some resources in keeping this feature going, if there turn out to be issues with it in the future.

Hope that helps :).

P.S.: This is my last day before a week of PTO, so if you're pinging me specifically, I may only reply from March 11th onward.

Copy link

mergify bot commented Mar 4, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @jklippel please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

@jklippel jklippel force-pushed the feature/swift-and-keystone branch 2 times, most recently from fa106d4 to c6495fc Compare March 5, 2024 15:16
@BlaineEXE
Copy link
Member

Thanks @jklippel and @horazont! We will continue reviewing the PR with intent to merge it. Due to the size and because we are gearing up for kubecon, I think it could be several weeks until we all get a chance to give feedback.

Copy link

mergify bot commented Apr 15, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @jklippel please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

@jklippel jklippel force-pushed the feature/swift-and-keystone branch 4 times, most recently from ecaa4b0 to 5b8b850 Compare April 18, 2024 12:13
Copy link

mergify bot commented Apr 18, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @jklippel please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

@jklippel jklippel force-pushed the feature/swift-and-keystone branch 2 times, most recently from e188a29 to 5b8b850 Compare April 18, 2024 12:19
Copy link

mergify bot commented Apr 18, 2024

This pull request has merge conflicts that must be resolved before it can be merged. @jklippel please rebase it. https://rook.io/docs/rook/latest/Contributing/development-flow/#updating-your-fork

Lykos153 and others added 7 commits April 18, 2024 14:20
The design document for the swift and keystone integration contains
incorrectly numbered annotations.

Thie commit fixes the numeration of the annotations. Additionally
some minor wording changes are made.

Signed-off-by: Silvio Ankermann <silvio.ankermann@cloudandheat.com>
For the specification see:
<https://github.com/rook/rook/blob/master/design/ceph/object/swift-and-keystone-integration.md>

Signed-off-by: Sebastian Riese <sebastian.riese@cloudandheat.com>
Signed-off-by: Silvio Ankermann <silvio.ankermann@cloudandheat.com>
* The parameter lists of the API call have changes, as parameters
  ignored by the RGW Admin Ops API are no longer serialized,
  therefore the mock has to be adapted.

* There is now validation for the user keys that are passed to
  the User get API, therefore things failed when we had empty
  keys in our User proxy object.

Co-authored-by: Jan Klippel <jan.klippel@uhurutec.com>
Signed-off-by: Sebastian Riese <sebastian.riese@cloudandheat.com>
Signed-off-by: Silvio Ankermann <silvio.ankermann@cloudandheat.com>
For the specification see:
<https://github.com/rook/rook/blob/master/design/ceph/object/swift-and-keystone-integration.md>

Co-authored-by: Jan Klippel <jan.klippel@uhurutec.com>
Signed-off-by: Sebastian Riese <sebastian.riese@cloudandheat.com>
Signed-off-by: Silvio Ankermann <silvio.ankermann@cloudandheat.com>
Minikube decides CPU cores and memory automatically based on the
available resources on the machine which may be insufficient to
run rook. This commit adds an environment variable to add arbitrary
arguments to the minikube command, so both can be specified if desired.

Signed-off-by: Silvio Ankermann <silvio.ankermann@cloudandheat.com>
The new integration of swift or s3 and keystone support by rook
does not have any integration tests yet.

This commit introduces integration tests for swift and keystone.
The tests are done against a minimal keystone setup
(keystone container image from Yaook-project (https://yaook.cloud),
sqlite as database backend, cert-manager and trust-manager for test
certificate setup).
To prevent hardcoded credentials, passwords are generated by the
tests. The integration tests use the openstack client (keystone- and
swift-functionality) (https://docs.openstack.org/python-openstackclient/
latest/). This was a concious design decision to use client tooling as
close as possible to the end user instead of using other go-libraries
(such as gophercloud).

Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
Currently there is no documentation on the use of Swift to access
an object store as well as the use of OpenStack keystone for authentication.

This commit adds documentation on the use of Swift and OpenStack keystone,
as well as CRD-related documentation and an example setup.

Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
This commit introduces integration tests for s3 and keystone. The tests
are run against the same minimal keystone setup that the tests for swift
and keystone use.

The integration tests use the aws s3 client to use client tooling as
close as possible to the end user instead of using other go-libraries.

Signed-off-by: Silvio Ankermann <silvio.ankermann@cloudandheat.com>
@jklippel jklippel force-pushed the feature/swift-and-keystone branch 2 times, most recently from 176a755 to 30ece5d Compare April 18, 2024 14:13
Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
…gration

Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
Lykos153 and others added 4 commits April 29, 2024 20:10
Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
…gration

Signed-off-by: Jan Klippel <jan.klippel@uhurutec.com>
@jklippel jklippel force-pushed the feature/swift-and-keystone branch from 1ced5e8 to 32c00ca Compare May 3, 2024 18:07
implicitTenants:
description: Create new users in their own tenants of the same name. Possible values are true, false, swift and s3. The latter have the effect of splitting the identity space such that only the indicated protocol will use implicit tenants.
type: string
revocationInterval:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see that this value is used anywhere to configure rgw. Also in my CEPH env I cannot set this option:

Error EINVAL: unrecognized config option 'rgw_keystone_revocation_interval'

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! This parameter was removed sometime in the past (last seen in the pacific documentation).
We will remove it.
Thanks again!

Copy link
Member

@BlaineEXE BlaineEXE May 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume there isn't a replacement config either? I don't see anything similar in current Ceph docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was removed with this commit ceph/ceph@6b4c985
If I understand the commit message correctly there was no need for a replacement.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support Swift API and Keystone Integration
8 participants