ceph: retry object health check if creation fails #8708

Merged

BlaineEXE
Member

If the CephObjectStore health checker fails to be created, return a
reconcile failure so that the reconcile will be run again and Rook will
retry creating the health checker. This also means that Rook will not
list the CephObjectStore as ready if the health checker can't be
started.

Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
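
The behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `startHealthChecker` and `reconcileObjectStore` are hypothetical names, and the boolean `endpointReachable` stands in for whatever actually makes checker creation fail. The sketch assumes the usual controller pattern in which a reconcile that returns an error is run again later.

```go
package main

import (
	"errors"
	"fmt"
)

// startHealthChecker is a hypothetical constructor. In this sketch it fails
// whenever the object store endpoint is not reachable.
func startHealthChecker(endpointReachable bool) error {
	if !endpointReachable {
		return errors.New("endpoint unreachable")
	}
	return nil
}

// reconcileObjectStore is a hypothetical, simplified reconcile step.
// It returns an error when the health checker cannot be created, so the
// caller retries, and it reports the store as ready only on success.
func reconcileObjectStore(endpointReachable bool) (bool, error) {
	if err := startHealthChecker(endpointReachable); err != nil {
		return false, fmt.Errorf("failed to start object store health checker: %w", err)
	}
	return true, nil
}

func main() {
	// Simulate two reconcile attempts: the first fails, the retry succeeds.
	for attempt, reachable := range []bool{false, true} {
		ready, err := reconcileObjectStore(reachable)
		fmt.Printf("attempt %d: ready=%v err=%v\n", attempt+1, ready, err)
	}
}
```

The point of the design is that a failed checker creation is surfaced as a reconcile failure instead of being logged and ignored, so readiness is never reported without a running checker.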

@BlaineEXE BlaineEXE added the object Object protocol - S3 label Sep 13, 2021
@mergify mergify bot added the ceph main ceph tag label Sep 13, 2021
Member

@leseb leseb left a comment

I'm fine with the change, but other controllers like pool and cephfs-mirror need something similar too.

pkg/operator/ceph/object/controller_test.go (outdated, resolved)
Member

@travisn travisn left a comment

Just a nit.

pkg/operator/ceph/object/rgw.go (outdated, resolved)
@mergify

mergify bot commented Sep 17, 2021

This pull request has merge conflicts that must be resolved before it can be merged. @BlaineEXE please rebase it. https://rook.io/docs/rook/master/development-flow.html#updating-your-fork

@BlaineEXE BlaineEXE force-pushed the health-checker-failure-should-fail-reconcile branch 2 times, most recently from 6f9eb5e to 7b4f2e6 Compare September 17, 2021 21:33
@BlaineEXE BlaineEXE changed the title ceph: retry object health check if creation fails [WIP] ceph: retry object health check if creation fails Sep 17, 2021
@BlaineEXE
Member Author

BlaineEXE commented Sep 17, 2021

Need to update unit tests still...

done

@BlaineEXE BlaineEXE force-pushed the health-checker-failure-should-fail-reconcile branch from 7b4f2e6 to 7e8cfcf Compare September 17, 2021 22:22
@BlaineEXE BlaineEXE force-pushed the health-checker-failure-should-fail-reconcile branch from 7e8cfcf to 5383ba2 Compare September 17, 2021 22:24
TypeMeta: controllerTypeMeta,
}
objectStore.Spec.Gateway.Port = 80
setupNewEnvironment := func(additionalObjects ...runtime.Object) *ReconcileCephObjectStore {
Member Author

Add this function to set up a new ReconcileCephObjectStore test environment so that each test is independent.

res, err := r.Reconcile(ctx, req)
assert.NoError(t, err)
assert.True(t, res.Requeue)
})

t.Run("success - object store is running", func(t *testing.T) {
// set up an environment that has a ready ceph cluster, and return the reconciler for it
setupEnvironmentWithReadyCephCluster := func() *ReconcileCephObjectStore {
Member Author

Add this additional function to help set up ReconcileCephObjectStore test environments where the CephCluster should be ready, allowing the reconcile to proceed. It is used in both the previously existing success test and the newly added test.
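
As a rough illustration of the two setup helpers discussed in these review comments, the sketch below builds a fresh environment per call and layers a "ready cluster" variant on top. Everything here is a simplified assumption: `fakeReconciler`, `newEnvironment`, and `newEnvironmentWithReadyCluster` are hypothetical stand-ins, and plain strings take the place of the `runtime.Object` values the real test passes in.

```go
package main

import "fmt"

// fakeReconciler is a hypothetical stand-in for ReconcileCephObjectStore.
type fakeReconciler struct {
	objects      []string // names of the objects known to the fake client
	clusterReady bool     // whether a ready CephCluster exists
}

// newEnvironment builds a fresh reconciler seeded with a base object store
// plus any additional objects. Each call allocates its own slice, so tests
// that mutate one environment cannot affect another.
func newEnvironment(additionalObjects ...string) *fakeReconciler {
	objs := append([]string{"objectstore/my-store"}, additionalObjects...)
	return &fakeReconciler{objects: objs}
}

// newEnvironmentWithReadyCluster wraps newEnvironment and adds a ready
// cluster, which in this sketch is what lets a reconcile proceed.
func newEnvironmentWithReadyCluster() *fakeReconciler {
	r := newEnvironment("cephcluster/my-cluster")
	r.clusterReady = true
	return r
}

func main() {
	a := newEnvironmentWithReadyCluster()
	b := newEnvironment()

	// Mutating one environment leaves the other untouched.
	a.objects = append(a.objects, "secret/extra")
	fmt.Println(len(a.objects), a.clusterReady)
	fmt.Println(len(b.objects), b.clusterReady)
}
```

Building the environment inside each subtest, instead of sharing one reconciler across subtests, keeps a failure in one case from leaking state into the next.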

@BlaineEXE BlaineEXE changed the title [WIP] ceph: retry object health check if creation fails ceph: retry object health check if creation fails Sep 17, 2021
Member

@travisn travisn left a comment

Nice testing!

Member

@leseb leseb left a comment

LGTM. As a follow-up, we need to take the same approach for all health checkers: ceph status, osd status, pool mirror, fs mirror. @BlaineEXE are you planning this?

@BlaineEXE
Member Author

> LGTM. As a follow-up, we need to take the same approach for all health checkers: ceph status, osd status, pool mirror, fs mirror. @BlaineEXE are you planning this?

I didn't notice any place where this would be a quick fix for any of the other health checkers. I think that's because RGW relies on an HTTP API, whereas the rest use the Ceph CLI, IIUC.

@BlaineEXE BlaineEXE merged commit acbda93 into rook:master Sep 20, 2021
@BlaineEXE BlaineEXE deleted the health-checker-failure-should-fail-reconcile branch September 20, 2021 16:27
@BlaineEXE
Member Author

I think I do want to backport this to 1.7.

travisn added a commit that referenced this pull request Sep 21, 2021
ceph: retry object health check if creation fails (backport #8708)