
ceph: initialize rbd block pool after creation #8923

Merged
merged 1 commit on Oct 6, 2021

Conversation


@Rakshith-R Rakshith-R commented Oct 6, 2021

This is done in order to prevent deadlock when parallel
PVC create requests are issued on a new uninitialized
rbd block pool due to https://tracker.ceph.com/issues/52537.

Fixes: #8696

Signed-off-by: Rakshith R <rar@redhat.com>

Verified by testing it locally:

[rakshith@fedora ceph-csi]$ ./scripts/rook.sh create-block-pool
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   525  100   525    0     0   1299      0 --:--:-- --:--:-- --:--:--  1299
cephblockpool.ceph.rook.io/new-pool created
Checking RBD (new-pool) stats... 0s
RBD (new-pool) is successfully created...
[rakshith@fedora ceph-csi]$ k apply -f rbd/storageclass-test.yaml 
storageclass.storage.k8s.io/rook-ceph-block-1 created
[rakshith@fedora ceph-csi]$ k apply -f rbd/pvc.yaml 
persistentvolumeclaim/rbd-1 created
persistentvolumeclaim/rbd-2 created
persistentvolumeclaim/rbd-3 created
persistentvolumeclaim/rbd-4 created
[rakshith@fedora ceph-csi]$ k get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS        AGE
rbd-1   Bound    pvc-81fbaf0e-aba8-4f18-8e4b-aeb18fb17527   1Gi        RWO            rook-ceph-block-1   34s
rbd-2   Bound    pvc-7e599340-1b14-42f7-9517-bb9733ded3f2   1Gi        RWO            rook-ceph-block-1   33s
rbd-3   Bound    pvc-ebb4d196-ee25-4369-b908-c91908324c9f   1Gi        RWO            rook-ceph-block-1   33s
rbd-4   Bound    pvc-e513679e-4c37-4236-84f0-ce0d044f87fe   1Gi        RWO            rook-ceph-block-1   33s

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@mergify mergify bot added the ceph main ceph tag label Oct 6, 2021
pkg/daemon/ceph/client/pool.go — two outdated review threads, resolved
@@ -357,6 +357,12 @@ func CreateECPoolForApp(context *clusterd.Context, clusterInfo *ClusterInfo, poo
return errors.Wrapf(err, "failed to create EC pool %s. %s", poolName, string(output))
}

args = []string{"pool", "init", poolName}
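The hunk above adds an explicit initialization step after the pool is created, by invoking the real `rbd pool init <pool>` CLI command. As a rough stand-alone sketch (not Rook's actual command plumbing; initRBDPool and its error message are hypothetical), the same argument list could be built and executed like this:

```go
package main

import (
	"fmt"
	"os/exec"
)

// buildPoolInitArgs mirrors the argument list added in this PR:
// "rbd pool init <pool>" is run right after pool creation.
func buildPoolInitArgs(poolName string) []string {
	return []string{"pool", "init", poolName}
}

// initRBDPool is a hypothetical stand-alone helper (Rook routes
// commands through its own exec layer) that shells out to the
// rbd CLI with those args.
func initRBDPool(poolName string) error {
	args := buildPoolInitArgs(poolName)
	out, err := exec.Command("rbd", args...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("failed to initialize rbd pool %q: %v (%s)", poolName, err, out)
	}
	return nil
}

func main() {
	fmt.Println(buildPoolInitArgs("new-pool")) // prints: [pool init new-pool]
}
```

Because `rbd pool init` is idempotent (confirmed below), running this unconditionally on every reconcile is safe.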
Member

Is pool init idempotent?

Member Author

AFAIK and according to my tests, pool init is idempotent.

I'll let @idryomov confirm this (and please review the PR too, thanks).


Yes, it is idempotent.

@Rakshith-R force-pushed the add-rbdpool-init branch 2 times, most recently from 668d6b4 to 1724211 on October 6, 2021 at 09:47
pkg/daemon/ceph/client/pool.go — two outdated review threads, resolved
@Rakshith-R (Member Author)

Moved it to a higher layer, since adding a check for rbd through the function calls would make it complex.

@idryomov Shall I add a TODO to remove the command once the fix lands in the ceph-csi build, or can it be considered good practice to initialize the rbd block pool after creation in Rook?

@Rakshith-R Rakshith-R requested a review from leseb October 6, 2021 12:31

@leseb leseb left a comment


It would be nice to add unit tests just like you did before.


idryomov commented Oct 6, 2021

@idryomov Shall I add a TODO to remove the command once the fix lands in the ceph-csi build, or can it be considered good practice to initialize the rbd block pool after creation in Rook?

It is good to have this. This initialization can't be skipped -- it happens one way or the other anyway. With an explicit rbd pool init it is offloaded from the first image creation request, making all image creation requests the same and avoiding a couple of potential race conditions.
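The pattern @idryomov describes can be illustrated with a small Go sketch (this is neither Rook nor Ceph code, just a model of the race: one-time pool initialization that would otherwise be triggered lazily by whichever image-creation request arrives first):

```go
package main

import (
	"fmt"
	"sync"
)

// pool models an RBD pool whose one-time setup would otherwise be
// performed lazily by the first image creation request.
type pool struct {
	initOnce sync.Once
	inited   bool
}

func (p *pool) init() { p.initOnce.Do(func() { p.inited = true }) }

// createImage simulates an image creation request. Without an explicit
// up-front init, concurrent requests all hit this initialization step
// at once -- the situation the PR avoids.
func (p *pool) createImage(name string) string {
	p.init()
	return "created " + name
}

func main() {
	p := &pool{}
	// Explicit init up front, as the PR does, so every concurrent
	// create request takes the same already-initialized path.
	p.init()

	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			p.createImage(fmt.Sprintf("rbd-%d", i))
		}(i)
	}
	wg.Wait()
	fmt.Println("pool initialized:", p.inited) // prints: pool initialized: true
}
```

The four goroutines mirror the four parallel PVC requests in the local test above; with the pool pre-initialized, none of them has to race on (or wait for) first-use setup.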


@idryomov idryomov left a comment


LGTM from rbd perspective

@Rakshith-R (Member Author)

It would be nice to add unit tests just like you did before.

Added them, PTAL.


@leseb leseb left a comment


One more thing: logging is important, so let's log before and after the command has run, e.g. "initializing rbd pool %q" and "successfully initialized rbd pool %q". Thanks!
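The requested before/after logging pattern, sketched with the standard library (runRBDPoolInit is a hypothetical stand-in for the actual command execution; Rook uses its own logger rather than the log package):

```go
package main

import (
	"log"
	"os"
)

// initPool logs once before the command runs and once after it
// succeeds, so operators can tell a hung init from one that never
// started.
func initPool(poolName string, runRBDPoolInit func(string) error) error {
	log.Printf("initializing rbd pool %q", poolName)
	if err := runRBDPoolInit(poolName); err != nil {
		return err
	}
	log.Printf("successfully initialized rbd pool %q", poolName)
	return nil
}

func main() {
	log.SetOutput(os.Stdout)
	log.SetFlags(0) // drop timestamps for a clean demo
	_ = initPool("new-pool", func(string) error { return nil })
	// prints:
	// initializing rbd pool "new-pool"
	// successfully initialized rbd pool "new-pool"
}
```

On failure only the first line appears, which is exactly the breadcrumb the reviewer is asking for.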

pkg/operator/ceph/pool/controller.go — two outdated review threads, resolved
This is done in order to prevent deadlock when parallel
PVC create requests are issued on a new uninitialized
rbd block pool due to https://tracker.ceph.com/issues/52537.

Fixes: rook#8696

Signed-off-by: Rakshith R <rar@redhat.com>
Labels
ceph main ceph tag
Development

Successfully merging this pull request may close these issues.

New BlockPool / SC + Parallel RBD Volume Creation hangs and fails
4 participants