EC filesystem fails to deploy with latest release of Rook-Ceph #8210
Comments
This looks expected. Please check https://docs.ceph.com/en/latest/cephfs/createfs/#creating-pools (last point).
@sp98 So it is not enough to follow this documentation https://rook.io/docs/rook/v1.6/ceph-filesystem-crd.html#erasure-coded to deploy a working EC filesystem. Maybe some guidance for solving the problem with the metadata pool should be added.
This is a result of #8130; a couple of users have also reported this in Slack.
The workaround would be to use Rook v1.6.5 until this is fixed, since v1.6.6 removed the …
@batrick is there another way apart from using …
@subhamkrai do not use EC pools for the default (primary) data pool. Add a directory layout on root for a secondary EC data pool instead.
@batrick can you share more details, or maybe a doc link where I can get some ideas about directory layouts and primary/secondary pools?
Create an erasure coded pool and then: https://docs.ceph.com/en/latest/cephfs/file-layouts/#adding-a-data-pool-to-the-file-system
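To make that concrete, a minimal sketch of the secondary-EC-data-pool approach, assuming a filesystem named `myfs`, an EC pool named `myfs-ec-data`, and a client mount at `/mnt/cephfs` (all names are illustrative, not from this thread):

```shell
# Attach the EC pool to the filesystem as an additional (secondary) data pool.
ceph fs add_data_pool myfs myfs-ec-data

# On a client with the filesystem mounted, point the directory's layout at the
# EC pool; new files created under it will be stored there.
setfattr -n ceph.dir.layout.pool -v myfs-ec-data /mnt/cephfs
```

These commands require a running Ceph cluster and a mounted client, so treat them as a sketch of the steps rather than a copy-paste recipe.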
@travisn will just adding this link to the doc work?
First, let's update the Filesystem EC example to have two data pools:

```yaml
spec:
  metadataPool:
    replicated:
      size: 3
  dataPools:
    - replicated:
        size: 3
    - erasureCoded:
        dataChunks: 2
        codingChunks: 1
  metadataServer:
    activeCount: 1
    activeStandby: true
```

This way Rook will create the pools and add them to the filesystem. Then I understand the user will need to go ahead and provision their cephfs volume, and inside the volume set this attribute, where …
@Madhu-1 Or does the CSI driver allow setting an attribute like this for the directory to use a pool?
CSI sets the pool layout when creating the cephfs subvolume: https://github.com/ceph/ceph-csi/blob/d85304c7c22de19e20efbcb34140f2c5b130bb70/internal/cephfs/volume.go#L175
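For illustration, a hedged sketch of a CephFS StorageClass that points provisioned subvolumes at the EC data pool via the ceph-csi `pool` parameter; the names (`rook-ceph`, `myfs`, `myfs-ec-data0`, secret names) are placeholders, not values confirmed in this thread:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs-ec
provisioner: rook-ceph.cephfs.csi.ceph.com
parameters:
  clusterID: rook-ceph      # namespace of the Rook cluster (assumption)
  fsName: myfs              # CephFS filesystem name (assumption)
  pool: myfs-ec-data0       # EC data pool the subvolume layout should use
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-cephfs-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-cephfs-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
reclaimPolicy: Delete
```

This way the replicated pool stays the filesystem's default data pool while provisioned volumes land on the EC pool.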
When creating an EC fs, create a replicated pool as the primary pool and the EC pool as a secondary pool; creating an EC pool as primary is not encouraged and will lead to failure. Also, after this the user needs to add a directory layout on root for the secondary EC data pool. Closes: rook#8210 Signed-off-by: subhamkrai <srai@redhat.com>
Specifying a …
After applying both the CephFilesystem and the StorageClass, I get the error:
The cephfs exists, but with error:
...as does the sc:
I reviewed the draft guidance in PR #8452 (and this thread). I'm not sure what else is missing here.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.
When creating an EC fs, create a replicated pool as the primary pool and the EC pool as a secondary pool; creating an EC pool as primary is not encouraged and will lead to failure. Closes: rook#8210 Signed-off-by: subhamkrai <srai@redhat.com>
Just hit this issue. From my understanding of the docs it's not actually required to attach 2 data pools at all; just an EC data pool should also work as long as … That said, while it is not required to have both a replicated and an EC data pool (if I'm understanding the docs correctly), it does seem to be advised, to potentially improve read and write performance of small-object backtrace updates.
I'm not so sure that's true. The Ceph docs here https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coding-with-overwrites say …
Haha, I was just about to update my post, as I stumbled upon that segment of the docs as well while going through different parts of them. I'm giving it a shot with an EC RBD now to see what happens. EDIT: So yeah, that does indeed also fail, so it is 100% required then. In which case I'd think, based on the docs, that if Rook instantiates the RBDs using …
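For reference, the overwrite support the docs refer to is enabled per pool; a minimal sketch, assuming an EC pool named `my-ec-pool` (the name is an assumption for illustration):

```shell
# Enable partial overwrites on an erasure-coded pool; required before an EC
# pool can back RBD images or CephFS data.
ceph osd pool set my-ec-pool allow_ec_overwrites true
```

This needs a live cluster, so it is shown here only as the command shape.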
Would you mind opening a new issue for this?
Done; #9179
I did one more test with block and fs pools, and to summarise: …
What I haven't looked into yet is the last category: CephObjectStore. I can't figure out its setup/internal workings based on the documentation, as what I'm seeing there suggests it creates something like 6 pools per zone, so I have no idea how that maps onto Rook's CRD, which only defines a single metadata and data pool. I am expecting the first bullet to also be applicable here, but I'll need to do a test with object stores to see what happens with regard to pool setup/creation.

[1] https://docs.ceph.com/en/latest/cephfs/createfs/#using-erasure-coded-pools-with-cephfs
Looking at the possible options, I'm currently seeing 2 things that could be done to improve the situation: …
That said, I've done the operator's …
@Omar007 currently we are looking at other options where we don't have to use …
I've seen that PR, but the point is that, based on the docs, going that route should not be needed. And if there are side-effects not mentioned in the docs or anywhere else, that should really be fixed as well (though honestly that's a Ceph documentation problem, not so much a Rook documentation problem). To give some more context: in my case I'm talking about a pool that will run a mostly read-oriented workload, with files of several hundreds of MBs for the small ones and several GBs for the big ones. That one singular drawback currently mentioned in the docs is of no concern to me, so if the documentation is valid/complete, I do not want to deal with a more complicated configuration and unused pools for no reason, as this should be working/usable. If there are other issues with this setup beyond that single documented one, I would love to see more information about them made available (or referenced, if it is currently hidden somewhere completely unrelated).
When creating an EC fs, create a replicated pool as the primary pool and the EC pool as a secondary pool; creating an EC pool as primary is not encouraged and will lead to failure. Also, changing the pool name in the storageclass-ec file. Closes: rook#8210 Signed-off-by: subhamkrai <srai@redhat.com>
@batrick @kotreshhr as pointed out in the discussions above, is this documentation incorrect: "For CephFS, an erasure coded pool can be set as the default data pool during file system creation or via file layouts."? See https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coding-with-overwrites
@ajarr We are now adding the erasure-coded pool as a non-default data pool for CephFS, but while testing I didn't run the command for layouts, and we can see in the PR comment that data is going into the erasure-coded pool.
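A quick way to check which pool data is actually landing in, as a sketch against a live cluster (the pool name `myfs-ec-data0` is an assumption taken from the error log above):

```shell
# Per-pool usage: STORED/OBJECTS for the EC data pool should grow as files
# are written into the filesystem.
ceph df

# Or list a few objects directly from the EC data pool:
rados -p myfs-ec-data0 ls | head
```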
Strictly speaking it's not incorrect, because you can use it for the default data pool, but you should not. That documentation should mention as much.
(That documentation was probably not updated because it's part of doc/rados and not doc/cephfs.)
When creating an EC fs, create a replicated pool as the primary pool and the EC pool as a secondary pool; creating an EC pool as primary is not encouraged and will lead to failure. Also, changing the pool name in the storageclass-ec file. Closes: #8210 Signed-off-by: subhamkrai <srai@redhat.com> (cherry picked from commit 5bb29f1)
Is this a bug report or feature request?
Deviation from expected behavior:
EC filesystem is not created
Expected behavior:
Everything correctly created
How to reproduce it (minimal and precise):
I created a K8s cluster on AKS with 6 nodes and 3 OSDs. Deploying a normal filesystem works just fine; I used the example crds.yaml, common.yaml, …
File(s) to submit:
- cluster.yaml, if necessary

Operator log:

```
2021-06-28 07:36:15.478740 I | ceph-file-controller: creating filesystem "myfs-ec"
2021-06-28 07:36:15.478782 I | cephclient: creating filesystem "myfs-ec" with metadata pool "myfs-ec-metadata" and data pools [myfs-ec-data0]
2021-06-28 07:36:17.223857 E | ceph-file-controller: failed to reconcile failed to create filesystem "myfs-ec": failed to create filesystem "myfs-ec": failed enabling ceph fs "myfs-ec": Error EINVAL: pool 'myfs-ec-data0' (id '3') is an erasure-coded pool. Use of an EC pool for the default data pool is discouraged; see the online CephFS documentation for more information. Use --force to override.
```
Environment:
- OS (`uname -a`):
- Rook version (`rook version` inside of a Rook Pod): v1.6.2
- Ceph version (`ceph -v`): 16.2.2
- Kubernetes version (`kubectl version`): 1.20.7
- Ceph health (`ceph health` in the Rook Ceph toolbox): HEALTH_OK