New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
pool: Allow configuration of built-in pools with non-k8s names #9363
Conversation
6fdb6de
to
c43c014
Compare
pkg/apis/ceph.rook.io/v1/types.go
Outdated
// The desired name of the pool if different from the default name. | ||
// +optional | ||
Name string `json:"name,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since only two names are supported, should we do an enum for those?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it must be an enum so we can have validation from the API server for ""
, .nfs
and device_health_metrics
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perfect!
pkg/operator/ceph/pool/controller.go
Outdated
return pool.Spec.Name, nil | ||
} | ||
|
||
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools") | |
return "", errors.Errorf("not allowed to override the pool name except for the \"device_health_metrics\" and \".nfs\" pools") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools") | |
return "", errors.New("not allowed to override the pool name except for the device_health_metrics and .nfs pools") |
pkg/operator/ceph/nfs/controller.go
Outdated
// Always create the default pool | ||
err = r.createDefaultNFSRADOSPool(cephNFS) | ||
if err != nil { | ||
return reconcile.Result{}, errors.Wrapf(err, "failed to create default pool %q", cephNFS.Spec.RADOS.Pool) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do I think, at minimum, want to verify that the pool exists. Though I still think that creating it might be good. That is, unless the lower level ceph osd pool create .nfs
resets a customized pool back to default values. In that case, maybe it'll be easier to chat in huddle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add a check for the existence of the pool
pkg/operator/ceph/pool/controller.go
Outdated
@@ -341,11 +347,31 @@ func (r *ReconcileCephBlockPool) reconcileCreatePool(clusterInfo *cephclient.Clu | |||
return reconcile.Result{}, nil | |||
} | |||
|
|||
func getCephName(pool cephv1.CephBlockPool) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be part of the API so it can be used for validation by the ceph/client
or other places as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ceph/client
doesn't have a concept of the difference between the k8s and ceph pool names, so it seems correct to keep the conversion of k8s name to ceph pool name here in the controller.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. But we could add a method to the API so that CephBlockPool
has a .Pool()
method that the client could use, and we could do sanitization of the pool name there to ensure any internal usage isn't breaking our own rules.
... Ah, but I just realized we have to break our own rules for some of the CephObjectStore
pools like .rgw
.
pkg/apis/ceph.rook.io/v1/types.go
Outdated
// The desired name of the pool if different from the default name. | ||
// +optional | ||
Name string `json:"name,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it must be an enum so we can have validation from the API server for ""
, .nfs
and device_health_metrics
.
pkg/daemon/ceph/client/pool.go
Outdated
func CreatePoolWithProfile(context *clusterd.Context, clusterInfo *ClusterInfo, clusterSpec *cephv1.ClusterSpec, poolName string, pool cephv1.PoolSpec, appName string) error { | ||
func CreatePoolWithProfile(context *clusterd.Context, clusterInfo *ClusterInfo, clusterSpec *cephv1.ClusterSpec, pool cephv1.NamedPoolSpec, appName string) error { | ||
if pool.Name == "" { | ||
return errors.Errorf("pool name must be specified") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return errors.Errorf("pool name must be specified") | |
return errors.New("pool name must be specified") |
pkg/daemon/ceph/client/pool.go
Outdated
} | ||
|
||
if !pool.IsErasureCoded() { | ||
// neither a replicated or EC pool | ||
return fmt.Errorf("pool %q type is not defined as replicated or erasure coded", poolName) | ||
return fmt.Errorf("pool %q type is not defined as replicated or erasure coded", pool.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return fmt.Errorf("pool %q type is not defined as replicated or erasure coded", pool.Name) | |
return errors.Errorf("pool %q type is not defined as replicated or erasure coded", pool.Name) |
pkg/operator/ceph/pool/controller.go
Outdated
@@ -341,11 +347,31 @@ func (r *ReconcileCephBlockPool) reconcileCreatePool(clusterInfo *cephclient.Clu | |||
return reconcile.Result{}, nil | |||
} | |||
|
|||
func getCephName(pool cephv1.CephBlockPool) (string, error) { | |||
if pool.Name == "" { | |||
return "", errors.Errorf("metadata name must be set") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return "", errors.Errorf("metadata name must be set") | |
return "", errors.New("metadata name must be set") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't that checked by the API server already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, I didn't have it at first, and added it for unit test completeness. But I think i'll remove it again since it's overkill.
pkg/operator/ceph/pool/controller.go
Outdated
return pool.Name, nil | ||
} | ||
// Only allow the name to be overridden for certain built-in pool names | ||
if pool.Spec.Name == "device_health_metrics" || pool.Spec.Name == ".nfs" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have an enum in the API definition we don't need this.
pkg/operator/ceph/pool/controller.go
Outdated
return pool.Spec.Name, nil | ||
} | ||
|
||
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools") | |
return "", errors.New("not allowed to override the pool name except for the device_health_metrics and .nfs pools") |
c21c9ba
to
a062b89
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation seems good. I think it's just cleaning up docs/examples a bit more? I think NFS docs could get a mention, and the nfs-test.yaml
needs a new .nfs
CephBlockPool also.
deploy/examples/nfs.yaml
Outdated
################################################################################################################# | ||
# Create a Ceph pool with settings for replication in production environments. A minimum of 3 OSDs on | ||
# different hosts are required in this example. | ||
# kubectl create -f pool.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# kubectl create -f pool.yaml | |
# kubectl create -f nfs.yaml |
pkg/operator/ceph/nfs/nfs.go
Outdated
return errors.Wrapf(err, "failed to create default NFS pool %q. %s", poolName, string(output)) | ||
return errors.Wrapf(err, "pool %q could not be retrieved", poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could improve this error message to instruct users to create the .nfs
pool via CephBlockPool
CRD? Maybe pointing them to a section in the NFS doc about that? (which I think we'll also have to add?)
859eb14
to
565bfb4
Compare
pkg/operator/ceph/nfs/nfs.go
Outdated
if err != nil { | ||
return errors.Wrapf(err, "failed to create default NFS pool %q. %s", poolName, string(output)) | ||
return errors.Wrapf(err, "pool %q could not be retrieved, ensure the .nfs pool is defined", poolName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a const for .nfs
so let's use it?
90425bd
to
0ff55e3
Compare
This pull request has merge conflicts that must be resolved before it can be merged. @travisn please rebase it. https://rook.io/docs/rook/latest/development-flow.html#updating-your-fork |
0ff55e3
to
84a4b71
Compare
29e06f3
to
c9bc968
Compare
61802ad
to
bea646a
Compare
The built-in pools device_health_metrics and .nfs created by ceph need to be configured for replicas, failure domain, etc. To support this, we allow the pool to be created as a CR. Since K8s does not support underscores in the resource names the operator must translate this special pool name into the name expected by ceph. This also sets the basis for allowing filesystem data pools to specify the desired pool name instead of requiring a generated name. Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
Creating an EC pool is causing the CI to hang when the EC pool is initialized since there aren't enough OSDs to satisfy the EC parameters. Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
bea646a
to
60cc67f
Compare
pool: Allow configuration of built-in pools with non-k8s names (backport #9363)
Description of your changes:
The built-in pools device_health_metrics and .nfs created by ceph need to be configured for replicas, failure domain, etc. To support this, we allow the pool to be created as a CR. Since K8s does not support underscores in the resource names the operator must translate this special pool name into the name expected by ceph.
This also sets the basis for allowing filesystem data pools to specify the desired pool name instead of requiring a generated name as in progress with #9296.
Which issue is resolved by this Pull Request:
Resolves #9209
Checklist:
make codegen
) has been run to update object specifications, if necessary.