pool: Allow configuration of built-in pools with non-k8s names #9363

Merged: 2 commits into rook:master, Dec 15, 2021

Conversation

@travisn (Member) commented Dec 8, 2021:

Description of your changes:
The built-in pools device_health_metrics and .nfs created by ceph need to be configured for replicas, failure domain, etc. To support this, we allow the pool to be created as a CR. Since K8s does not support underscores in resource names, the operator must translate this special pool name into the name expected by ceph.

This also lays the groundwork for allowing filesystem data pools to specify the desired pool name instead of requiring a generated name, as is in progress with #9296.
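As an illustration, a CephBlockPool CR overriding a built-in pool name could look roughly like the following sketch (the metadata.name and the replication settings are placeholders, not taken from this PR; the spec-level name field is what this change introduces):

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  # underscores are not valid in K8s resource names, so a different metadata name is used
  name: builtin-mgr
  namespace: rook-ceph
spec:
  # the operator maps this CR onto the built-in Ceph pool "device_health_metrics"
  name: device_health_metrics
  failureDomain: host
  replicated:
    size: 3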

Which issue is resolved by this Pull Request:
Resolves #9209

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

Comment on lines 590 to 593
// The desired name of the pool if different from the default name.
// +optional
Name string `json:"name,omitempty"`
Member:
Since only two names are supported, should we do an enum for those?

Member:
I believe it must be an enum so we can have validation from the API server for "", .nfs and device_health_metrics.

Member Author:
perfect!
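For context, a minimal sketch of how that validation could appear in the generated CRD schema after make codegen (the exact placement in the schema is an assumption):

name:
  type: string
  enum:
    - ""
    - device_health_metrics
    - .nfs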

return pool.Spec.Name, nil
}

return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools")
Member:
Suggested change
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools")
return "", errors.Errorf("not allowed to override the pool name except for the \"device_health_metrics\" and \".nfs\" pools")

Member:
Suggested change
return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools")
return "", errors.New("not allowed to override the pool name except for the device_health_metrics and .nfs pools")

Comment on lines 282 to 287
// Always create the default pool
err = r.createDefaultNFSRADOSPool(cephNFS)
if err != nil {
return reconcile.Result{}, errors.Wrapf(err, "failed to create default pool %q", cephNFS.Spec.RADOS.Pool)
}

Member:
We do, I think, at minimum, want to verify that the pool exists. Though I still think that creating it might be good. That is, unless the lower-level ceph osd pool create .nfs resets a customized pool back to default values. In that case, maybe it'll be easier to chat in huddle.

Member Author:
I'll add a check for the existence of the pool

pkg/operator/ceph/object/objectstore.go (resolved)
pkg/operator/ceph/pool/controller.go (outdated, resolved)
@@ -341,11 +347,31 @@ func (r *ReconcileCephBlockPool) reconcileCreatePool(clusterInfo *cephclient.Clu
return reconcile.Result{}, nil
}

func getCephName(pool cephv1.CephBlockPool) (string, error) {
Member:
Should this be part of the API so it can be used for validation by the ceph/client or other places as well?

Member Author:
ceph/client doesn't have a concept of the difference between the k8s and ceph pool names, so it seems correct to keep the conversion of k8s name to ceph pool name here in the controller.

@BlaineEXE (Member), Dec 10, 2021:
Right. But we could add a method to the API so that CephBlockPool has a .Pool() method that the client could use, and we could do sanitization of the pool name there to ensure any internal usage isn't breaking our own rules.

... Ah, but I just realized we have to break our own rules for some of the CephObjectStore pools like .rgw.

func CreatePoolWithProfile(context *clusterd.Context, clusterInfo *ClusterInfo, clusterSpec *cephv1.ClusterSpec, poolName string, pool cephv1.PoolSpec, appName string) error {
func CreatePoolWithProfile(context *clusterd.Context, clusterInfo *ClusterInfo, clusterSpec *cephv1.ClusterSpec, pool cephv1.NamedPoolSpec, appName string) error {
if pool.Name == "" {
return errors.Errorf("pool name must be specified")
Member:
Suggested change
return errors.Errorf("pool name must be specified")
return errors.New("pool name must be specified")

}

if !pool.IsErasureCoded() {
// neither a replicated or EC pool
return fmt.Errorf("pool %q type is not defined as replicated or erasure coded", poolName)
return fmt.Errorf("pool %q type is not defined as replicated or erasure coded", pool.Name)
Member:
Suggested change
return fmt.Errorf("pool %q type is not defined as replicated or erasure coded", pool.Name)
return errors.Errorf("pool %q type is not defined as replicated or erasure coded", pool.Name)

@@ -341,11 +347,31 @@ func (r *ReconcileCephBlockPool) reconcileCreatePool(clusterInfo *cephclient.Clu
return reconcile.Result{}, nil
}

func getCephName(pool cephv1.CephBlockPool) (string, error) {
if pool.Name == "" {
return "", errors.Errorf("metadata name must be set")
Member:
Suggested change
return "", errors.Errorf("metadata name must be set")
return "", errors.New("metadata name must be set")

Member:
Isn't that checked by the API server already?

Member Author:
Yes, I didn't have it at first and added it for unit test completeness, but I think I'll remove it again since it's overkill.

return pool.Name, nil
}
// Only allow the name to be overridden for certain built-in pool names
if pool.Spec.Name == "device_health_metrics" || pool.Spec.Name == ".nfs" {
Member:
If we have an enum in the API definition we don't need this.

return pool.Spec.Name, nil
}

return "", errors.Errorf("not allowed to override the pool name except for the device_health_metrics and .nfs pools")

@travisn force-pushed the configure-built-in-pool branch 3 times, most recently from c21c9ba to a062b89 on December 10, 2021 21:50
@BlaineEXE (Member) left a comment:
Implementation seems good. I think it's just cleaning up docs/examples a bit more? I think NFS docs could get a mention, and the nfs-test.yaml needs a new .nfs CephBlockPool also.
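For reference, the CephBlockPool that nfs-test.yaml could gain might look roughly like this sketch (the metadata.name and the single-replica settings are illustrative test values, not taken from the PR):

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  # ".nfs" is not a valid K8s name, so the CR uses a different metadata name
  name: builtin-nfs
  namespace: rook-ceph
spec:
  # the operator maps this CR onto the built-in ".nfs" pool used by CephNFS
  name: .nfs
  failureDomain: host
  replicated:
    size: 1
    requireSafeReplicaSize: false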

#################################################################################################################
# Create a Ceph pool with settings for replication in production environments. A minimum of 3 OSDs on
# different hosts are required in this example.
# kubectl create -f pool.yaml
Member:
Suggested change
# kubectl create -f pool.yaml
# kubectl create -f nfs.yaml

pkg/operator/ceph/nfs/nfs.go (resolved)
return errors.Wrapf(err, "failed to create default NFS pool %q. %s", poolName, string(output))
return errors.Wrapf(err, "pool %q could not be retrieved", poolName)
Member:
Maybe we could improve this error message to instruct users to create the .nfs pool via the CephBlockPool CRD? Maybe pointing them to a section in the NFS doc about that (which I think we'll also have to add)?

@travisn force-pushed the configure-built-in-pool branch 7 times, most recently from 859eb14 to 565bfb4 on December 14, 2021 06:36
pkg/daemon/ceph/client/pool.go (resolved)
if err != nil {
return errors.Wrapf(err, "failed to create default NFS pool %q. %s", poolName, string(output))
return errors.Wrapf(err, "pool %q could not be retrieved, ensure the .nfs pool is defined", poolName)
Member:
We have a const for .nfs so let's use it?

@travisn force-pushed the configure-built-in-pool branch 2 times, most recently from 90425bd to 0ff55e3 on December 14, 2021 14:52
mergify bot commented Dec 14, 2021:
This pull request has merge conflicts that must be resolved before it can be merged. @travisn please rebase it. https://rook.io/docs/rook/latest/development-flow.html#updating-your-fork

@travisn force-pushed the configure-built-in-pool branch 3 times, most recently from 29e06f3 to c9bc968 on December 15, 2021 20:59
@travisn force-pushed the configure-built-in-pool branch 2 times, most recently from 61802ad to bea646a on December 15, 2021 22:13
Commit messages:

The built-in pools device_health_metrics and .nfs created by ceph
need to be configured for replicas, failure domain, etc.
To support this, we allow the pool to be created as a CR.
Since K8s does not support underscores in the resource names
the operator must translate this special pool name into
the name expected by ceph.

This also sets the basis for allowing filesystem data
pools to specify the desired pool name instead of requiring
a generated name.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
Creating an EC pool is causing the CI to hang when the EC
pool is initialized since there aren't enough OSDs to
satisfy the EC parameters.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
@travisn merged commit 07f101e into rook:master Dec 15, 2021
@travisn deleted the configure-built-in-pool branch December 15, 2021 23:03
mergify bot added a commit that referenced this pull request Dec 15, 2021
pool: Allow configuration of built-in pools with non-k8s names (backport #9363)
@travisn mentioned this pull request Feb 14, 2022

Successfully merging this pull request may close these issues.

device_health_metrics pool defaults to "host" failure domain
3 participants