
core: Treat cluster as not existing if the cleanup policy is set #9041

Merged: 1 commit into rook:master from cluster-cleanup on Oct 27, 2021

Conversation

@travisn (Member) commented Oct 26, 2021:

Description of your changes:
The cluster CR can be forcefully deleted and cleanup the cluster resources if the yes-really-destroy-data policy is set on the CR. In this case, the other controllers should treat the cluster CR as not existing and allow the finalizers to be removed on those resources if they are requested for deletion. This helps avoid the chicken-and-egg problem of trying to destroy the whole cluster and being blocked on resources such as buckets existing when their existence is really irrelevant. Setting yes-really-destroy-data on the cluster CR is really expected to be a destructive action and there is no more trying to protect data.

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

Comment on lines 135 to 138
```go
// If the cluster has a cleanup policy to destroy the cluster, treat it as if it does not exist
if cephCluster.Spec.CleanupPolicy.HasDataDirCleanPolicy() {
	cephClusterExists = false
	return cephCluster, false, cephClusterExists, WaitForRequeueIfCephClusterNotReady
}
```
Member:

Can we change this to if it has the cleanup policy set and it has a deletion timestamp? Otherwise, a user could conceivably set the cleanup policy accidentally and intend to unset it without deleting the CephCluster.

@travisn (Member, Author) replied:

For this to take place accidentally, they would have to set the cleanup policy and delete the child CR. Since it already requires two actions, and we interpret the yes-really-destroy-data cleanup policy as truly destructive, it seems sufficient.

@BlaineEXE (Member) commented Oct 26, 2021:

I don't see the harm in looking for both, and it is safer for user data if we only make the assumption when the CephCluster is also being deleted. That is what we talked about with Jose, and what I thought we all agreed to.

[Update] This would also mean that a user has more chances to opt out of deletion along their journey. Maybe they set yes-really-destroy-data and suddenly remember that they forgot to back up one of their resources before hitting delete on the CephCluster. They would still have the opportunity to back out if we also look for the deletion timestamp, whereas without it, they are committed as soon as they set yes-really-destroy-data. Also, for deleting the CephCluster, they aren't committed until they actually hit delete on the CephCluster resource. Let's keep the same level of commitment for all of the resources.

@travisn (Member, Author) replied:

If they have set yes-really-destroy-data, they are only committed to destroying the cluster at the moment they actually start deleting CRs. This seems consistent to me that we would go ahead and delete if the policy is to destroy the cluster. I don't see why we need yet another hoop to jump through to force delete resources. This allows them to start cleaning up resources in steps if desired.

Interestingly, it isn't even really destructive if they delete the child CRs like this since no Ceph resources will be deleted by the finalizers in this case. If they mistakenly deleted the CRs, they could just create the CRs again and everything would still remain in the same state.

> what I thought we all agreed to.

It's never agreed until the PRs are merged. :)

Member:

The policy is for deleting the cluster, and I am against extending cluster policy to sub-resource policy. I think that is counterintuitive for users, and I don't want to sign off on something that I think will be detrimental to their experience or understanding of how Rook operates.

@travisn travisn force-pushed the cluster-cleanup branch 2 times, most recently from 8445246 to 38bc5a3 Compare October 26, 2021 22:01
@travisn travisn requested a review from leseb October 26, 2021 22:10
@BlaineEXE (Member) left a comment:

Requesting changes because this does not meet the proposal I agreed to. If we are missing the check that the user is actually deleting the CephCluster, my said-in-chat arguments about the unintuitive user experience remain.

The proposal I will accept:

  • If: we are deleting a CephObjectStore (could be any dependent resource, but let's keep this for the example)
  • and: the CephCluster has yes-really-destroy-data set
  • and: the CephCluster has a nonzero deletion timestamp
  • then: we can treat deletion of the CephObjectStore as though the CephCluster doesn't exist, because we can be pretty sure it won't exist very soon (i.e., just delete the CephObjectStore resource)

Why I am rejecting the changes as they are now:
The cleanupPolicy exists in the CephCluster CR because we intentionally, with effort, and by design, decided that users should have a protection from their Ceph cluster being accidentally deleted. Either by themselves due to manual error, or by an automation tool around it (helm, ocs-operator). This is an intentionally designed default-safe state. With this design, we would be forcing users to set their CephCluster into a default-unsafe configuration in order to allow them to cleanly delete CephObjectStores (as an example) in a way that force-removes that single resource. Then they would have to reset their configuration to the default-safe state again. If we truly treat user data protections as sacred, we should not be implementing the feature in this way.

As a further note, we already have a way for users to force-remove single resources, namely removing the finalizer on those resources, and that doesn't require setting default-unsafe configs for the underlying CephCluster, which this design could encourage users to do. Instead, I think we should merely treat this as a full-uninstallation optimization that allows everything to be deleted more quickly when the CephCluster is being deleted at the same time as other resources and with yes-really-destroy-data set.

@travisn (Member, Author) commented Oct 26, 2021:

> Requesting changes because this does not meet the proposal I agreed to. If we are missing the check that the user is actually deleting the CephCluster, my said-in-chat arguments about the unintuitive user experience remain.

> The proposal I will accept: If: we are deleting a CephObjectStore (could be any dependent resource, but let's keep this for the example) and: the CephCluster has yes-really-destroy-data set and: the CephCluster has a nonzero deletion timestamp then: we can treat deletion of the CephObjectStore as though the CephCluster doesn't exist, because we can be pretty sure it won't exist very soon (i.e., just delete the CephObjectStore resource)

To be clear, if the CephCluster has the cleanupPolicy.confirmation set to yes-really-destroy-data, we already expect the cluster is imminently going to be deleted. The operator will immediately stop reconciling the cluster with the message found here.

```
skipping orchestration for cluster object %q in namespace %q because its cleanup policy is set. not reloading the manager
```

> Why I am rejecting the changes as they are now: The cleanupPolicy exists in the CephCluster CR because we intentionally, with effort, and by design, decided that users should have protection against their Ceph cluster being accidentally deleted, either by themselves due to manual error or by an automation tool around it (helm, ocs-operator). This is an intentionally designed default-safe state. With this design, we would be forcing users to set their CephCluster into a default-unsafe configuration in order to allow them to cleanly delete CephObjectStores (as an example) in a way that force-removes that single resource. Then they would have to reset their configuration to the default-safe state again. If we truly treat user data protections as sacred, we should not implement the feature in this way.

I do not understand where this scenario comes from. Why would they set it, then unset it just to delete the object store? The purpose of allowing deletion with the cleanup policy is just for the full cleanup case. I don't see it for another scenario.

> As a further note, we already have a way for users to force-remove single resources, namely removing the finalizer on those resources, and that doesn't require setting default-unsafe configs for the underlying CephCluster, which this design could encourage users to do. Instead, I think we should merely treat this as a full-uninstallation optimization that allows everything to be deleted more quickly when the CephCluster is being deleted at the same time as other resources and with yes-really-destroy-data set.

Agreed, this is a full-uninstallation optimization. And since Rook expects that the CephCluster CR is imminently going to be deleted after the cleanup policy is set, if not already deleted, I don't see how it improves the scenario. The cluster is expected to be fully destroyed at this point anyway.

Adding the check for deletion is simple; I just don't understand how it improves the scenario.

@BlaineEXE (Member) commented:

> To be clear, if the CephCluster has the cleanupPolicy.confirmation set to yes-really-destroy-data, we already expect the cluster is imminently going to be deleted. The operator will immediately stop reconciling the cluster with the message found here.
>
> skipping orchestration for cluster object %q in namespace %q because its cleanup policy is set. not reloading the manager
>
> ...
>
> Agreed, this is a full-uninstallation optimization. And since Rook expects that the CephCluster CR is imminently going to be deleted after the cleanup policy is set, if not already deleted, I don't see how it improves the scenario. The cluster is expected to be fully destroyed at this point anyway.
>
> Adding the check for deletion is simple; I just don't understand how it improves the scenario.

The important thing from my perspective is that the user can still back out of destroying the whole CephCluster even after they have set cleanupPolicy.confirmation: yes-really-destroy-data. They merely set cleanupPolicy.confirmation: "" or remove the cleanupPolicy altogether, and then they will have effectively changed their mind.
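As a sketch of that back-out path, it amounts to editing the CephCluster spec; the `cleanupPolicy.confirmation` field and its yes-really-destroy-data value are the ones discussed in this thread, and the surrounding structure is illustrative.

```yaml
# Admin commits to destructive cleanup on the CephCluster CR:
cleanupPolicy:
  confirmation: "yes-really-destroy-data"
---
# Backing out before actually deleting the CephCluster: clear the
# confirmation (or remove cleanupPolicy) to return to the default-safe state.
cleanupPolicy:
  confirmation: ""
```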

From the flow of the CephObjectStore controller, for example, the CephCluster is probably going to be deleted forcefully if cleanupPolicy.confirmation is set, but it is not definite until the user actually issues a delete on the CephCluster resource. It seems wrong to me to allow the user to back out of forceful CephCluster deletion but not back out of forceful deletion of other resources (CephObjectStore, for example).

I can imagine a real-life scenario where the admin intends to forcefully delete the CephCluster, thinking that they have backed up their CephObjectStore as needed. They set cleanupPolicy.confirmation, and then they start deleting their resources, starting with the CephObjectStore. They get an email from a user saying that the object store backup job failed last night, so they realize they shouldn't forcefully remove the object store or the CephCluster, giving the backup another night to run. Since the Ceph cluster still hasn't been deleted, the data is still intact. But here is where the two implementations diverge:

  1. If we only forcefully delete CephObjectStores (or other dependent resources) when the CephCluster itself has the deletion timestamp and cleanupPolicy set (my ask), then the reconciler will just sit spinning while blocked by dependents (buckets, in the case of the CephObjectStore). The admin can unset the cleanupPolicy to ensure the CephCluster can't be forcibly deleted overnight.
  2. If we forcefully delete CephObjectStores (or other dependent resources) while the CephCluster doesn't have its own deletion timestamp (current implementation), then the object store is toast, even though the CephCluster is still "safe", not having a deletion timestamp.

While this is a pretty edgy corner case, I think it explains the logical discrepancy I am stuck on. I think it's best to forcibly delete the CephObjectStore only when we are certain that the CephCluster is going to be imminently deleted. This way, the whole cluster and all resources (not just the CephCluster) have the time between when the cleanupPolicy is set and when the CephCluster delete call is made as a safety margin to catch any last mistakes in backing up before removal. The margin exists for the CephCluster itself, and so I think we should extend the same amount of margin to the subresources also.

@leseb (Member) left a comment:

In practice, it seems convenient, nice, and simple. However, the problem with this approach is that the cleanup policy can always be reverted. So now, when the cluster has a cleanup policy, we can remove any of the child CRs. I don't think we should treat the CleanupPolicy as if there were no CephCluster at all. It looks dangerous and can lead to unintended behaviors.

EDIT: I missed reading the current comments, at a quick glance it looks like @BlaineEXE has a similar concern.

@leseb (Member) left a comment:

Are there any doc changes we should make about this new behavior, besides cluster.yaml?

Comment on lines 117 to 118
```yaml
# After the cleanup policy is set, if any other Ceph CRs (e.g. CephFilesystem, CephObjectStore, etc)
# are deleted, Rook will immediately remove their finalizers and allow them to be deleted.
```
@BlaineEXE (Member) commented Oct 27, 2021:
IMO, we could make this change without changing the docs. At any rate, this statement could be clearer if it clarifies this is only if the CephCluster is being deleted. Otherwise, looks ready to go.

@travisn (Member, Author) replied:

Sounds good just to leave it out for simplicity.

The merged commit message:

```
The cluster CR can be forcefully deleted and cleanup the
cluster resources if the yes-really-destroy-data policy
is set on the CR. In this case, the other controllers should
treat the cluster CR as not existing and allow the finalizers
to be removed on those resources if they are requested for
deletion.

Signed-off-by: Travis Nielsen <tnielsen@redhat.com>
```
@travisn travisn merged commit 33072bf into rook:master Oct 27, 2021
@travisn travisn deleted the cluster-cleanup branch October 27, 2021 20:06
mergify bot added a commit that referenced this pull request Oct 27, 2021
core: Treat cluster as not existing if the cleanup policy is set (backport #9041)