osd: Remove osd with purge instead of destroy #9807

Merged (1 commit) on Feb 27, 2022
7 changes: 4 additions & 3 deletions .github/workflows/canary-integration-test.yml
@@ -75,7 +75,7 @@ jobs:
kubectl -n rook-ceph exec $toolbox -- ceph auth caps client.csi-rbd-provisioner mon 'profile rbd, allow command "osd ls"' osd 'profile rbd' mgr 'allow rw'
# print client.csi-rbd-provisioner user after update
kubectl -n rook-ceph exec $toolbox -- ceph auth get client.csi-rbd-provisioner
kubectl -n rook-ceph exec $toolbox -- python3 /etc/ceph/create-external-cluster-resources.py --rbd-data-pool-name replicapool
# print client.csi-rbd-provisioner user after running script
kubectl -n rook-ceph exec $toolbox -- ceph auth get client.csi-rbd-provisioner

@@ -124,7 +124,7 @@ jobs:
# --cluster-name and --run-as-user flag while upgrading
kubectl -n rook-ceph exec $toolbox -- python3 /etc/ceph/create-external-cluster-resources.py --upgrade --rbd-data-pool-name replicapool --rados-namespace radosNamespace --cluster-name rookStorage --run-as-user client.csi-rbd-node-rookStorage-replicapool-radosNamespace
# print upgraded client auth
kubectl -n rook-ceph exec $toolbox -- ceph auth get client.csi-rbd-node-rookStorage-replicapool-radosNamespace

- name: check-ownerreferences
run: tests/scripts/github-action-helper.sh check_ownerreferences
@@ -140,7 +140,8 @@ jobs:
kubectl -n rook-ceph create -f deploy/examples/osd-purge.yaml
toolbox=$(kubectl get pod -l app=rook-ceph-tools -n rook-ceph -o jsonpath='{.items[*].metadata.name}')
kubectl -n rook-ceph exec $toolbox -- ceph status
timeout 120 sh -c "until kubectl -n rook-ceph exec $toolbox -- ceph osd tree|grep -qE 'osd.1.*.destroyed'; do echo 'waiting for ceph osd 1 to be destroyed'; sleep 1; done"
# wait until osd.1 is removed from the osd tree
timeout 120 sh -c "while kubectl -n rook-ceph exec $toolbox -- ceph osd tree|grep -qE 'osd.1'; do echo 'waiting for ceph osd 1 to be purged'; sleep 1; done"
kubectl -n rook-ceph exec $toolbox -- ceph status
kubectl -n rook-ceph exec $toolbox -- ceph osd tree

8 changes: 4 additions & 4 deletions pkg/daemon/ceph/osd/remove.go
@@ -174,19 +174,19 @@ func removeOSD(clusterdContext *clusterd.Context, clusterInfo *client.ClusterInf
}

// purge the osd
logger.Infof("destroying osd.%d", osdID)
purgeOSDArgs := []string{"osd", "destroy", fmt.Sprintf("osd.%d", osdID), "--yes-i-really-mean-it"}
logger.Infof("purging osd.%d", osdID)
Member:
Using destroy instead of purge was intentional: it allows re-installing the OSD with the same ID, reducing data movement. With destroy, the OSD entry and its CRUSH details are kept intact, allowing a smooth device replacement.
But in this case we cannot get the same OSD ID anyway, since the replacement device is brand new and is not in the form of /dev/sdX. So it makes sense to use purge.
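
A rough sketch of the distinction being described, run against the toolbox pod with osd.1 purely as an assumed example (these exact commands are not part of this PR):

# "osd destroy" marks the OSD destroyed but keeps its ID and CRUSH entry,
# so a replacement device could later be prepared to reuse the same ID.
kubectl -n rook-ceph exec $toolbox -- ceph osd destroy osd.1 --yes-i-really-mean-it
# "osd purge" removes the OSD's CRUSH entry, auth key, and ID entirely,
# which is the behavior this PR switches to.
kubectl -n rook-ceph exec $toolbox -- ceph osd purge osd.1 --yes-i-really-mean-it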

Member Author:
Makes sense; for the common case it was just confusing that the OSD ID was left there. I wonder if we should document that as an advanced scenario. How would we replace a specific OSD ID? Or are you saying it may be too difficult to orchestrate anyway?

Member:
I think the only way to achieve this would be to have the same disk name, which means changing how we map the PVC disk when mounting/copying it. It's non-trivial for sure.

Member Author:
IIRC c-v (ceph-volume) also has a parameter for provisioning an OSD with a specific OSD ID to replace. Not sure how we would pass that either. The ID isn't desired state that would be added to the cluster CR; it's just one-time info to be used for the next OSD created.
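
The ceph-volume option being referred to is presumably --osd-id on lvm prepare. A hypothetical invocation (placeholder device path and ID; per the discussion above, nothing in Rook currently passes this) would look like:

# prepare a replacement OSD that reuses a previously destroyed ID (1 here);
# /dev/sdX is a placeholder for the new device
ceph-volume lvm prepare --data /dev/sdX --osd-id 1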

purgeOSDArgs := []string{"osd", "purge", fmt.Sprintf("osd.%d", osdID), "--force", "--yes-i-really-mean-it"}
_, err = client.NewCephCommand(clusterdContext, clusterInfo, purgeOSDArgs).Run()
if err != nil {
logger.Errorf("failed to purge osd.%d. %v", osdID, err)
}

// Attempting to remove the parent host. Errors can be ignored if there are other OSDs on the same host
logger.Infof("removing osd.%d from ceph", osdID)
logger.Infof("attempting to remove host %q from crush map if not in use", osdID)
hostArgs := []string{"osd", "crush", "rm", hostName}
_, err = client.NewCephCommand(clusterdContext, clusterInfo, hostArgs).Run()
if err != nil {
-logger.Errorf("failed to remove CRUSH host %q. %v", hostName, err)
+logger.Infof("failed to remove CRUSH host %q. %v", hostName, err)
}

// call archiveCrash to silence crash warning in ceph health if any