
failed to delete bucket #8245

Closed
subhamkrai opened this issue Jul 2, 2021 · 11 comments

@subhamkrai
Contributor

subhamkrai commented Jul 2, 2021

Is this a bug report or feature request?

  • Bug Report
    error deleting object
*2021-07-02 14:48:17.282807 I | ceph-object-controller: deleting object "rookHealthCheckTestObject" from bucket "rook-ceph-bucket-checker-022999f0-3ca7-45bf-9438-2ab1b4a20697" in object store "my-store"
2021-07-02 14:48:17.324321 E | ceph-object-controller: failed to delete bucket "rook-ceph-bucket-checker-022999f0-3ca7-45bf-9438-2ab1b4a20697" for object store "my-store". NoSuchKey tx000000000000000000001-0060df2731-1328-my-store 1328-my-store-my-store

How to reproduce it (minimal and precise):

kubectl create -f crds.yaml -f common.yaml -f operator.yaml
kubectl create -f cluster-test.yaml
kubectl create -f object-test.yaml
kubectl delete -f object-test.yaml
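
After the delete, the failure shows up in the operator log. A quick way to confirm it (a sketch, assuming the default rook-ceph namespace and the default operator deployment name):

# assumes the operator runs in the rook-ceph namespace under its default deployment name
kubectl -n rook-ceph logs deploy/rook-ceph-operator | grep "failed to delete bucket"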

File(s) to submit:

  • Cluster CR (custom resource), typically called cluster.yaml, if necessary
  • Operator's logs, if necessary
  • Crashing pod(s) logs, if necessary

To get logs, use kubectl -n <namespace> logs <pod name>
When pasting logs, always surround them with backticks or use the insert code button from the GitHub UI.
Read the GitHub documentation if you need help.

Environment:

  • OS (e.g. from /etc/os-release):
  • Kernel (e.g. uname -a):
  • Cloud provider or hardware configuration:
  • Rook version (use rook version inside of a Rook Pod):
  • Storage backend version (e.g. for ceph do ceph -v):
  • Kubernetes version (use kubectl version):
  • Kubernetes cluster type (e.g. Tectonic, GKE, OpenShift):
  • Storage backend status (e.g. for Ceph use ceph health in the Rook Ceph toolbox):
@subhamkrai subhamkrai added the bug label Jul 2, 2021
@subhamkrai
Contributor Author

operator logs

2021-07-02 14:47:42.087124 I | ceph-spec: adding finalizer "cephobjectstore.ceph.rook.io" on "my-store"
2021-07-02 14:47:42.105477 I | op-mon: parsing mon endpoints: a=10.99.212.103:6789
2021-07-02 14:47:42.512879 I | ceph-object-controller: reconciling object store deployments
2021-07-02 14:47:42.524407 I | ceph-object-controller: ceph object store gateway service running at 10.101.38.227
2021-07-02 14:47:42.524431 I | ceph-object-controller: reconciling object store pools
2021-07-02 14:47:43.762037 I | cephclient: setting pool property "pg_num_min" to "8" on pool "my-store.rgw.log"
2021-07-02 14:47:47.854941 I | cephclient: setting pool property "compression_mode" to "none" on pool "my-store.rgw.buckets.non-ec"
2021-07-02 14:47:48.841109 I | cephclient: setting pool property "compression_mode" to "none" on pool "my-store.rgw.control"
2021-07-02 14:47:48.844428 I | cephclient: setting pool property "compression_mode" to "none" on pool "my-store.rgw.buckets.index"
2021-07-02 14:47:48.854185 I | cephclient: setting pool property "compression_mode" to "none" on pool "my-store.rgw.meta"
2021-07-02 14:47:48.854546 I | cephclient: setting pool property "compression_mode" to "none" on pool ".rgw.root"
2021-07-02 14:47:49.858897 I | cephclient: creating replicated pool my-store.rgw.buckets.non-ec succeeded
2021-07-02 14:47:49.858921 I | cephclient: setting pool property "pg_num_min" to "8" on pool "my-store.rgw.buckets.non-ec"
2021-07-02 14:47:51.859752 I | cephclient: creating replicated pool my-store.rgw.buckets.index succeeded
2021-07-02 14:47:51.859780 I | cephclient: setting pool property "pg_num_min" to "8" on pool "my-store.rgw.buckets.index"
2021-07-02 14:47:51.881821 I | cephclient: creating replicated pool my-store.rgw.control succeeded
2021-07-02 14:47:51.881910 I | cephclient: setting pool property "pg_num_min" to "8" on pool "my-store.rgw.control"
2021-07-02 14:47:51.883019 I | cephclient: creating replicated pool .rgw.root succeeded
2021-07-02 14:47:51.883092 I | cephclient: setting pool property "pg_num_min" to "8" on pool ".rgw.root"
2021-07-02 14:47:51.883327 I | cephclient: creating replicated pool my-store.rgw.meta succeeded
2021-07-02 14:47:51.883355 I | cephclient: setting pool property "pg_num_min" to "8" on pool "my-store.rgw.meta"
2021-07-02 14:47:55.908393 I | cephclient: setting pool property "compression_mode" to "none" on pool "my-store.rgw.buckets.data"
2021-07-02 14:47:57.928961 I | cephclient: creating replicated pool my-store.rgw.buckets.data succeeded
2021-07-02 14:47:57.928993 I | ceph-object-controller: setting multisite settings for object store "my-store"
2021-07-02 14:47:58.332715 I | ceph-object-controller: Multisite for object-store: realm=my-store, zonegroup=my-store, zone=my-store
2021-07-02 14:47:58.332734 I | ceph-object-controller: multisite configuration for object-store my-store is complete
2021-07-02 14:47:58.332738 I | ceph-object-controller: creating object store "my-store" in namespace "rook-ceph"
2021-07-02 14:47:58.332745 I | cephclient: getting or creating ceph auth key "client.rgw.my.store.a"
2021-07-02 14:47:58.661079 I | ceph-object-controller: setting rgw config flags
2021-07-02 14:47:58.661126 I | op-config: setting "client.rgw.my.store.a"="rgw_zonegroup"="my-store" option to the mon configuration database
2021-07-02 14:47:58.920739 I | op-config: successfully set "client.rgw.my.store.a"="rgw_zonegroup"="my-store" option to the mon configuration database
2021-07-02 14:47:58.920759 I | op-config: setting "client.rgw.my.store.a"="rgw_log_nonexistent_bucket"="true" option to the mon configuration database
2021-07-02 14:47:59.190801 I | op-config: successfully set "client.rgw.my.store.a"="rgw_log_nonexistent_bucket"="true" option to the mon configuration database
2021-07-02 14:47:59.190823 I | op-config: setting "client.rgw.my.store.a"="rgw_log_object_name_utc"="true" option to the mon configuration database
2021-07-02 14:47:59.452945 I | op-config: successfully set "client.rgw.my.store.a"="rgw_log_object_name_utc"="true" option to the mon configuration database
2021-07-02 14:47:59.452976 I | op-config: setting "client.rgw.my.store.a"="rgw_enable_usage_log"="true" option to the mon configuration database
2021-07-02 14:47:59.723202 I | op-config: successfully set "client.rgw.my.store.a"="rgw_enable_usage_log"="true" option to the mon configuration database
2021-07-02 14:47:59.723236 I | op-config: setting "client.rgw.my.store.a"="rgw_zone"="my-store" option to the mon configuration database
2021-07-02 14:47:59.997567 I | op-config: successfully set "client.rgw.my.store.a"="rgw_zone"="my-store" option to the mon configuration database
2021-07-02 14:48:00.000373 I | ceph-object-controller: object store "my-store" deployment "rook-ceph-rgw-my-store-a" started
2021-07-02 14:48:00.083038 I | ceph-object-controller: enabling rgw dashboard
2021-07-02 14:48:00.893802 I | ceph-object-controller: created object store "my-store" in namespace "rook-ceph"
2021-07-02 14:48:00.894047 I | ceph-object-controller: setting the dashboard api secret key
2021-07-02 14:48:01.115226 I | ceph-object-controller: starting rgw healthcheck
2021-07-02 14:48:01.121597 I | op-k8sutil: Reporting Event rook-ceph:my-store Normal:ReconcileSucceeded:successfully configured CephObjectStore "rook-ceph/my-store"
2021-07-02 14:48:01.501189 I | ceph-object-controller: done setting the dashboard api secret key
2021-07-02 14:48:01.688115 I | clusterdisruption-controller: all "host" failure domains: [minikube]. osd is down in failure domain: "". active node drains: false. pg health: "cluster is not fully clean. PGs: [{StateName:active+clean Count:223} {StateName:peering Count:1}]"
2021-07-02 14:48:02.172294 I | clusterdisruption-controller: all "host" failure domains: [minikube]. osd is down in failure domain: "". active node drains: false. pg health: "cluster is not fully clean. PGs: [{StateName:active+clean Count:223} {StateName:peering Count:1}]"
2021-07-02 14:48:16.806817 I | op-mon: parsing mon endpoints: a=10.99.212.103:6789
2021-07-02 14:48:17.282807 I | ceph-object-controller: deleting object "rookHealthCheckTestObject" from bucket "rook-ceph-bucket-checker-022999f0-3ca7-45bf-9438-2ab1b4a20697" in object store "my-store"
2021-07-02 14:48:17.324321 E | ceph-object-controller: failed to delete bucket "rook-ceph-bucket-checker-022999f0-3ca7-45bf-9438-2ab1b4a20697" for object store "my-store". NoSuchKey tx000000000000000000001-0060df2731-1328-my-store 1328-my-store-my-store
2021-07-02 14:48:17.376269 I | ceph-object-controller: stopping monitoring of rgw endpoints for object store "my-store"
2021-07-02 14:48:18.324503 I | ceph-object-controller: CephObjectStore "rook-ceph/my-store" can be deleted safely. deleting CephObjectStore "rook-ceph/my-store"
2021-07-02 14:48:18.324526 I | op-k8sutil: Reporting Event rook-ceph:my-store Normal:Deleting:deleting CephObjectStore "rook-ceph/my-store"
2021-07-02 14:48:18.332419 I | ceph-object-controller: deleting object store "my-store" from namespace "rook-ceph"
2021-07-02 14:48:18.332440 I | ceph-object-controller: deleting rgw CephX key and configuration in centralized mon database for "rook-ceph-rgw-my-store-a"
2021-07-02 14:48:18.636042 I | op-config: deleting "rgw_log_nonexistent_bucket" option from the mon configuration database
2021-07-02 14:48:18.910681 I | op-config: successfully deleted "rgw_log_nonexistent_bucket" option from the mon configuration database
2021-07-02 14:48:18.910724 I | op-config: deleting "rgw_log_object_name_utc" option from the mon configuration database
2021-07-02 14:48:19.182817 I | op-config: successfully deleted "rgw_log_object_name_utc" option from the mon configuration database
2021-07-02 14:48:19.182839 I | op-config: deleting "rgw_enable_usage_log" option from the mon configuration database
2021-07-02 14:48:19.446723 I | op-config: successfully deleted "rgw_enable_usage_log" option from the mon configuration database
2021-07-02 14:48:19.446745 I | op-config: deleting "rgw_zone" option from the mon configuration database
2021-07-02 14:48:19.729113 I | op-config: successfully deleted "rgw_zone" option from the mon configuration database
2021-07-02 14:48:19.729138 I | op-config: deleting "rgw_zonegroup" option from the mon configuration database
2021-07-02 14:48:20.015368 I | op-config: successfully deleted "rgw_zonegroup" option from the mon configuration database
2021-07-02 14:48:20.015391 I | ceph-object-controller: successfully deleted rgw config for "client.rgw.my.store.a" in mon configuration database
2021-07-02 14:48:20.015397 I | cephclient: deleting ceph auth "client.rgw.my.store.a"
2021-07-02 14:48:20.322048 I | ceph-object-controller: completed deleting rgw CephX key and configuration in centralized mon database for "rook-ceph-rgw-my-store-a"
2021-07-02 14:48:20.322233 I | ceph-object-controller: disabling the dashboard api user and secret key
2021-07-02 14:48:20.378689 I | ceph-object-controller: Found stores [my-store] when deleting store my-store
2021-07-02 14:48:21.359748 I | cephclient: no images/snapshosts present in pool ".rgw.root"
2021-07-02 14:48:21.359768 I | cephclient: purging pool ".rgw.root" (id=12)
2021-07-02 14:48:21.511568 I | cephclient: no images/snapshosts present in pool "my-store.rgw.meta"
2021-07-02 14:48:21.512039 I | cephclient: purging pool "my-store.rgw.meta" (id=13)
2021-07-02 14:48:21.651143 I | cephclient: no images/snapshosts present in pool "my-store.rgw.buckets.data"
2021-07-02 14:48:21.652012 I | cephclient: purging pool "my-store.rgw.buckets.data" (id=15)
2021-07-02 14:48:21.692502 I | cephclient: no images/snapshosts present in pool "my-store.rgw.buckets.index"
2021-07-02 14:48:21.692527 I | cephclient: purging pool "my-store.rgw.buckets.index" (id=14)
2021-07-02 14:48:21.807456 I | cephclient: no images/snapshosts present in pool "my-store.rgw.buckets.non-ec"
2021-07-02 14:48:21.807515 I | cephclient: purging pool "my-store.rgw.buckets.non-ec" (id=10)
2021-07-02 14:48:21.810524 I | cephclient: no images/snapshosts present in pool "my-store.rgw.control"
2021-07-02 14:48:21.810546 I | cephclient: purging pool "my-store.rgw.control" (id=11)
2021-07-02 14:48:21.848692 I | cephclient: no images/snapshosts present in pool "my-store.rgw.log"
2021-07-02 14:48:21.848753 I | cephclient: purging pool "my-store.rgw.log" (id=9)
2021-07-02 14:48:22.526867 I | ceph-object-controller: deleting rgw dashboard user
2021-07-02 14:48:22.920476 W | ceph-object-controller: failed to delete ceph user "dashboard-admin". failed to delete s3 user uid="dashboard-admin": exit status 5
2021-07-02 14:48:24.200887 I | cephclient: purge completed for pool "my-store.rgw.log"
2021-07-02 14:48:24.235832 I | cephclient: purge completed for pool "my-store.rgw.meta"
2021-07-02 14:48:24.584497 I | ceph-object-controller: done disabling the dashboard api secret key
2021-07-02 14:48:25.274439 I | cephclient: purge completed for pool "my-store.rgw.buckets.data"
2021-07-02 14:48:25.280545 I | cephclient: purge completed for pool ".rgw.root"
2021-07-02 14:48:25.289888 I | cephclient: purge completed for pool "my-store.rgw.control"
2021-07-02 14:48:25.294122 I | cephclient: purge completed for pool "my-store.rgw.buckets.non-ec"
2021-07-02 14:48:25.299160 I | cephclient: purge completed for pool "my-store.rgw.buckets.index"
2021-07-02 14:48:25.575726 I | ceph-object-controller: done deleting object store "my-store" from namespace "rook-ceph"
2021-07-02 14:48:25.575754 I | ceph-spec: removing finalizer "cephobjectstore.ceph.rook.io" on "my-store"
2021-07-02 14:48:25.592471 I | op-k8sutil: Reporting Event rook-ceph:my-store Normal:ReconcileSucceeded:successfully configured CephObjectStore "rook-ceph/my-store"
2021-07-02 14:48:25.593764 I | op-k8sutil: Reporting Event : Normal:ReconcileSucceeded:successfully configured  "/"
2021-07-02 14:48:26.255229 I | ceph-spec: object "rook-ceph-rgw-my-store" matched on delete, reconciling
2021-07-02 14:48:26.257901 I | ceph-spec: object "rook-ceph-rgw-my-store-a-keyring" matched on delete, reconciling
2021-07-02 14:48:26.258171 I | ceph-spec: object "rook-ceph-rgw-my-store-a" matched on delete, reconciling
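
The NoSuchKey at 14:48:17 is an S3 error code, surfacing while the health checker's test bucket is being cleaned up. One way to check from the toolbox whether that checker bucket or its objects still exist at that point — a sketch, assuming the rook-ceph-tools pod is deployed:

# run inside the rook-ceph-tools pod, before the object store is fully removed
radosgw-admin bucket list
radosgw-admin bucket stats --bucket=rook-ceph-bucket-checker-022999f0-3ca7-45bf-9438-2ab1b4a20697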

@leseb
Member

leseb commented Jul 5, 2021

Can you enable DEBUG logging on the operator when you see this? Thanks!
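
For reference, a minimal sketch of how to do that, assuming the ROOK_LOG_LEVEL setting in the rook-ceph-operator-config ConfigMap shipped with operator.yaml and the default rook-ceph namespace:

# bump the operator log level via the operator ConfigMap
kubectl -n rook-ceph patch configmap rook-ceph-operator-config \
  --type merge -p '{"data":{"ROOK_LOG_LEVEL":"DEBUG"}}'
# or set it directly on the operator deployment
kubectl -n rook-ceph set env deploy/rook-ceph-operator ROOK_LOG_LEVEL=DEBUG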

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@github-actions

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

@satoru-takeuchi
Member

I encountered the same problem and captured the operator's DEBUG log.

operator.log

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@BlaineEXE
Member

BlaineEXE commented Feb 17, 2022

This could be related to the fixes I made to ensure the operator doesn't keep the same health checker info from previous CephObjectStore installs. This is the behavior I saw when that error was happening, but it doesn't mean this was the only cause.

#9417

If no one (@subhamkrai, @satoru-takeuchi) has experienced this in recent master/1.8 versions, I think we can close this and reopen if necessary.

@satoru-takeuchi
Member

@BlaineEXE In my case, I use Ceph v16.2.7 plus the following patch to fix the root cause rather than bypassing this bug in Rook.

ceph/ceph#44413

So this issue can be closed if @subhamkrai doesn't encounter this problem.

@github-actions github-actions bot removed the wontfix label Feb 18, 2022
@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

@satoru-takeuchi
Member

I don't encounter this issue any more.

@github-actions github-actions bot removed the wontfix label Apr 20, 2022
@subhamkrai
Contributor Author

I don't encounter this issue any more.

Closing this since we don't see it anymore.
