# rgw: multisite commands are used idempotently in error #8879
I've done some investigation of the `radosgw-admin` multisite commands. First, my findings from comparing the output of `period get` and `period update --staging`.

`period get`:

{
"id": "07e8d4f8-e7e7-4845-a0b7-22d91d5271ca",
"epoch": 1,
"predecessor_uuid": "f02fbd92-8722-4a9c-9247-92fa62467b54",
"sync_status": [],
"period_map": {
"id": "07e8d4f8-e7e7-4845-a0b7-22d91d5271ca",
"zonegroups": [
{
"id": "663564f3-4050-4dbd-9e0a-e6ab87af03bd",
"name": "my-store",
"api_name": "my-store",
"is_master": "true",
"endpoints": [
"http://10.103.203.154:80"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"zones": [
{
"id": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"name": "my-store",
"endpoints": [
"http://10.103.203.154:80"
],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221",
"sync_policy": {
"groups": []
}
}
],
"short_zone_ids": [
{
"key": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"val": 2231118265
}
]
},
"master_zonegroup": "663564f3-4050-4dbd-9e0a-e6ab87af03bd",
"master_zone": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"period_config": {
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
},
"realm_id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221",
"realm_name": "my-store",
"realm_epoch": 2
}

`period update --staging`:

{
"id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221:staging",
"epoch": 1,
"predecessor_uuid": "07e8d4f8-e7e7-4845-a0b7-22d91d5271ca",
"sync_status": [],
"period_map": {
"id": "07e8d4f8-e7e7-4845-a0b7-22d91d5271ca",
"zonegroups": [
{
"id": "663564f3-4050-4dbd-9e0a-e6ab87af03bd",
"name": "my-store",
"api_name": "my-store",
"is_master": "true",
"endpoints": [
"http://10.103.203.154:80"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"zones": [
{
"id": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"name": "my-store",
"endpoints": [
"http://10.103.203.154:80"
],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221",
"sync_policy": {
"groups": []
}
}
],
"short_zone_ids": [
{
"key": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"val": 2231118265
}
]
},
"master_zonegroup": "663564f3-4050-4dbd-9e0a-e6ab87af03bd",
"master_zone": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"period_config": {
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
},
"realm_id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221",
"realm_name": "my-store",
"realm_epoch": 3
}

Diff for ease of comparison
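To make this comparison mechanical, the fields that differ on every staging/commit cycle can be stripped before diffing. A minimal Python sketch; the volatile-field list is an assumption inferred only from the dumps above, not from the RGW source:

```python
# Top-level fields that differ between `period get` and
# `period update --staging` even when nothing meaningful changed
# (assumption based on the dumps above, not an exhaustive list).
VOLATILE_FIELDS = {"id", "epoch", "predecessor_uuid", "realm_epoch"}


def strip_volatile(period: dict) -> dict:
    """Return a copy of a period dump without the volatile top-level fields."""
    return {k: v for k, v in period.items() if k not in VOLATILE_FIELDS}


def periods_equal(a: dict, b: dict) -> bool:
    """Compare two period dumps, ignoring fields that change on every cycle."""
    return strip_volatile(a) == strip_volatile(b)
```

In the dumps above, only the top-level `id`, `predecessor_uuid`, and `realm_epoch` differ between `get` and `--staging`, so this filter would report "no real change" for them.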
From my findings, I see that updating the period would do 3 things:
Questions:
Following up from the above, I compared the `period get` output after updating:

{
"id": "07e8d4f8-e7e7-4845-a0b7-22d91d5271ca",
"epoch": 2,
"predecessor_uuid": "f02fbd92-8722-4a9c-9247-92fa62467b54",
"sync_status": [],
"period_map": {
"id": "07e8d4f8-e7e7-4845-a0b7-22d91d5271ca",
"zonegroups": [
{
"id": "663564f3-4050-4dbd-9e0a-e6ab87af03bd",
"name": "my-store",
"api_name": "my-store",
"is_master": "true",
"endpoints": [
"http://10.103.203.154:80"
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"zones": [
{
"id": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"name": "my-store",
"endpoints": [
"http://10.103.203.154:80"
],
"log_meta": "false",
"log_data": "false",
"bucket_index_max_shards": 11,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": [],
"storage_classes": [
"STANDARD"
]
}
],
"default_placement": "default-placement",
"realm_id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221",
"sync_policy": {
"groups": []
}
}
],
"short_zone_ids": [
{
"key": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"val": 2231118265
}
]
},
"master_zonegroup": "663564f3-4050-4dbd-9e0a-e6ab87af03bd",
"master_zone": "452600ee-8c8a-4cc0-a841-7960030fe88c",
"period_config": {
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
},
"realm_id": "3ba59d2b-c1b0-4844-88bd-65309fa1a221",
"realm_name": "my-store",
"realm_epoch": 2
}

Diff for ease of comparison
Findings:
In summary...
---
Further question: It seems that …
the …
the behavior of …
you can ignore those variables in the diff, because they can't be changed manually with any of the zone/zonegroup modify commands
Yes, you're correct. From my perspective, it seems that we might not be able to accurately predict whether there have been no changes before issuing a `period update --commit`. I will investigate what actually changes with the zonegroup/zone commands. I am still wondering if there is a way to check for pending changes without a deep comparison.
New findings for zonegroups:
---
i'm not sure exactly what you're looking for here. technically it does 'stage' the changes by writing them to zonegroup metadata objects in rados.

---
I see. To clarify my updated understanding… (I wasn't quite sure what "staging" meant conceptually for RGW before your comment here: #8879 (comment))
---
right. though since …

that's right 👍
Okay. From a practical standpoint for Rook, I want to make sure the following workflow will achieve the idempotent behavior we want.

Note: I think we can ignore realms in this discussion. I can't find any record of Rook updating a realm once it is created, and it seems that there isn't even a command to modify one.

Workflow:
---
overall sounds great! i don't think you'll need a special case for nonexistent periods.
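The workflow discussed above can be sketched in Python, under the assumption that comparing the `period get` and `period update --staging` dumps (minus volatile fields) is a reliable change detector. `commit_period_if_changed`, the `VOLATILE` field list, and the injectable `run` parameter are hypothetical names for illustration, not Rook's actual code:

```python
import json
import subprocess

# Top-level fields that differ even when nothing meaningful changed
# (assumption, inferred from the period dumps earlier in this thread).
VOLATILE = {"id", "epoch", "predecessor_uuid", "realm_epoch", "sync_status"}


def radosgw_admin(args):
    """Run radosgw-admin and parse its JSON output."""
    out = subprocess.check_output(["radosgw-admin", *args])
    return json.loads(out)


def commit_period_if_changed(run=radosgw_admin):
    """Commit the RGW period only when staging shows a real change.

    `run` is injectable so the decision logic can be tested without a
    live cluster. Returns True if a commit was issued.
    """
    current = run(["period", "get"])
    staged = run(["period", "update", "--staging"])

    def strip(period):
        return {k: v for k, v in period.items() if k not in VOLATILE}

    if strip(current) == strip(staged):
        return False  # no real change; skip the epoch-bumping commit
    run(["period", "update", "--commit"])
    return True
```

The injected runner keeps the "should we commit?" decision testable in isolation, which matters since committing needlessly is exactly the bug this issue describes.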
Interesting. This may be another bug in Rook then… Currently, the realm creation also follows the same pattern. It seems that after Rook creates a realm, it should issue …
huh, i'm not sure why. is that with the …
no, if rook doesn't support that stuff, and all of its object stores are on the same ceph cluster, then they all share the rados pool that stores these realm/period/zonegroup/zone objects. so if …
Actually, I am wrong here. :(
Replace calls to 'radosgw-admin period update --commit' with an idempotent function. Resolves rook#8879 Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
Sorry, I don't really understand why `zonegroup modify`/`zone modify` are called for an already-existing zonegroup/zone. Does it create unnecessary period updates while reconciling the CephObjectStore CRD or restarting the rook-operator pod?

---
Replace calls to 'radosgw-admin period update --commit' with an idempotent function. Resolves #8879 (cherry picked from commit eadcd75)

rgw: add integration test for committing period

Add to the RGW multisite integration test a verification that the RGW period is committed on the first reconcile and not committed on the second reconcile. Do this in the multisite test so that we verify that this works for both the primary and secondary multi-site cluster.

To add this test, the github-action-helper.sh script had to be modified to:
1. actually deploy the version of Rook under test
2. adjust how functions are called to not lose the `-e` in a subshell
3. fix the wait_for_prepare_pod helper, which had a failure in the middle of its operation that didn't cause failures in the past

Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com> (cherry picked from commit 9564308)
**Is this a bug report or feature request?**
The commands Rook uses to set up RGW multi-site are treated as idempotent, but they are not. The RGW commands that create/update a realm/zonegroup/zone/period cause a new "epoch", and all RGWs that use the config have to pause for a short time to pick up the new config. Rook should only update configs if they have actually changed. There is a `--staging` flag that may be useful in letting Rook see whether any changes would be made, without Rook having to deeply inspect the configurations.

See this conversation: #8828 (comment)
And this comment: #8828 (comment)