rgw: replace period update --commit with function #8911

Merged
merged 2 commits into from Oct 12, 2021

Conversation

Member

@BlaineEXE BlaineEXE commented Oct 4, 2021

Replace calls to 'radosgw-admin period update --commit' with an
idempotent function.

Note: Code changes are less than ~150 lines with the rest of the change being additional unit tests or modifications to existing tests.

Resolves #8879

Signed-off-by: Blaine Gardner blaine.gardner@redhat.com

Description of your changes:

Which issue is resolved by this Pull Request:
Resolves #

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Skip Tests for Docs: Add the flag for skipping the build if this is only a documentation change. See here for the flag.
  • Skip Unrelated Tests: Add a flag to run tests for a specific storage provider. See test options.
  • Reviewed the developer guide on Submitting a Pull Request
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.
  • Pending release notes updated with breaking and/or notable changes, if necessary.
  • Upgrade from previous release is tested and upgrade user guide is updated, if necessary.
  • Code generation (make codegen) has been run to update object specifications, if necessary.

@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from 6feebee to dfca98d Compare October 4, 2021 20:42
@BlaineEXE BlaineEXE marked this pull request as ready for review October 4, 2021 20:42
Comment on lines -304 to +310
-realmExists bool
-zoneGroupExists bool
-zoneExists bool
-periodExists bool
-failCreateRealm bool
-failCreateZoneGroup bool
-failCreateZone bool
-failUpdatePeriod bool
+realmExists bool
+zoneGroupExists bool
+zoneExists bool
+failCreateRealm bool
+failCreateZoneGroup bool
+failCreateZone bool
+failCommitConfigChanges bool
Member Author

TL;DR of changes in this file: no longer use raw period get/period update --commit to create multisite. Instead, use a mock CommitConfigChanges, validate that the function (unit tested elsewhere) is called, and control when it returns success/fail.

Comment on lines 298 to 414
// CommitConfigChanges commits changes to RGW configs for realm/zonegroup/zone changes idempotently.
// Under the hood, this updates the RGW config period and commits the change if changes are detected.
func CommitConfigChanges(c *Context) error {
	currentPeriod, err := runAdminCommand(c, true, "period", "get")
	if err != nil {
		return errorOrIsNotFound(err, "failed to get the current RGW configuration period to see if it needs to be changed")
	}

	// this stages the current config changes and returns what the new period config will look like
	// without committing the changes
	stagedPeriod, err := runAdminCommand(c, true, "period", "update")
	if err != nil {
		return errorOrIsNotFound(err, "failed to stage the current RGW configuration period")
	}

	shouldCommit, err := periodWillChange(currentPeriod, stagedPeriod)
	if err != nil {
		return errors.Wrap(err, "failed to determine if the staged RGW configuration period is different from current")
	}

	if !shouldCommit {
		nsName := fmt.Sprintf("%s/%s", c.clusterInfo.Namespace, c.Name)
		logger.Debugf("not committing changes to RGW configuration period for CephObjectStore %q", nsName)
		return nil
	}

	// don't expect json output since we don't intend to use the output from the command
	_, err = runAdminCommand(c, false, "period", "update", "--commit")
	if err != nil {
		return errorOrIsNotFound(err, "failed to commit RGW configuration period changes")
	}

	return nil
}

// return true if the configuration period will change if the staged period is committed
func periodWillChange(current, staged string) (bool, error) {
	var currentJSON map[string]interface{}
	var stagedJSON map[string]interface{}
	var err error

	err = json.Unmarshal([]byte(current), &currentJSON)
	if err != nil {
		return true, errors.Wrap(err, "failed to unmarshal current RGW configuration period")
	}
	err = json.Unmarshal([]byte(staged), &stagedJSON)
	if err != nil {
		return true, errors.Wrap(err, "failed to unmarshal staged RGW configuration period")
	}

	// There are some values in the periods that we don't care to diff because they are always
	// different in the staged period, even when no updates are needed. Sometimes, the values are
	// reported as different in the staging output but aren't actually changed upon commit.
	ignorePaths := cmp.FilterPath(func(path cmp.Path) bool {
		// path.String() outputs nothing for the crude map[string]interface{} JSON struct.
		// Example of path.GoString() output for a long path in the period JSON:
		// root["period_map"].(map[string]interface {})["short_zone_ids"].([]interface {})[0].(map[string]interface {})["val"].(float64)
		switch path.GoString() {
		case `root["id"]`:
			// "id" is always changed in the staged period, but it doesn't always update.
			return true
		case `root["predecessor_uuid"]`:
			// "predecessor_uuid" is always changed in the staged period, but it doesn't always update.
			return true
		case `root["realm_epoch"]`:
			// "realm_epoch" is always incremented in the staged period, but it doesn't always increment.
			return true
		case `root["epoch"]`:
			// Strangely, "epoch" is not incremented in the staged period even though it is always
			// incremented upon an actual commit. It could be argued that this behavior is a bug.
			// Ignore this value to handle the possibility that the behavior changes in the future.
			return true
		default:
			return false
		}
	}, cmp.Ignore())

	diff := cmp.Diff(currentJSON, stagedJSON, ignorePaths)
	diff = strings.TrimSpace(diff)
	logger.Debugf("RGW config period diff:\n%s", diff)

	return (diff != ""), nil
}

Member Author

@BlaineEXE BlaineEXE Oct 5, 2021

We are discussing whether there could be a more targeted (and therefore less fragile) approach to detecting whether period update --commit should be issued. periodWillChange() is the heart of this.

Current implementation (ignore some fields)

Currently, the function parses the JSON output from period get and period update (the same output as period get --staging after the period update) and then does a pretty simple diff on them. The diff is configured to ignore id, predecessor_uuid, and realm_epoch, which always differ between period get and period update based on my testing. It also ignores epoch, which does not change in staging but does change upon an actual period update --commit.

I think this strategy is pretty decent, but it is fragile if period update (1) changes different staged values in the future or (2) changes different staged values in situations I didn't think to manually test.

The failure mode of this approach is that we might update the period unnecessarily if RGW output changes or if I have missed a case in the implementation. The CephObjectStore reconcile loop would still succeed, and the config would still be updated, but the RGWs would pause to update their config, briefly impacting throughput on each reconcile.
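As an illustration only (not code from this PR), the ignore-some-fields strategy can be sketched in shell with jq. The ignored field list mirrors the one described above; the canned period JSON and everything else are assumptions:

```shell
#!/usr/bin/env bash
# Hypothetical shell sketch of periodWillChange: compare two period JSON
# documents after deleting the fields that always differ in the staged
# period. Requires jq; the sample JSON is invented for illustration.
set -euo pipefail

current='{"id":"p1","epoch":1,"predecessor_uuid":"p0","realm_epoch":2,"master_zone":"z1"}'
staged='{"id":"p2","epoch":1,"predecessor_uuid":"p1","realm_epoch":3,"master_zone":"z1"}'

period_will_change() {
  # drop the fields that change on every staging, then compare what remains
  local ignore='del(.id, .predecessor_uuid, .realm_epoch, .epoch)'
  local a b
  a="$(jq -S "$ignore" <<<"$1")"
  b="$(jq -S "$ignore" <<<"$2")"
  [ "$a" != "$b" ]
}

if period_will_change "$current" "$staged"; then
  echo "would run: radosgw-admin period update --commit"
else
  echo "no commit needed"
fi
```

With the sample data, only ignored fields differ, so the sketch prints "no commit needed"; changing master_zone in the staged document would flip it to committing.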

Alternative implementation (focus on specific fields)

We could consider an alternative implementation that focuses on specific fields rather than ignores specific fields. For example, we could only focus on changes from period_map, master_zonegroup, master_zone, period_config, realm_id, and realm_name.

This could still be fragile if period update (1) changes different staged values in the future or (2) changes different staged values in situations I didn't think to manually test.

The failure mode of this approach is that the diff would report "no changes" when in reality we should be applying changes. In this case, we would fail to issue period update --commit when we should, and updates won't be applied. If the endpoints were to change, for example, this could prevent a user from being able to access the object store.

Given the failure mode of this implementation, I think the ignore-some-fields approach is the better option.

Alternative implementation (deep inspection)

A totally different alternative could be to deeply inspect the return from the RGW and inspect all the values we care about to validate whether they exist in period get (no --staging) and issue period update --commit if a value doesn't match what we know we have configured. This would require creating a struct type to unmarshal JSON from the RGW into.

There are a few failure modes of this implementation I can think of

  1. It is easier to introduce bugs in the code that analyzes whether changes exist, and we could accidentally fail to find changes or find nonexistent changes for any value
  2. If the RGW output changes in the future, we will fail to unmarshal the JSON, and we will likely fail the CephObjectStore reconcile
  3. [Addendum] If field names change or fields are added in future RGW output, we might drop them and fail to unmarshal them, which could result in not detecting config changes that we should commit.

Failure mode number 2 is in some ways the most preferred because it means that our CI would catch errors that could happen from the RGW output structure changing. But failure mode number 1 means that this might not be a great implementation overall due to the increased susceptibility to developer error.

Alternative implementation (focus on fields with additional validation)

Given that failure mode number 2 above might be more preferred, we could consider a hybrid approach to the focus-on-fields implementation. In this implementation, we would do 2 things:

  1. Focus only on changes within some RGW output fields, AND
  2. Check that the fields we focus on exist in the output (if the fields do not exist, then we fail the reconcile)

The failure mode in this case changes from the basic focus-on-fields option in that we would fail the CephObjectStore reconcile if we determine that any of the fields we want to compare are missing. We would be more likely to catch this in CI; however, it could also totally block users from being able to update CephObjectStores.

I think we could mitigate this last caveat with some careful work. We could report a reconcile failure on the CephObjectStore status but still perform period update --commit for the object store so it's available and updated for users. This would introduce an additional failure mode in that we might commit the period more often than necessary (just like the current implementation), but this would be intentionally chosen over making the CephObjectStore potentially unable to be updated by users for a particular Rook version.
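A rough shell-and-jq sketch of that hybrid idea follows; the field names come from the focus-on-fields list above, but the sample JSON and everything else are assumptions, not Rook code:

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the hybrid approach: before diffing only the focused
# fields, verify they all exist in the period JSON and fail loudly if not.
# Requires jq; sample JSON is invented for illustration.
set -euo pipefail

validate_fields() {
  local f
  for f in period_map master_zonegroup master_zone period_config realm_id realm_name; do
    jq -e "has(\"$f\")" <<<"$1" >/dev/null \
      || { echo "ERROR: field $f missing from period JSON" >&2; return 1; }
  done
}

current='{"period_map":{},"master_zonegroup":"g","master_zone":"z","period_config":{},"realm_id":"r","realm_name":"n"}'

if validate_fields "$current"; then
  echo "all focused fields present; safe to diff them"
else
  echo "would fail the reconcile (or commit anyway and report the error)"
fi
```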

Questions for the implementation

I think some good guiding questions for RGW experts (@cbodley, @alimaredia ?) are...

  1. Do you think I have captured all of the right fields to ignore?
  2. Do you think focusing on fields is better than ignoring fields?
  3. Do you think the JSON data structure output by the RGW is likely to have breaking changes in the future?

Member Author

TBH, I think the current failure mode of possibly applying period update --commit more often than is necessary is the best for users.

Validating RGW output fields in the Rook operator and failing the reconcile if we detect changes is also an option for the ignore-some-fields approach. But failing the reconcile seems too heavy-handed to me. Also, it can't catch the case where the RGW starts reporting staged changes in existing fields.

The best way to catch RGW behavior changes in our CI might be to just focus on testing our expectations in the integration test. If we restart the Rook operator after the CephObjectStore is reconciled the first time, we can expect that Rook should not issue the radosgw-admin period update --commit command the second reconcile. We can search the operator's logs to determine if it issues the command the second time and fail the CI if that happens.
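That CI check could look something like the sketch below. The log line, namespace, and deployment name are assumptions, and the log source is mocked so the snippet is self-contained (in CI it would come from kubectl logs against the operator):

```shell
#!/usr/bin/env bash
# Hypothetical CI check: after restarting the operator and waiting for the
# second reconcile, fail if the operator logged a period commit. The log
# fetch is mocked here; in CI it might be something like:
#   kubectl -n rook-ceph logs deploy/rook-ceph-operator
set -euo pipefail

operator_logs() {
  cat <<'EOF'
op-object: reconciling CephObjectStore "rook-ceph/my-store"
exec: Running command: radosgw-admin period get
EOF
}

if operator_logs | grep -q 'period update --commit'; then
  echo "ERROR: operator committed the RGW period on the second reconcile" >&2
  exit 1
fi
echo "OK: no period commit found in second-reconcile logs"
```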

@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from d7417b2 to 9b484a9 Compare October 5, 2021 21:45
pkg/operator/ceph/object/admin.go Show resolved Hide resolved
pkg/operator/ceph/object/controller_test.go Outdated Show resolved Hide resolved
pkg/operator/ceph/object/controller_test.go Outdated Show resolved Hide resolved
@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch 3 times, most recently from 71737d1 to e2f1be8 Compare October 6, 2021 15:34
Comment on lines -278 to +319
-if ! $FUNCTION "$@"; then
-  echo "Call to $FUNCTION was not successful" >&2
-  exit 1
-fi
+# -e, -E, and -o=pipefail will ensure this script returns a failure if a part of the function fails
+$FUNCTION "$@"
Member Author

I puzzled over this for a long time. The best I can determine, using if <function call> in bash creates a subshell for the function call, which does not inherit the -e context from the parent. As such, the functions called there would not terminate early if they encountered errors, which could result in false positives. Therefore, we can just call the function directly and rely on -e (and -E and -o pipefail) to fail the script if there are errors encountered.

Once I got rid of this I also had to modify wait_for_prepare_pod because it was encountering an error with the values returned in its operation.

Member

It's a bit strange; I would think -E handles the error in the function regardless of whether it's in a subshell or not.

@BlaineEXE
Member Author

Example diff

2021-10-04 19:37:48.423856 D | ceph-object-controller: RGW config period diff:
map[string]interface{}{
  	... // 2 ignored entries
- 	"master_zone":      string(""),
+ 	"master_zone":      string("fc0bacb4-e893-4e15-8e96-93bfa761349d"),
- 	"master_zonegroup": string(""),
+ 	"master_zonegroup": string("f9ea81f0-5f30-4b98-ab33-96ed35c9744e"),
  	"period_config":    map[string]interface{}{"bucket_quota": map[string]interface{}{"check_on_raw": bool(false), "enabled": bool(false), "max_objects": float64(-1), "max_size": float64(-1), ...}, "user_quota": map[string]interface{}{"check_on_raw": bool(false), "enabled": bool(false), "max_objects": float64(-1), "max_size": float64(-1), ...}},
  	"period_map": map[string]interface{}{
  		"id":             string("40bd955e-b9fe-4832-b8ef-dac914491c2e"),
- 		"short_zone_ids": []interface{}{},
+ 		"short_zone_ids": []interface{}{
+ 			map[string]interface{}{
+ 				"key": string("fc0bacb4-e893-4e15-8e96-93bfa761349d"),
+ 				"val": float64(4.054075095e+09),
+ 			},
+ 		},
- 		"zonegroups": []interface{}{},
+ 		"zonegroups": []interface{}{
+ 			map[string]interface{}{
+ 				"api_name":            string("my-store"),
+ 				"default_placement":   string("default-placement"),
+ 				"endpoints":           []interface{}{string("http://10.111.231.155:80")},
+ 				"hostnames":           []interface{}{},
+ 				"hostnames_s3website": []interface{}{},
+ 				"id":                  string("f9ea81f0-5f30-4b98-ab33-96ed35c9744e"),
+ 				"is_master":           string("true"),
+ 				"master_zone":         string("fc0bacb4-e893-4e15-8e96-93bfa761349d"),
+ 				"name":                string("my-store"),
+ 				"placement_targets": []interface{}{
+ 					map[string]interface{}{
+ 						"name":            string("default-placement"),
+ 						"storage_classes": []interface{}{...},
+ 						"tags":            []interface{}{},
+ 					},
+ 				},
+ 				"realm_id":    string("9a5c1fb0-3896-4f1f-bcfe-0d3bcbe6ceec"),
+ 				"sync_policy": map[string]interface{}{"groups": []interface{}{}},
+ 				"zones": []interface{}{
+ 					map[string]interface{}{
+ 						"bucket_index_max_shards": float64(11),
+ 						"endpoints":               []interface{}{...},
+ 						"id":                      string("fc0bacb4-e893-4e15-8e96-93bfa761"...),
+ 						"log_data":                string("false"),
+ 						...
+ 					},
+ 				},
+ 			},
+ 		},
  	},
  	... // 2 ignored and 3 identical entries
  }

There are 4 ignored entries which correspond to the 4 ignored fields: id, predecessor_uuid, realm_epoch, and epoch.

There are only 4 top-level entries that haven't changed here:

  1. period_config (though this could probably change with quota updates)
  2. realm_name (this might change if the config is updated to use a different realm for multisite)
  3. realm_id (ditto above)
  4. sync_status (I'm not sure what this is or when it might change)

Analysis

  • The whole structure here changes outside of the 4 unchanged entries above
  • 3 of those 4 entries might reasonably be expected to change

Thus, the reason I diff the whole structure rather than focusing on specific types is because the majority of the structure does or might reasonably change based on config updates.

@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch 3 times, most recently from 6c00276 to c2a59c6 Compare October 6, 2021 17:59
@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch 5 times, most recently from 384d464 to a61be55 Compare October 6, 2021 23:53
if returns.periodCommitError {
	return "", errors.New("fake period update --commit error")
}
return "", nil // success
Contributor

@thotz thotz Oct 7, 2021

Nit: Does radosgw-admin period update --commit return the period JSON output as well?

Member Author

It does return JSON, but Rook doesn't use the result, so I configured it to not parse the JSON. We only use the error condition to determine success/failure. This test would fail with a JSON parsing issue if JSON parsing were enabled for that function call.

Contributor

@thotz thotz left a comment

Overall LGTM, but since there are changes to the multisite part, I request @alimaredia to take a look as well

@BlaineEXE BlaineEXE mentioned this pull request Oct 7, 2021
10 tasks
@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from a61be55 to 4364cd7 Compare October 7, 2021 20:53
@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from 4364cd7 to 04e3d7b Compare October 7, 2021 22:26
assert.NoError(t, err)
assert.False(t, res.Requeue)
assert.True(t, calledCommitConfigChanges)
err = r.client.Get(context.TODO(), req.NamespacedName, objectStore)
Member

Suggested change
-err = r.client.Get(context.TODO(), req.NamespacedName, objectStore)
+err = r.client.Get(ctx, req.NamespacedName, objectStore)

return &dependents.DependentList{}, nil
}

err = r.client.Get(context.TODO(), req.NamespacedName, objectStore)
Member

Suggested change
-err = r.client.Get(context.TODO(), req.NamespacedName, objectStore)
+err = r.client.Get(ctx, req.NamespacedName, objectStore)

@BlaineEXE
Member Author

It's a bit strange; I would think -E handles the error in the function regardless of whether it's in a subshell or not.

I thought so, too, but it is apparently more nuanced.

Docs:

-E If set, the ERR trap is inherited by shell functions.

In Bash 4.4+ (pretty new) there is the option shopt -s inherit_errexit. It's possible that this was the default behavior before the option was added, but I haven't found a lot of clarity there. It may also be enabled by default if Bash is run in POSIX-compliant mode. Again, kind of unclear...

I've been playing with this some more, and it's still not working, even with inherit_errexit. I think it may be that if disables set -e, so that the return value of a command in the if condition won't fail the script.
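A minimal repro of that if behavior (not code from this repo) shows errexit being suspended for the whole function body when the call is part of an if condition:

```shell
#!/usr/bin/env bash
set -eE -o pipefail

fails_midway() {
  false   # a direct call under set -e would stop the script here
  echo "still running after the failure"
}

# Because the call is part of an `if` condition, errexit is suspended for the
# function's entire body: `false` is ignored, and the function's exit status
# is that of the final `echo`, i.e. success.
if ! fails_midway; then
  echo "detected failure"
else
  echo "reported success despite the internal failure"
fi
```

Calling fails_midway directly at the top level of the script makes set -e abort at the false, which matches the switch to direct $FUNCTION "$@" invocation in this PR.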

@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from 04e3d7b to ba89ad7 Compare October 8, 2021 17:16
@BlaineEXE
Member Author

The canary test's prepare pod is failing with:

[2021-10-08 17:26:01,066][ceph_volume.process][INFO  ] Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 07682e11-05b4-4245-b944-ca378411fb20 --setuser ceph --setgroup ceph
[2021-10-08 17:26:01,690][ceph_volume.process][INFO  ] stderr 2021-10-08T17:26:01.118+0000 7f42c0677080 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
[2021-10-08 17:26:01,690][ceph_volume.process][INFO  ] stderr 2021-10-08T17:26:01.662+0000 7f42c0677080 -1 bluestore(/var/lib/ceph/osd/ceph-0//block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-0//block: (13) Permission denied
[2021-10-08 17:26:01,690][ceph_volume.process][INFO  ] stderr 2021-10-08T17:26:01.662+0000 7f42c0677080 -1 bdev(0x555abd7a7400 /var/lib/ceph/osd/ceph-0//block) open open got: (13) Permission denied
[2021-10-08 17:26:01,690][ceph_volume.process][INFO  ] stderr 2021-10-08T17:26:01.662+0000 7f42c0677080 -1 OSD::mkfs: ObjectStore::mkfs failed with error (13) Permission denied

pvc-db seems like maybe it isn't waiting long enough for the prepare pod to be running.

Ditto for encryption-pvc-db and lvm-pvc...

@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from ba89ad7 to a8e5742 Compare October 8, 2021 18:10
Replace calls to 'radosgw-admin period update --commit' with an
idempotent function.

Resolves rook#8879

Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from a8e5742 to 5561311 Compare October 11, 2021 20:45
Add to the RGW multisite integration test a verification that the RGW
period is committed on the first reconcile and not committed on the
second reconcile.

Do this in the multisite test so that we verify that this works for
both the primary and secondary multi-site cluster.

To add this test, the github-action-helper.sh script had to be modified
to
1. actually deploy the version of Rook under test
2. adjust how functions are called to not lose the `-e` in a subshell
3. fix wait_for_prepare_pod helper that had a failure in the middle
   of its operation that didn't cause failures in the past

Signed-off-by: Blaine Gardner <blaine.gardner@redhat.com>
@BlaineEXE BlaineEXE force-pushed the rgw-commands-use-staging-flag branch from 5561311 to 9564308 Compare October 11, 2021 21:25
@BlaineEXE
Member Author

Some runs of rgw-multisite-testing are failing because of #8954

@BlaineEXE BlaineEXE merged commit a1dd256 into rook:master Oct 12, 2021
@BlaineEXE BlaineEXE deleted the rgw-commands-use-staging-flag branch October 12, 2021 14:22
travisn added a commit that referenced this pull request Oct 12, 2021
rgw: replace period update --commit with function (backport #8911)

Successfully merging this pull request may close these issues.

rgw: multisite commands are used idempotently in error
4 participants