-
Notifications
You must be signed in to change notification settings - Fork 392
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MCO-1152: MCO-1146: Add e2e tests for NodeDisruptionPolicy #4365
base: master
Are you sure you want to change the base?
Conversation
@djoshy: This pull request references MCO-1152 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: djoshy The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test e2e-gcp-op-techpreview |
/test e2e-gcp-op-techpreview |
/test e2e-gcp-op-techpreview |
@djoshy: This pull request references MCO-1152 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@djoshy: This pull request references MCO-1152 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@djoshy: This pull request references MCO-1152 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@djoshy: This pull request references MCO-1152 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
/retest-required |
/test unit |
/retest-required |
/test e2e-gcp-op-techpreview |
/test all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks great! I just have a few small suggestions.
@@ -388,7 +388,7 @@ func assertNodeAndMCPIsDegraded(t *testing.T, cs *framework.ClientSet, node core | |||
mcdPod, err := helpers.MCDForNode(cs, &node) | |||
require.Nil(t, err) | |||
|
|||
assertLogsContain(t, cs, mcdPod, &node, logEntry) | |||
helpers.AssertMCDLogsContain(t, cs, mcdPod, &node, logEntry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Nice!
var ac *mcoac.NodeDisruptionPolicySpecFileApplyConfiguration | ||
fileName := "/etc/test-" + string(action.Type) | ||
if action.Type == opv1.ReloadSpecAction { | ||
ac = mcoac.NodeDisruptionPolicySpecFile().WithPath(fileName).WithActions(mcoac.NodeDisruptionPolicySpecAction().WithType(action.Type).WithReload(mcoac.ReloadService().WithServiceName(action.Reload.ServiceName))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (non-blocking): For better readability, split this up a bit:
reload := mcoac.ReloadService().WithServiceName(action.Reload.ServiceName)
actions := mcoac.NodeDisruptionPolicySpecAction().WithType(action.Type).WithReload(reload)
ac = mcoac.NodeDisruptionPolicySpecFile().WithPath(fileName).WithActions(actions)
If this API won't let you do that or you get bizarre errors by doing that, you can break this across multiple lines like this instead:
ac = mcoac.NodeDisruptionPolicySpecFile().WithPath(fileName).WithActions(
mcoac.NodeDisruptionPolicySpecAction().WithType(action.Type).WithReload(
mcoac.ReloadService().WithServiceName(action.Reload.ServiceName)))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will attempt to clean this up, thanks (:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was able to break this down as suggested, and create a new helper - making it overall a lot cleaner and re-usable. Thanks so much for this suggestion!
} | ||
} | ||
|
||
func GetFunctionName(i interface{}) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Cool!
// Ensure status.ObservedGeneration matches the last generation of MachineConfiguration | ||
if mcop.Generation != mcop.Status.ObservedGeneration { | ||
klog.Errorf("calculating NodeDisruptionPolicies: NodeDisruptionPolicyStatus is not up to date.") | ||
err = fmt.Errorf("NodeDisruptionPolicyStatus is not up to date") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Should this error be returned? If not, add a comment explaining why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: I just took a look at why this was written like this. Basically, we use the variable err
for both the value returned by wait.PollUntilContextTimeout()
in addition to what is inside the closure that wait.PollUntilContextTimeout()
executes. Its a bit confusing as to whether we should stop polling whenever we encounter an error or keep going and deal with it error later. If we want to keep going and deal with the error later, using the Aggregate type coupled with an appropriate deduplication function could help readability.
(To be clear: I am not asking you to change this as part of this PR. It's just a thought I'm putting here for posterity.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about the structure being confusing! My thought process to use it was the following:
- If we encounter an error, we don't want to keep going in the polling function. We want to try again after an interval of time.
- If we timeout on the polling function, we want the last error we encountered to be reported. Hence the use of the same variable inside and outside of the function.
I hope that clears up why I went with the last error vs aggregation. Any of the errors within the polling loop are fatal, with the earlier ones being slightly more fatal than the ones following.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh that makes sense, thanks for the clarification!
{Type: opv1.RestartSpecAction, Restart: &opv1.RestartService{ServiceName: "crio.service"}}, | ||
{Type: opv1.ReloadSpecAction, Reload: &opv1.ReloadService{ServiceName: "crio.service"}}} | ||
|
||
// Shuffle the three action sets so each testFunc is randomly assigned one of the above action sets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Nice! I wish we could run our e2e suite with -shuffle=on
. I think we have e2e tests that need to execute in a certain order, but I'm unsure how that flag affects subtests like this.
t.Run(helpers.GetFunctionName(testFunc), func(t *testing.T) { | ||
// Only parallelize if there are enough nodes to run the tests individually | ||
if len(nodes) >= len(testFuncs) { | ||
t.Parallel() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: Nice! I want to be able to parallelize more in our e2e suite.
/retest-required |
2 similar comments
/retest-required |
/retest-required |
@djoshy: The following tests failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This PR adds: