MCO-1152: MCO-1146: Add e2e tests for NodeDisruptionPolicy #4365

Open: wants to merge 3 commits into base: master

Conversation

djoshy
Contributor

@djoshy djoshy commented May 14, 2024

This PR adds:

@openshift-ci-robot
Contributor

openshift-ci-robot commented May 14, 2024

@djoshy: This pull request references MCO-1152 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

- What I did

- How to verify it

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 14, 2024
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 14, 2024
Contributor

openshift-ci bot commented May 14, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

Contributor

openshift-ci bot commented May 14, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: djoshy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 14, 2024
@djoshy
Contributor Author

djoshy commented May 14, 2024

/test e2e-gcp-op-techpreview

@djoshy
Contributor Author

djoshy commented May 15, 2024

/test e2e-gcp-op-techpreview

@djoshy
Contributor Author

djoshy commented May 15, 2024

/test e2e-gcp-op-techpreview

@djoshy djoshy marked this pull request as ready for review May 15, 2024 17:24
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 15, 2024
@openshift-ci-robot
Contributor

openshift-ci-robot commented May 15, 2024

@djoshy: This pull request references MCO-1152 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

This PR:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Contributor

openshift-ci-robot commented May 15, 2024

@djoshy: This pull request references MCO-1152 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

In response to this:

This PR adds:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 19, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 20, 2024
@openshift-ci-robot
Contributor

openshift-ci-robot commented May 20, 2024

@djoshy: This pull request references MCO-1152 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

This PR adds:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@djoshy djoshy changed the title from "MCO-1152: Add e2e tests for NodeDisruptionPolicy" to "MCO-1152: MCO-1146: Add e2e tests for NodeDisruptionPolicy" May 20, 2024
@djoshy
Contributor Author

djoshy commented May 21, 2024

/retest-required

@djoshy
Contributor Author

djoshy commented May 21, 2024

/test unit

@djoshy
Contributor Author

djoshy commented May 23, 2024

/retest-required

@djoshy
Contributor Author

djoshy commented May 23, 2024

/test e2e-gcp-op-techpreview

@djoshy
Contributor Author

djoshy commented May 30, 2024

/test all

Member

@cheesesashimi cheesesashimi left a comment

Overall this looks great! I just have a few small suggestions.

@@ -388,7 +388,7 @@ func assertNodeAndMCPIsDegraded(t *testing.T, cs *framework.ClientSet, node core
 	mcdPod, err := helpers.MCDForNode(cs, &node)
 	require.Nil(t, err)

-	assertLogsContain(t, cs, mcdPod, &node, logEntry)
+	helpers.AssertMCDLogsContain(t, cs, mcdPod, &node, logEntry)
Member

praise: Nice!

var ac *mcoac.NodeDisruptionPolicySpecFileApplyConfiguration
fileName := "/etc/test-" + string(action.Type)
if action.Type == opv1.ReloadSpecAction {
	ac = mcoac.NodeDisruptionPolicySpecFile().WithPath(fileName).WithActions(mcoac.NodeDisruptionPolicySpecAction().WithType(action.Type).WithReload(mcoac.ReloadService().WithServiceName(action.Reload.ServiceName)))
Member

suggestion (non-blocking): For better readability, split this up a bit:

reload := mcoac.ReloadService().WithServiceName(action.Reload.ServiceName)
actions := mcoac.NodeDisruptionPolicySpecAction().WithType(action.Type).WithReload(reload)
ac = mcoac.NodeDisruptionPolicySpecFile().WithPath(fileName).WithActions(actions)

If this API won't let you do that or you get bizarre errors by doing that, you can break this across multiple lines like this instead:

ac = mcoac.NodeDisruptionPolicySpecFile().WithPath(fileName).WithActions(
	mcoac.NodeDisruptionPolicySpecAction().WithType(action.Type).WithReload(
		mcoac.ReloadService().WithServiceName(action.Reload.ServiceName)))

Contributor Author

Will attempt to clean this up, thanks (:

Contributor Author

@djoshy djoshy May 31, 2024

I was able to break this down as suggested, and create a new helper - making it overall a lot cleaner and re-usable. Thanks so much for this suggestion!

test/e2e-techpreview/nodedisrupt_test.go (outdated, resolved)
test/e2e-techpreview/nodedisrupt_test.go (outdated, resolved)
test/helpers/utils.go (resolved)
}
}

func GetFunctionName(i interface{}) string {
Member

praise: Cool!

// Ensure status.ObservedGeneration matches the last generation of MachineConfiguration
if mcop.Generation != mcop.Status.ObservedGeneration {
	klog.Errorf("calculating NodeDisruptionPolicies: NodeDisruptionPolicyStatus is not up to date.")
	err = fmt.Errorf("NodeDisruptionPolicyStatus is not up to date")
Member

suggestion: Should this error be returned? If not, add a comment explaining why.

Member

thought: I just took a look at why this was written like this. Basically, we use the variable err both for the value returned by wait.PollUntilContextTimeout() and for the errors raised inside the closure that wait.PollUntilContextTimeout() executes. It's a bit confusing as to whether we should stop polling whenever we encounter an error or keep going and deal with the error later. If we want to keep going and deal with the error later, using the Aggregate type coupled with an appropriate deduplication function could help readability.

(To be clear: I am not asking you to change this as part of this PR. It's just a thought I'm putting here for posterity.)

Contributor Author

Sorry about the structure being confusing! My thought process to use it was the following:

  • If we encounter an error, we don't want to keep going in the polling function. We want to try again after an interval of time.
  • If we time out on the polling function, we want the last error we encountered to be reported. Hence the use of the same variable inside and outside of the function.

I hope that clears up why I went with the last error vs aggregation. Any of the errors within the polling loop are fatal, with the earlier ones being slightly more fatal than the ones following.

Member

Ahhh that makes sense, thanks for the clarification!

{Type: opv1.RestartSpecAction, Restart: &opv1.RestartService{ServiceName: "crio.service"}},
{Type: opv1.ReloadSpecAction, Reload: &opv1.ReloadService{ServiceName: "crio.service"}}}

// Shuffle the three action sets so each testFunc is randomly assigned one of the above action sets.
Member

praise: Nice! I wish we could run our e2e suite with -shuffle=on. I think we have e2e tests that need to execute in a certain order, but I'm unsure how that flag affects subtests like this.

t.Run(helpers.GetFunctionName(testFunc), func(t *testing.T) {
	// Only parallelize if there are enough nodes to run the tests individually
	if len(nodes) >= len(testFuncs) {
		t.Parallel()
Member

praise: Nice! I want to be able to parallelize more in our e2e suite.
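
A self-contained sketch of the conditional parallelization shown in the snippet above. The names namedTest, canParallelize, and runAll are hypothetical, and the way nodes are handed to subtests (one node each when parallel, the first node otherwise) is an assumption made for illustration, not taken from the PR.

```go
package main

import (
	"fmt"
	"testing"
)

// namedTest pairs a subtest name with a test body that targets one node.
type namedTest struct {
	name string
	fn   func(t *testing.T, node string)
}

// canParallelize reports whether every test can be given its own node.
func canParallelize(numNodes, numTests int) bool {
	return numNodes >= numTests
}

// runAll runs each test as a subtest. Subtests run in parallel only when
// there are enough nodes for each test to have a dedicated one.
func runAll(t *testing.T, nodes []string, tests []namedTest) {
	if len(nodes) == 0 {
		t.Skip("no nodes available")
	}
	parallel := canParallelize(len(nodes), len(tests))
	for i, tc := range tests {
		i, tc := i, tc // per-iteration copies, needed before Go 1.22
		t.Run(tc.name, func(t *testing.T) {
			node := nodes[0]
			if parallel {
				t.Parallel()
				node = nodes[i]
			}
			tc.fn(t, node)
		})
	}
}

func main() {
	fmt.Println(canParallelize(3, 3), canParallelize(2, 3))
}
```

Falling back to serial execution avoids two disruptive tests acting on the same node at the same time.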

@djoshy
Contributor Author

djoshy commented Jun 3, 2024

/retest-required

Contributor

openshift-ci bot commented Jun 3, 2024

@djoshy: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-azure-ovn-upgrade-out-of-change | 33b1a8f | link | false | /test e2e-azure-ovn-upgrade-out-of-change |
| ci/prow/e2e-hypershift | 33b1a8f | link | true | /test e2e-hypershift |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
4 participants