Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Fix Suite-runtime.RuntimeDatapathPrivilegedUnitTests #23297

Conversation

dylandreimerink
Copy link
Member

@dylandreimerink dylandreimerink commented Jan 24, 2023

This fixes both the masking issue and the underlying issue, please see commit descriptions for details.

Fixes: #23272

Fixed flake in the `TestRequestIPWithMismatchedLabel` LB-IPAM tests.

@dylandreimerink dylandreimerink added area/CI Continuous Integration testing issue or flake area/operator Impacts the cilium-operator component release-note/misc This PR makes changes that have no direct user impact. sig/ipam IP address management, including cloud IPAM ci/flake This is a known failure that occurs in the tree. Please investigate me! needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Jan 24, 2023
@dylandreimerink
Copy link
Member Author

/test

@dylandreimerink dylandreimerink force-pushed the feature/fix-lbipam-mismatched-label-test branch 2 times, most recently from 794a37f to db4b4a7 Compare January 27, 2023 09:47
@dylandreimerink
Copy link
Member Author

/test

@dylandreimerink dylandreimerink marked this pull request as ready for review January 27, 2023 14:19
@dylandreimerink dylandreimerink requested a review from a team as a code owner January 27, 2023 14:19
@dylandreimerink
Copy link
Member Author

Can't get the tests better than this. The smoke tests error on the hubble relay container: error: timed out waiting for the condition on pods/hubble-relay-6c9f4d49f-fgw5x which this PR doesn't touch. Otherwise everything is green and the PR is reviewed. So marking ready-to-merge

@dylandreimerink dylandreimerink added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Feb 2, 2023
@pchaigno pchaigno added release-note/ci This PR makes changes to the CI. and removed release-note/misc This PR makes changes that have no direct user impact. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. labels Feb 2, 2023
@pchaigno
Copy link
Member

pchaigno commented Feb 2, 2023

The smoke tests error on the hubble relay container: error: timed out waiting for the condition on pods/hubble-relay-6c9f4d49f-fgw5x which this PR doesn't touch.

Do we have an issue for that?

I also see that the k8s-1.16-kernel-4.9 didn't run for some reason.

@dylandreimerink
Copy link
Member Author

Do we have an issue for that?

We do now #23533

I also see that the k8s-1.16-kernel-4.9 didn't run for some reason.

Oh, missed that. Perhaps we can re-trigger it, I will try

@dylandreimerink
Copy link
Member Author

dylandreimerink commented Feb 2, 2023

/test-1.16-4.9

Job 'Cilium-PR-K8s-1.16-kernel-4.9' has 3 failures but they might be new flakes since it also hit 1 known flakes: #22056 (91.04)

@dylandreimerink
Copy link
Member Author

/test

@dylandreimerink
Copy link
Member Author

/test

@dylandreimerink
Copy link
Member Author

/test

We need to replace the t.Fatal calls with t.Error when within the
await callbacks. These callbacks are called by the fake client but from
the goroutine of the LBIPAM controller. This means that we panic in
the controller causing us to not call `.Done` on resource events.

Not calling `.Done` on resource events triggers the finalizer based
checks to enforce `.Done` is always called. This masks the original
error.

By using t.Error we do mark the test as failed but only ever panic in
the test suite itself.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
This unit test was incorrect. It assumed that services with requested
IPs whom didn't match the pool selector would be ignored. But this
behavior was changed at some point during development so we instead
set a status condition on the service. However the test wasn't updated.

The test still returned PASS because it is time based and terminated
after 100ms, but 7999/8000 times LB-IPAM takes longer than 100ms to
send the update. This flake occurs only when we send the update in
<100ms after start.

The test has been updated to reflect the correct behavior, which
surfaced a bug that was fixed as well.

Fixes: cilium#23272

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
@dylandreimerink
Copy link
Member Author

/test

@dylandreimerink
Copy link
Member Author

I finally got CI to agree with me on this one. We are still hitting #24217 but that is it. All reviews are in. CI green except for the flak on ConformanceGatewayAPI. Marking as ready-to-merge

@dylandreimerink dylandreimerink added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Mar 21, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from master in 1.13.2 Mar 21, 2023
@tklauser tklauser merged commit db1a061 into cilium:master Mar 22, 2023
@NikAleksandrov NikAleksandrov mentioned this pull request Mar 23, 2023
29 tasks
@NikAleksandrov NikAleksandrov added backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. and removed needs-backport/1.13 This PR / issue needs backporting to the v1.13 branch labels Mar 23, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from master to Backport pending to v1.13 in 1.13.2 Mar 23, 2023
@jibi jibi added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. and removed backport-pending/1.13 The backport for Cilium 1.13.x for this PR is in progress. labels Apr 7, 2023
@gentoo-root gentoo-root moved this from Backport pending to v1.13 to Backport done to v1.13 in 1.13.2 Apr 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/CI Continuous Integration testing issue or flake area/operator Impacts the cilium-operator component backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. ci/flake This is a known failure that occurs in the tree. Please investigate me! ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/ci This PR makes changes to the CI. sig/ipam IP address management, including cloud IPAM
Projects
No open projects
1.13.2
Backport done to v1.13
7 participants