-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
v1.13 backports 2023-04-26 #25137
v1.13 backports 2023-04-26 #25137
Conversation
[ upstream commit 5163094 ] Previously, kernel modules names were read from the `/proc/modules` only which prevented k8s linux distribution with kernel modules builtin into the kernel to pass the validation of required kernel modules by Cilium. This patch adds support for systems with builtin modules by trying to read the kernel module names from following files additionally: - `/lib/modules/<kernel-version>/modules.builtin` - `/usr/lib/modules/<kernel-version>/modules.builtin` - `/usr/lib/debug/lib/modules/<kernel-version>/modules.builtin` Fixes: cilium#23863 Signed-off-by: Mateusz Paluszkiewicz (TheAifam5) <theaifam5@gmail.com> Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit e84e99f ] The ginkgo tests are currently executed in a different order for each run in our CI. Unfortunately, the order in which we run the tests often matters and can impact the test results. That's because there are a lot of side effects (leftover Linux state, leftover Cilium conntrack entries, leftover Kubernetes state, etc.) that are hard to control for. This undeterminism facilities the introduction of flakes: a first pull request introduces a flake, but it doesn't show up until a later pull request runs the tests in a particular order. The random order also makes flake a bit order to reproduce locally, though that can easily be worked around by passing the proper seed to ginkgo when reproducing. One argument in favor of this randomness is that it surfaces those side effects and allows us to identify and clean them. In practice, that doesn't work great: - Being side effects, they are often hard to identify and are a huge time cost for contributors to debug. Our ever increasing list of flake issues is a good testimony of that. - There are not always easy to clean up [1]. - There are often not representative of bugs likely to affect users. Our users rarely restart Cilium 50 times in the span of an hour with different configurations. In the end, our tests were not meant to uncover those side effects and they are therefore inadequate at exposing them properly. If we want to identify those side effects, then we should have tests specifically for that. Other, non-ginkgo CI jobs run their tests in a fixed order [2] or on completely independent clusters [3]. This commit therefore fixes the order of the ginkgo tests. To that end, a ginkgo seed was selected using $SRANDOM and will be the same for all subsequent runs. 1 - cilium#17459 2 - cilium/cilium-cli#558 3 - https://github.com/cilium/cilium/blob/main/.github/workflows/conformance-datapath.yaml Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 922aa1b ] When using metallb as BGP speaker, if IPv6 advertisement is made - metallb will return error as unsupported. This error is logged and error is returned to control loop, which continues retrying causing log flooding and high CPU. This change filters out IPv6 prefixes before sending them to metallb library and logs one time error message. Signed-off-by: harsimran pabla <hpabla@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit e7c748f ] The leader election logic of the operator is currently executed in a separate goroutine started by an OnStart hook, and relies on a context (canceled by the respective OnStop hook) for shutting down. Once this occurs, it will automatically release the lease, to allow the immediate election of a new leader. Yet, we do not explicitly wait for this operation to complete, and the operator might stop before the lease is actually released, introducing a ~15 seconds delay for the next election. This commit fixes this by introducing a WaitGroup to explicitly wait for the leader election loop to terminate before continuing with the shutdown process. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit d2301cb ] When running the host firewall over IPv6 NodePort traffic, it would enforce policies on the SYN+ACK packet with the SNATed IP address. Obviously that didn't work great. This happened because the IPv6 logic would simply lookup the source security identity in the ipcache using the source IP address, without considering that it might be SNATed. In case it is SNATed, the source IP is always a host IP and the identity always HOST_ID. Thus, the host firewall starts incorrectly enforcing policies and the packet. The IPv4 path, on the other hand, does this source identity lookup right: it only assumes a packet is coming from the local host if it has the HOST magic mark (which we set via netfilter). SNATed packets don't have this mark and therefore end up with an UNKNOWN identity. This commit fixes the IPv6 logic to match the IPv4 logic. This bug has always existed in the IPv6 code path. For IPv4, it was fixed in commit cfd8fbf ("bpf: Fix srcID resolution on to-netdev in case of SNAT"). I don't remember why I seemed to think the IPv6 code path didn't need the same fix :( We already have an end-to-end test to cover this code path, but it is currently quarantined. Other changes are required before we can unquarantine it, so I'll send those in a separate patchset. Fixes: 489dbef ("bpf: Enforce host policies for IPv6") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 16e4b68 ] Update documentation related to configuration on AWS. Certain IAM Policies must be in place if --clustername and --eni-gc-tags are not set. Incomplete docs got introduced after e66ed7f Signed-off-by: Tore S. Loenoey <tore.lonoy@gmail.com> Signed-off-by: Tam Mach <tam.mach@cilium.io>
/test-backport-1.13 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PRs look good. Thanks!
[ upstream commit 46de5bc ] These are remnants of a past before GHPRB. At best they create uncessary noise in the logs, at worst they can interfere with the default behaviour, so let's just remove them. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io>
/test-backport-1.13 Job 'Cilium-PR-K8s-1.24-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.24-kernel-4.19/22/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. Job 'Cilium-PR-K8s-1.17-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.17-kernel-4.19/22/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. Job 'Cilium-PR-K8s-1.19-kernel-4.19' failed: Click to show.Test Name
Failure Output
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.19-kernel-4.19/21/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My PR looks good. Thanks!
/test-1.16-4.19 |
/test-1.17-4.19 |
/test-1.18-4.19 |
/test-1.19-4.19 |
/test-1.20-4.19 |
/test-1.21-4.19 |
/test-1.22-4.19 |
/test-1.23-4.19 |
/test-1.24-4.19 |
/test-runtime |
/test-1.21-4.19 |
/test-1.17-4.19 |
/test-1.18-4.19 |
/test-1.19-4.19 |
/test-1.24-4.19 |
Re-trigger a couple of jobs due to a mixed of Github and provision issues. |
i trust mlh |
STATUS
commands from upstream tests' Jenkinsfile #25046 -- ci: remove STATUS commands from upstream tests' JenkinsfileOnce this PR is merged, you can update the PR labels via: