Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.13 backports 2023-05-12 #25424

Merged
merged 8 commits into from
May 16, 2023

Conversation

gentoo-root
Copy link
Contributor

Once this PR is merged, you can update the PR labels via:

$ for pr in 25112; do contrib/backporting/set-labels.py $pr done 1.13; done

@gentoo-root gentoo-root requested a review from a team as a code owner May 12, 2023 14:59
@gentoo-root gentoo-root added kind/backports This PR provides functionality previously merged into master. backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. labels May 12, 2023
@gentoo-root
Copy link
Contributor Author

Depends on #25346. After #25346 is merged, the first three commits should be dropped.

@julianwiedmann
Copy link
Member

/test-backport-1.13

@thorn3r thorn3r mentioned this pull request May 15, 2023
43 tasks
[ upstream commit be111c7 ]

Rename ct_lb_lookup4 to ct_lazy_lookup4, because it's going to be used
not only in LB flows, but now also in NAT. Add the action parameter to
cover ICMP packets in NAT flows.

The actual usage comes in the next commit, this one doesn't contain
functional changes.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit 3f0c2b7 ]

The tuple prepared by snat_v4_nat or snat_v4_rev_nat is good enough for
ct_lazy_lookup4, so we can spare a call to ct_extract_ports4, reducing
complexity, improving performance, but most importantly, allowing to
pass ports that differ from the packet header, which will allow to fix a
bug in a subsequent commit.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit 7cb16c8 ]

Rename ct_lb_lookup6 to ct_lazy_lookup6, because it's going to be used
not only in LB flows, but now also in NAT. Add the action parameter to
cover ICMP packets in NAT flows.

The actual usage comes in the next commit, this one doesn't contain
functional changes.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit ccf7965 ]

The tuple prepared by snat_v6_nat or snat_v6_rev_nat is good enough for
ct_lazy_lookup6, so we can spare a call to ct_extract_ports6, reducing
complexity, improving performance, but most importantly, allowing to
pass ports that differ from the packet header, which will allow to fix a
bug in a subsequent commit.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit 83d2621 ]

This flag will be reused for more cases that need connection tracking.
Currently it's only used for host-local connections, see commit
0e785f9 ("bpf: initial bpf-based masquerading support") for
details.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit 6bbbb2a ]

Flag the egress gateway NAT entries as needs_ct, so that reply packets
could extend the expiration timeout as well.

The next commit is also required for the complete fix, otherwise the
needs_ct mark won't be matched.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit 4e27e55 ]

Fix a bug in connection tracking in the revSNAT flow: reply packets are
tracked before revSNAT restores the original tuple, meaning that it
won't match the existing CT entry, and a new one will be created.

This bug has impact on connections that set needs_ct (host-local and
egress gateway connections): packets coming in the reply direction won't
get the needs_ct mark and won't extend the expiration timeout.

The fix is to convert the tuple before doing the conntrack lookup, so
that it matches the tuple after revSNAT.

Fixes: cilium#25110
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
[ upstream commit 86cddf6 ]

At the moment, both snat_v{4,6}_track_connection and
snat_v{4,6}_new_mapping check the same conditions to determine whether a
corresponding CT entry should exist for an SNAT entry. This approach has
the following drawbacks:

1. Duplicate code: risk of divergence and a minor performance penalty,
   because this code is actually executed twice for the first packet.

2. The checks in snat_v{4,6}_track_connection are not optimal, because
   this function is called from both egress and ingress flows, but the
   checks are only relevant for the egress flow (because the first
   packet is egress, and the subsequent packets will have the result of
   the checks cached in the SNAT state).

Solve both drawbacks by moving the checks one layer up, to
snat_v{4,6}_nat_handle_mapping (also put the checks to a separate
function for clarity). This way, the checks are only made in the egress
flow when the SNAT state doesn't exist yet, and the result is reused for
both snat_v{4,6}_track_connection and snat_v{4,6}_new_mapping, making
the code more robust and slightly more optimized.

Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
Suggested-by: Julian Wiedmann <jwi@isovalent.com>
Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com>
@thorn3r thorn3r force-pushed the pr/v1.13-backport-2023-05-12 branch from 1a743cf to bf63427 Compare May 16, 2023 01:20
@julianwiedmann
Copy link
Member

julianwiedmann commented May 16, 2023

/test-backport-1.13

Job 'Cilium-PR-K8s-1.24-kernel-4.19' hit: #25255 (93.81% similarity)

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathServicesTest Checks N/S loadbalancing With host policy Tests NodePort

Failure Output

FAIL: Can not connect to service "tftp://192.168.56.12:32760/hello" from outside cluster (1/10)

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/2314/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

Then please upload the Jenkins artifacts to that issue.

Job 'Cilium-PR-K8s-1.25-kernel-4.19' hit: #25255 (89.37% similarity)

Job 'Cilium-PR-K8s-1.16-kernel-4.19' failed:

Click to show.

Test Name

K8sDatapathConfig Host firewall With VXLAN

Failure Output

FAIL: Failed to reach 192.168.56.11:80 from testserver-host-q79ff

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.16-kernel-4.19/1499/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.16-kernel-4.19 so I can create one.

Then please upload the Jenkins artifacts to that issue.

@julianwiedmann
Copy link
Member

/test-1.16-4.19

@julianwiedmann
Copy link
Member

/test-1.17-4.19

@julianwiedmann
Copy link
Member

/test-1.19-4.19

@julianwiedmann
Copy link
Member

/test-1.22-4.19

@julianwiedmann
Copy link
Member

/test-1.23-4.19

@julianwiedmann
Copy link
Member

/test-1.24-4.19

@julianwiedmann
Copy link
Member

/test-1.24-5.4

@julianwiedmann
Copy link
Member

/test-1.25-4.19

@thorn3r
Copy link
Contributor

thorn3r commented May 16, 2023

k8s-1.25-kernel-4.19 hit a known flake: #25255

k8s-1.16-kernel-4.19 hit a known flake: #15455, now being tracked here #25448
k8s-1.26-kernel-net-next hit the same flake

@thorn3r
Copy link
Contributor

thorn3r commented May 16, 2023

/test-1.17-4.19

@thorn3r
Copy link
Contributor

thorn3r commented May 16, 2023

/ci-multicluster-1.13

@thorn3r
Copy link
Contributor

thorn3r commented May 16, 2023

/ci-aks-1.13

@aditighag
Copy link
Member

Thanks @thorn3r for linking the flake issues. Merging the PR based on your assessment.

@aditighag aditighag merged commit b79d6d3 into cilium:v1.13 May 16, 2023
59 of 62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants