Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.14 Backports 2023-10-06 #28442

Merged
merged 9 commits into from
Oct 9, 2023
Merged

v1.14 Backports 2023-10-06 #28442

merged 9 commits into from
Oct 9, 2023

Conversation

sayboras
Copy link
Member

@sayboras sayboras commented Oct 6, 2023

@sayboras sayboras added kind/backports This PR provides functionality previously merged into master. backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. labels Oct 6, 2023
@sayboras sayboras force-pushed the pr/v1.14-backport-2023-10-06 branch from ddfb238 to 1a73eff Compare October 6, 2023 11:12
@sayboras
Copy link
Member Author

sayboras commented Oct 6, 2023

/test-backport-1.14

Copy link
Member

@giorio94 giorio94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My commit looks good, thanks!

@sayboras
Copy link
Member Author

sayboras commented Oct 6, 2023

ah thanks a lot, I will remove backport label from your PR.

@sayboras sayboras force-pushed the pr/v1.14-backport-2023-10-06 branch from 1a73eff to 8e69f9c Compare October 6, 2023 12:55
gandro and others added 9 commits October 7, 2023 10:44
ENI mode creates routing tables with index `10 + eni-index`. This commit
documents that and mentions that those indices are not taken by the
system.

Suggested-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
[ upstream commit def1770 ]

TL;DR. This commit works around a Cilium iptables bug that can affect
IPsec. The root of the issue for IPsec is a conflict between the packet
mark we use and some iptables-nft rules. The workaround changes the
packet mark to avoid the conflict.

Cilium's IPsec implementation relies on the packet mark to match
packets against XFRM rules. In particular, the packet mark holds the
node ID (0xffff0000), the SPI (0xf000), and whether to encrypt or
decrypt (0xf00). On egress, the packet mark is written in bpf_lxc before
the packet is sent to the stack for encryption. The encryption layer
retains the packet mark, except for the encryption bit (0xf00) in some
cases. bpf_host then processes the encrypted packet and clears the mark.

In some corner cases (what causes this isn't clear yet), some OSes can
end up with kube-proxy rules in both iptables-nft and iptables-legacy.
Cilium is currently unable to handle that situation properly: it should
install iptables rules to skip some of kube-proxy's rules, but only does
that for iptables-legacy. We can therefore end up with the following
rules matching outgoing packets:

    [31061:1736670] -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
    [31059:1736390] -A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
    [2:280] -A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
    [2:280] -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully

We can see that kube-proxy install a KUBE-POSTROUTING chain in the
POSTROUTING netfilter chain. That chain skips all packets not marked
with 0x4XXX. Other packets see the mark removed and are then
masqueraded. In the above example, 2 packets had the 0x4XXX mark and
were masqueraded.

These rules can end up matching the mark of our encrypted packets, if
the SPI is 4. Encrypted packets are then incorrectly masqueraded.
Depending on the node and network configuration, they may end up
dropped.

The SPI is incremented on key rotation (from 1 to 15, then going back to
1 again). Users performing key rotations are therefore highly likely to
hit this bug if they have iptables-nft kube-proxy rules installed.

We can work around this bug by clearing all packet mark bits we don't
need anymore after the encryption. Once the packet is encrypted,
bpf_host only needs the 0xf00 part of the mark to determine if a packet
is encrypted or not. The node ID and SPI can be cleared from the mark.
This commit therefore clears those bits right after encryption, with
output-mark, on egress and ingress.

Note this only avoids the IPsec impact of this bug, but doesn't fix the
underlying issue with iptables-nft. More work is needed to have the
agent correctly handle this situation.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit b4da4e6 ]

This commit is to make sure that we are having map pressure metric for
auth map to improve observability, which can be useful for adjusting
map size if required.

Relates: #24617

Signed-off-by: Tam Mach <tam.mach@cilium.io>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 5741606 ]

These targets failed to select the nodes of a kind cluster. This change
selects the right variables that are used to check for the cluster
nodes.

Fixes: 5f88bbf ("docs: Add Makefile and documentation for "fast" development targets")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 6ea3ed5 ]

The kind-install-cilium-fast should install Cilium on all available
kind clusters.

Fixes: 5f88bbf ("docs: Add Makefile and documentation for "fast" development targets")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit e773f96 ]

Currently, we are installing a fixed version of the AWS VPC CNI plugin
version (i.e., v1.11) regardless of the Kubernetes version. Yet, this
means that in certain cases we upgrade it, while in others we downgrade,
it, introducing unnecessary churn.

Let's instead use the default one that gets installed with the given
k8s version. Specifically, for the k8s versions currently tested:

* k8s: v1.23 - AWS VPC CNI: v1.10.4-eksbuild.1
* k8s: v1.24 - AWS VPC CNI: v1.11.4-eksbuild.1
* k8s: v1.25 - AWS VPC CNI: v1.12.2-eksbuild.1
* k8s: v1.26 - AWS VPC CNI: v1.12.5-eksbuild.2
* k8s: v1.27 - AWS VPC CNI: v1.12.6-eksbuild.2

Retrieved through:

for MINOR in $(seq 23 27)
do
    echo -n "k8s: v1.$MINOR - AWS VPC CNI: "
    aws eks describe-addon-versions --addon-name vpc-cni \
      --kubernetes-version 1.$MINOR --output yaml | \
      yq '.addons[].addonVersions[] | select(.compatibilities[].defaultVersion == true) | .addonVersion';
done

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit b825f4e ]

Path prefixes could not be stripped which lead to double slashes.

Fixes: #28270
Signed-off-by: Dawid Danieluk <lolnoxy@gmail.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 0397bf4 ]

Signed-off-by: Raphaël Pinson <raphael@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 262d59e ]

Manual update for docs-builder tag, because the related PR's branch was
not opened in the Cilium repository (but from a fork).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
@sayboras sayboras force-pushed the pr/v1.14-backport-2023-10-06 branch from 8e69f9c to 1f6f570 Compare October 6, 2023 23:44
@sayboras
Copy link
Member Author

sayboras commented Oct 6, 2023

/test-backport-1.14

@sayboras sayboras marked this pull request as ready for review October 6, 2023 23:46
@sayboras sayboras requested review from a team as code owners October 6, 2023 23:46
Copy link
Member

@pchaigno pchaigno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My PR looks good. Thanks!

@sayboras
Copy link
Member Author

sayboras commented Oct 9, 2023

Most of the reviews are in, CI is green, let's merge it in.

@sayboras sayboras added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 9, 2023
@aanm aanm merged commit f124056 into v1.14 Oct 9, 2023
196 checks passed
@aanm aanm deleted the pr/v1.14-backport-2023-10-06 branch October 9, 2023 08:28
@maintainer-s-little-helper maintainer-s-little-helper bot removed the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Oct 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.14 This PR represents a backport for Cilium 1.14.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants