v1.14 Backports 2023-10-06 #28442
Conversation
Force-pushed from ddfb238 to 1a73eff.
/test-backport-1.14
My commit looks good, thanks!
Ah, thanks a lot, I will remove the backport label from your PR.
Force-pushed from 1a73eff to 8e69f9c.
ENI mode creates routing tables with index `10 + eni-index`. This commit documents that and mentions that those indices must not already be taken by the system.

Suggested-by: Chris Tarazi <chris@isovalent.com>
Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
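The `10 + eni-index` convention above can be sketched as follows. This is a minimal illustration only; the constant and helper names are hypothetical, not Cilium's actual code:

```go
package main

import "fmt"

// routeTableOffset mirrors the documented convention: ENI mode uses
// routing tables starting at index 10, one table per ENI.
const routeTableOffset = 10

// routeTableID returns the routing table index for a given ENI index.
func routeTableID(eniIndex int) int {
	return routeTableOffset + eniIndex
}

func main() {
	// The first ENI (index 0) uses table 10, the second table 11, etc.
	fmt.Println(routeTableID(0), routeTableID(1), routeTableID(2))
}
```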
[ upstream commit def1770 ]

TL;DR: this commit works around a Cilium iptables bug that can affect IPsec. The root of the issue for IPsec is a conflict between the packet mark we use and some iptables-nft rules. The workaround changes the packet mark to avoid the conflict.

Cilium's IPsec implementation relies on the packet mark to match packets against XFRM rules. In particular, the packet mark holds the node ID (0xffff0000), the SPI (0xf000), and whether to encrypt or decrypt (0xf00). On egress, the packet mark is written in bpf_lxc before the packet is sent to the stack for encryption. The encryption layer retains the packet mark, except for the encryption bit (0xf00) in some cases. bpf_host then processes the encrypted packet and clears the mark.

In some corner cases (what causes this isn't clear yet), some OSes can end up with kube-proxy rules in both iptables-nft and iptables-legacy. Cilium is currently unable to handle that situation properly: it should install iptables rules to skip some of kube-proxy's rules, but only does that for iptables-legacy. We can therefore end up with the following rules matching outgoing packets:

    [31061:1736670] -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING
    [31059:1736390] -A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN
    [2:280] -A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0
    [2:280] -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully

We can see that kube-proxy installs a KUBE-POSTROUTING chain in the POSTROUTING netfilter chain. That chain skips all packets not marked with 0x4XXX. Other packets see the mark removed and are then masqueraded. In the above example, 2 packets had the 0x4XXX mark and were masqueraded.

These rules can end up matching the mark of our encrypted packets if the SPI is 4. Encrypted packets are then incorrectly masqueraded. Depending on the node and network configuration, they may end up dropped. The SPI is incremented on key rotation (from 1 to 15, then going back to 1 again). Users performing key rotations are therefore highly likely to hit this bug if they have iptables-nft kube-proxy rules installed.

We can work around this bug by clearing all packet mark bits we don't need anymore after the encryption. Once the packet is encrypted, bpf_host only needs the 0xf00 part of the mark to determine whether a packet is encrypted. The node ID and SPI can be cleared from the mark. This commit therefore clears those bits right after encryption, with output-mark, on egress and ingress.

Note this only avoids the IPsec impact of this bug, but doesn't fix the underlying issue with iptables-nft. More work is needed to have the agent correctly handle this situation.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
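The mark-bit arithmetic in this commit message can be illustrated with a short sketch. The constant and helper names below are hypothetical (not Cilium's identifiers); the mask values are taken from the text above:

```go
package main

import "fmt"

// Mask values quoted from the commit message; names are illustrative.
const (
	markNodeID    = 0xffff0000 // node ID bits
	markSPI       = 0x0000f000 // IPsec SPI bits
	markEncDec    = 0x00000f00 // encrypt/decrypt indicator bits
	kubeProxyMark = 0x00004000 // matched by kube-proxy's "--mark 0x4000/0x4000"
)

// spiToMark places an SPI value (1..15) into the SPI field of the mark.
func spiToMark(spi uint32) uint32 {
	return (spi << 12) & markSPI
}

func main() {
	// SPI 4 lands exactly on kube-proxy's 0x4000 bit, so encrypted packets
	// carrying that SPI match the KUBE-POSTROUTING masquerade rules.
	fmt.Printf("SPI 4 mark: 0x%x, collides: %v\n",
		spiToMark(4), spiToMark(4)&kubeProxyMark != 0)

	// The workaround clears the node ID and SPI bits after encryption,
	// keeping only the encrypt/decrypt indicator, so no collision remains.
	full := uint32(0xdeadbeef)&markNodeID | spiToMark(4) | 0x0e00
	cleared := full & markEncDec
	fmt.Printf("cleared mark: 0x%x, collides: %v\n",
		cleared, cleared&kubeProxyMark != 0)
}
```

Note the arithmetic also shows that any SPI with the 0x4 bit set (4-7, 12-15) would overlap kube-proxy's mask, which is why clearing the SPI bits after encryption removes the conflict entirely.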
[ upstream commit 5741606 ]

These targets failed to select the nodes of a kind cluster. This change selects the right variables that are used to check for the cluster nodes.

Fixes: 5f88bbf ("docs: Add Makefile and documentation for "fast" development targets")
Signed-off-by: André Martins <andre@cilium.io>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit e773f96 ]

Currently, we install a fixed version of the AWS VPC CNI plugin (i.e., v1.11) regardless of the Kubernetes version. Yet this means that in certain cases we upgrade it, while in others we downgrade it, introducing unnecessary churn. Let's instead use the default one that gets installed with the given k8s version. Specifically, for the k8s versions currently tested:

* k8s: v1.23 - AWS VPC CNI: v1.10.4-eksbuild.1
* k8s: v1.24 - AWS VPC CNI: v1.11.4-eksbuild.1
* k8s: v1.25 - AWS VPC CNI: v1.12.2-eksbuild.1
* k8s: v1.26 - AWS VPC CNI: v1.12.5-eksbuild.2
* k8s: v1.27 - AWS VPC CNI: v1.12.6-eksbuild.2

Retrieved through:

    for MINOR in $(seq 23 27)
    do
      echo -n "k8s: v1.$MINOR - AWS VPC CNI: "
      aws eks describe-addon-versions --addon-name vpc-cni \
        --kubernetes-version 1.$MINOR --output yaml | \
        yq '.addons[].addonVersions[] | select(.compatibilities[].defaultVersion == true) | .addonVersion'
    done

Signed-off-by: Marco Iorio <marco.iorio@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 0397bf4 ]

Signed-off-by: Raphaël Pinson <raphael@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
[ upstream commit 262d59e ]

Manual update for the docs-builder tag, because the related PR's branch was not opened in the Cilium repository (but from a fork).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Tam Mach <tam.mach@cilium.io>
Force-pushed from 8e69f9c to 1f6f570.
/test-backport-1.14
My PR looks good. Thanks!
Most of the reviews are in, CI is green, let's merge it in.
* datapath: fix NodePort to remote hostns backend with tunnel config #27323 (@michaelasp)
* RouteTableInterfacesOffset in system requirements #28358 (@gandro)
* docs: Update Kubernetes Gateway-API version to v0.8.1 #28388 (@haiyuewa)

Once this PR is merged, you can update the PR labels via:
or with