Failure in Pod connectivity when an additional IP address on the primary interface in the same subnet is added and removed #8739

Open
svallala opened this issue Apr 19, 2024 · 4 comments

svallala commented Apr 19, 2024

Felix is incorrectly removing the directly connected route when it detects that an IP address is deleted even if there are additional addresses in the same subnet on the interface.

This is causing critical failures in the field.

Expected Behavior

Pod connectivity should not break

I have a 3-node IPv6 Kubernetes cluster with a VIP managed via Keepalived. When things are stable, the routing table is intact: the routes to the other nodes' pod subnets have the next hop correctly set to the respective node's IP address.

VIP - fd74:ca9b:3a09:868c:10:9:121:136
Primary Node IP - fd74:ca9b:3a09:868c:10:9:61:181

[root@hypervvm-61-181 ~]# ip addr show dev br0
38: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 00:15:5d:14:24:2a brd ff:ff:ff:ff:ff:ff
    inet6 **fd74:ca9b:3a09:868c:10:9:121:136/64** scope global deprecated nodad
       valid_lft forever preferred_lft 0sec
    inet6 **fd74:ca9b:3a09:868c:10:9:61:181/64** scope global
       valid_lft forever preferred_lft forever

Routing Table on the Host
-------------------------

[root@hypervvm-61-181 ~]# ip -6 route | grep fd74:ca9b:3a09:868c:
fd74:ca9b:3a09:868c:10:9:124:4d00/122 via **fd74:ca9b:3a09:868c:10:9:61:182** dev br0 proto bird metric 1024 pref medium
fd74:ca9b:3a09:868c:10:9:124:4d40/122 via **fd74:ca9b:3a09:868c:10:9:61:183** dev br0 proto bird metric 1024 pref medium
..
fd74:ca9b:3a09:868c::/64 dev br0 proto kernel metric 256 pref medium
default via fd74:ca9b:3a09:868c::1 dev br0 metric 1 pref medium

Bird routing table in the Calico Pod
-----------------------------------

[root@hypervvm-61-181 /]# birdcl6
BIRD v0.3.3+birdv1.6.8 ready.
bird> show route
....
fd74:ca9b:3a09:868c:10:9:124:4d40/122 via **fd74:ca9b:3a09:868c:10:9:61:183** on br0 [Mesh_fd74_ca9b_3a09_868c_10_9_61_183 21:09:17] * (100/0) [i]
fd74:ca9b:3a09:868c:10:9:124:4d00/122 via **fd74:ca9b:3a09:868c:10:9:61:182** on br0 [Mesh_fd74_ca9b_3a09_868c_10_9_61_182 21:09:16] * (100/0) [i]
**fd74:ca9b:3a09:868c::/64 dev br0 [direct1 21:09:15] * (240)**

If the VIP now moves to a different node, the directly connected route disappears from the BIRD routing table in the Calico pod. Because of this, the pod subnet routes are configured with the default gateway as the next hop, which breaks pod connectivity.

VIP is now moved to a different node

[root@hypervvm-61-181 ~]# ip addr show dev br0
38: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 00:15:5d:14:24:2a brd ff:ff:ff:ff:ff:ff
    inet6 fd74:ca9b:3a09:868c:10:9:61:181/64 scope global
       valid_lft forever preferred_lft forever

Host Routing Table - Note that the next hop for the pod subnets is configured as the default gateway
-----------------------------------------------------------------------------------------------------

[root@hypervvm-61-181 ~]# ip -6 route | grep fd74:ca9b:3a09:868c:
fd74:ca9b:3a09:868c:10:9:124:4d00/122 via **fd74:ca9b:3a09:868c::1** dev br0 proto bird metric 1024 pref medium
fd74:ca9b:3a09:868c:10:9:124:4d40/122 via **fd74:ca9b:3a09:868c::1** dev br0 proto bird metric 1024 pref medium
....
fd74:ca9b:3a09:868c::/64 dev br0 proto kernel metric 256 pref medium
default via fd74:ca9b:3a09:868c::1 dev br0 metric 1 pref medium
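
The symptom in the host routing table above can be checked mechanically. The following is a minimal sketch, not part of the original report: `routes_via_default_gw` is a hypothetical helper name, and it assumes the single-line `ip -6 route` format shown in this issue. It reads that output on stdin and prints every `proto bird` prefix whose next hop equals the default gateway.

```shell
# Hypothetical helper (not from the report): reads `ip -6 route` output on
# stdin and prints each "proto bird" prefix whose next hop is the default gateway.
routes_via_default_gw() {
  awk '
    # Strip the ** emphasis markers used in the pasted output of this issue.
    { gsub(/\*\*/, "") }
    # Remember the next hop of the default route.
    $1 == "default" && $2 == "via" { gw = $3 }
    # Collect every route programmed by BIRD together with its next hop.
    /proto bird/ && $2 == "via" { prefix[++n] = $1; hop[n] = $3 }
    END {
      for (i = 1; i <= n; i++)
        if (gw != "" && hop[i] == gw) print prefix[i]
    }
  '
}
```

On an affected node, `ip -6 route | routes_via_default_gw` would be expected to list the pod subnets; on a healthy node it should print nothing.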

Bird Routing table in the Calico Pod - The directly connected route for subnet fd74:ca9b:3a09:868c::/64 is missing
-------------------------------------------------------------------------------------------------------------------

bird> show route
....
fd74:ca9b:3a09:868c:10:9:124:4d40/122 via **fd74:ca9b:3a09:868c::1** on br0 [Mesh_fd74_ca9b_3a09_868c_10_9_61_183 21:09:18 from fd74:ca9b:3a09:868c:10:9:61:183] * (100/?) [i]
fd74:ca9b:3a09:868c:10:9:124:4d00/122 via **fd74:ca9b:3a09:868c::1** on br0 [Mesh_fd74_ca9b_3a09_868c_10_9_61_182 21:09:17 from fd74:ca9b:3a09:868c:10:9:61:182] * (100/?) [i]

I am able to reproduce the issue even without VIP movement. I just have to add an additional IP address in the same subnet to the primary interface and then remove it. It looks like when Felix detects that an IP address has been removed, it incorrectly removes the directly connected route entry even if other IP addresses in the same subnet remain on the interface.
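
The reproduction described above can be sketched as follows. This is an illustrative sketch under assumptions: `run` and `repro_steps` are hypothetical helper names, and the interface and address arguments are placeholders to be replaced with values for the node under test. By default the commands are only printed; setting `RUN=1` executes them, which requires root.

```shell
# Hypothetical helpers; the device and address arguments are placeholders.
# Commands are only printed unless RUN=1 is set (running them needs root).
run() {
  if [ -n "${RUN:-}" ]; then "$@"; else echo "$*"; fi
}

repro_steps() {
  # $1 = interface, $2 = extra address/prefix in the same subnet as the primary address
  run ip -6 addr add "$2" dev "$1"
  run ip -6 addr del "$2" dev "$1"
  # In the report, the kernel's connected route is still present at this point.
  run ip -6 route show dev "$1"
}
```

After running `repro_steps br0 <extra-address>/64`, the BIRD table can be inspected with `show route` in `birdcl6` inside the Calico pod, as in the output above, to see whether the directly connected route is still there.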

Your Environment

  • Calico version: v3.24.3
  • Kubernetes Version: v1.29.0
  • Operating System and version: 3.10.0-1160.62.1.el7.x86_64
@caseydavenport
Member

This does seem strange, especially considering the interface still has an IP address within that subnet even after the VIP is removed. Seems likely to be related to BIRD's / BGP next hop calculation rather than Felix though.

Perhaps worth looking into whether or not the remote nodes have changed the next hop address on the advertised BGP routes as well, in case it's a peer issue rather than an issue with the local route resolution.

@svallala
Author

svallala commented May 2, 2024

> This does seem strange, especially considering the interface still has an IP address within that subnet even after the VIP is removed. Seems likely to be related to BIRD's / BGP next hop calculation rather than Felix though.
>
> Perhaps worth looking into whether or not the remote nodes have changed the next hop address on the advertised BGP routes as well, in case it's a peer issue rather than an issue with the local route resolution.

@caseydavenport yes, you are right, this seems to be an issue in BIRD. The peers are not impacted; it's only the local route resolution. For now, as a workaround, I am adding a static route for the subnet.

@nelljerram
Member

Our BIRD fork is based on an upstream BIRD version (v1.6.8) that is now a little old, and it's possible that this has been fixed in upstream BIRD since v1.6.8. If an interested party would like to investigate that and identify the relevant change (if there is one), we could certainly look at cherry-picking that to our fork.

@abasitt

abasitt commented May 17, 2024

@nelljerram thank you for pointing out the possible bug. It is indeed a bug in v1.6.8.
Below is the result from v1.6.8.

bash status.sh 

Initial BIRD Routes
────────────────────────────────────────
BIRD 1.6.8 ready.
::/0               via fd00:1::1 on eth0 [kernel1 08:58:31] * (10)
fd00:1::/64        dev eth0 [direct1 08:58:31] * (240)
fd00:10::/64       dev eth0 [static1 08:58:31] * (200)
fd00:11::/64       via fe80::42:c0ff:fea8:2002 on eth0 [bgp1 08:58:39 from fd00:1::3] * (100/0) [AS65002i]

Routes After IP Addition
────────────────────────────────────────
BIRD 1.6.8 ready.
::/0               via fd00:1::1 on eth0 [kernel1 08:58:31] * (10)
fd00:1::/64        dev eth0 [direct1 08:58:31] * (240)
fd00:10::/64       dev eth0 [static1 08:58:31] * (200)
fd00:11::/64       via fe80::42:c0ff:fea8:2002 on eth0 [bgp1 08:58:39 from fd00:1::3] * (100/0) [AS65002i]

Routes After IP Deletion
────────────────────────────────────────
BIRD 1.6.8 ready.
::/0               via fd00:1::1 on eth0 [kernel1 08:58:30] * (10)
fd00:10::/64       dev eth0 [static1 08:58:30] * (200)
fd00:11::/64       via fd00:1::1 on eth0 [bgp1 08:58:38 from fd00:1::3] * (100/?) [AS65002i]
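
The difference between the listings above is that the `direct1` route for the interface subnet is gone after the deletion. A minimal sketch of a check for that condition follows. `has_direct_route` is a hypothetical helper name, not part of `status.sh`, and it assumes the single-line BIRD 1.6 `show route` format shown above (it would not parse the multi-line BIRD 2 format).

```shell
# Hypothetical check (not part of status.sh): reads BIRD 1.6 `show route`
# output on stdin and succeeds if a directly connected route for prefix $1 exists.
has_direct_route() {
  awk -v want="$1" '
    # Strip the ** emphasis markers used in the pasted output of this issue.
    { gsub(/\*\*/, "") }
    # A direct route looks like: <prefix> dev <ifname> [direct1 <time>] * (240)
    $1 == want && $2 == "dev" && $4 ~ /^\[direct/ { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

For example, piping the saved `show route` output into `has_direct_route fd00:1::/64` returns success for the first two listings and failure for the one taken after the IP deletion.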

Below is the result from BIRD 2 (v2.14).

bash status.sh 

Initial BIRD Routes
────────────────────────────────────────
BIRD 2.14 ready.
Table master6:
::/0                 unicast [kernel1 09:02:48.030] * (10)
        via fd00:1::1 on eth0
fd00:11::/64         unicast [static1 09:02:48.028] * (200)
        dev eth0
fd00:12::/64         unicast [p1 09:02:49.736] * (100) [AS65002i]
        via fd00:1::3 on eth0

Routes After IP Addition
────────────────────────────────────────
BIRD 2.14 ready.
Table master6:
::/0                 unicast [kernel1 09:02:48.030] * (10)
        via fd00:1::1 on eth0
fd00:11::/64         unicast [static1 09:02:48.028] * (200)
        dev eth0
fd00:12::/64         unicast [p1 09:02:49.736] * (100) [AS65002i]
        via fd00:1::3 on eth0

Routes After IP Deletion
────────────────────────────────────────
BIRD 2.14 ready.
Table master6:
::/0                 unicast [kernel1 09:02:48.030] * (10)
        via fd00:1::1 on eth0
fd00:11::/64         unicast [static1 09:02:48.028] * (200)
        dev eth0
fd00:12::/64         unicast [p1 09:02:49.736] * (100) [AS65002i]
        via fd00:1::3 on eth0

The setup can be recreated here
