Kubevirt hot migrate failed with fixed ip address in ipip mode. #8663

Open
GaoChX opened this issue Mar 28, 2024 · 1 comment · May be fixed by #8671


GaoChX commented Mar 28, 2024

Expected Behavior

When a KubeVirt virtual machine is live-migrated, a new virt-launcher pod with the same IP address is created on another node and the migration begins. After the migration completes, the old pod changes to the 'Completed' status.

Current Behavior

Migration works in VXLAN mode but fails in IPIP mode: the new virt-launcher pod's sandbox cannot be created on the target node (see the error log under Context).

Possible Solution

The issue lies in this line of code: (https://github.com/GaoChX/calico/blob/71d6f8385a6272fc517e192fabc0898f7f565792/cni-plugin/pkg/dataplane/linux/dataplane_linux.go#L346)

In VXLAN mode, the remote route maintained on the other compute nodes is a subnet route, but in IPIP mode it is a /32 host route maintained by BIRD. Adding the host-side route for the new pod in this manner therefore conflicts with the existing /32 route, and an error is returned.

Steps to Reproduce (for bugs)

  1. Configure Calico to run in IPIP mode.
  2. Create a KubeVirt pod with a fixed IP address.
  3. Ensure that there are more than two nodes available.
  4. Try to migrate the virtual machine.

Context

Here is the error log:

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "206aa4ad8adc7966a48670507ec39cda27b4a13703905d5ca6499d7d33e8cc61": plugin type="multus" name="multus-cni-network" failed (add): [xcloud-default/virt-launcher-gcx-test-lhqmh/548833b8-b463-44ab-82c3-82000adc737b:gcx-test-nad-pod-network-eth1]: error adding container to network "gcx-test-nad-pod-network-eth1": error adding host side routes for interface: cali8ba68918cba, error: route (Ifindex: 57, Dst: 100.66.0.2/32, Scope: link) already exists for an interface other than 'cali8ba68918cba': route (Ifindex: 8, Dst: 100.66.0.2/32, Scope: universe, Iface: tunl0)

I tried replacing RouteAdd with RouteReplace, and with that change the migration worked.

Your Environment

  • Calico version v3.26.1
  • Orchestrator version: Kubernetes, Server Version v1.28.3+rke2r2
  • Operating System and version: Ubuntu 22.04.1 LTS
  • Link to your project (optional):
caseydavenport (Member) commented

> In VXLAN mode, the remote route maintained by the other compute node is a subnet, but in IPIP mode, it is a /32 host route be maintained by bird

The remote route maintained by BIRD should also be a subnet for the IPAM block that contains the route, not a /32.

The main cases where you should see /32 routes are when you have over-provisioned your IP pool (resulting in IP borrowing from other nodes) or when, e.g., the IP pool itself has been deleted.

I think the first step is figuring out why you're seeing a /32 route advertised in this case instead of using the aggregated route.
