-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: nodeport: add RevDNAT-based FIB lookup for reply traffic #26638
bpf: nodeport: add RevDNAT-based FIB lookup for reply traffic #26638
Conversation
/test |
8e4d6b6
to
67f1862
Compare
/test |
67f1862
to
67659fc
Compare
/test |
67659fc
to
5a74b80
Compare
/test |
5a74b80
to
44d365e
Compare
/test |
44d365e
to
383e5ab
Compare
/test |
383e5ab
to
7c21344
Compare
/test |
7c21344
to
89f6c71
Compare
/test |
89f6c71
to
2996fb3
Compare
/test |
When replies by local / DSR backends enter to-netdev, they have been routed to the interface using the backend IP as source IP. But depending on what routing rules are installed on the system, the routing for the RevDNATed packet is different. Improve the RevDNAT code in to-netdev to first perform a FIB lookup (based on the src IP from the RevDNAT info), and if needed redirect the packet to the correct egress interface. This requires that bpf_host is also attached to the chosen egress iface, so that the actual RevDNAT rewrite gets performed there. Also update some of the tests to cover this scenario. [Note: ideally we would have this check much earlier in to-netdev, before even going through the HostFW code. But that requires shrinking the program size first, so postponing that to a follow-on PR] Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
2996fb3
to
a708962
Compare
/test |
Before merging this PR, it would be good to get Anyway, the suggested testing plan for
|
Rebased to resolve trivial |
send_trace_notify(ctx, obs_point, 0, 0, 0, 0, trace.reason, | ||
trace.monitor); | ||
if (ret == CTX_ACT_OK) | ||
send_trace_notify(ctx, obs_point, 0, 0, 0, 0, trace.reason, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For my education... do we only trace the forwarded packets at the destination interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we have a strict precedent for it at the moment.
I could be convinced that throwing notifications for the intermediate interfaces has value. But then we should also apply proper CT-driven aggregration, and include the corresponding ifindex?
When moving the RevDNAT for backend replies from
bpf_lxc
intobpf_host
, we lost the routing based on the post-RevDNAT layout of the packet. Depending on the installed routing rules, we then end up sending the (RevDNATed) packet from the wrong egress interface.So as part of the RevDNAT processing in
to-netdev
, do a FIB lookup that considers the desired source IP. And redirect the packet if it's currently on the wrong interface. We intentionally delay the actual RevDNAT rewrite until the packet has reached the correct interface, so that the packet looks as expected to components into-netdev
(in particular HostFW).Long-term we would want
FIB_DONE
mark on the packet (similar toSNAT_DONE
), to skip additional lookups.from-container
(for BPF Host Routing) to already consider service RevDNAT.The initial idea was to have this check at the very top of the to-netdev program (see #26635). But we're hitting program size limits in 4.19 there, and improving that is too much of a refactor for a fix PR.
Fixes: #26166