-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpf: nodeport: minor DSR improvements #23326
bpf: nodeport: minor DSR improvements #23326
Conversation
/test |
cfb0ffc
to
ee0bda2
Compare
/test |
ee0bda2
to
7ad0b59
Compare
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked them this morning and they lgtm, nice work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One nit below but it's probably not worth repushing just for that.
Probably would be best to move new tests like the last commit into separate PRs in the future. We may want to backport changes here separately (e.g., don't backport refactoring but backport the test).
EDIT: Really nice cleanups and nice commit breakdown by the way ❤️
#define DSR_IPV4_OPT_32 0x9a080000 | ||
#define DSR_IPV4_OPT_MASK 0xffff0000 | ||
#define DSR_IPV4_DPORT_MASK 0x0000ffff | ||
#define DSR_IPV4_OPT_TYPE (IPOPT_COPY | 0x1a) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We probably need to remove the Len line in the comment above, so clarify this is now encoding only the option number as it should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, on my radar 👍 . Once the DSR code fully moves into nodeport.h
, this stuff goes with it (right next to the dsr_opt_v4 struct).
I'll need to push once more either way, to adjust for #23281. |
7ad0b59
to
35d2935
Compare
/test |
Currently we first extract the ports from the packet in their actual order, and then store them into the CT tuple in reverse order. Just follow the established pattern and immediately store into the CT tuple in reverse order. While at it also switch to the l4_load_ports() helper. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
The preceding revalidate_data() always locates the L3 header at `data + ETH_HLEN`. So there's no need for a more generic approach when determining the L4 offset a few lines later. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Let bpf_lxc provide a validated L3 header to the low-level DSR code. Also pass this on to snat_*_create_dsr(). Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Introduce a proper struct to get rid of all the complicated masking, and clearly spell out the .type and .len parts of the option. Use this struct to fill & extract the DSR info. The on-wire format stays the same, and this patch doesn't try to solve the endianness confusion for the .port and .addr fields. When fetching the IPv4 option words, use a single ctx_load_bytes(). We've previously checked ip4->ihl, so there's no risk that the packet might be too short. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
The pre-defined ports are in network byte-order, do the same for the IP addresses. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Add tests for forwarding to a remote backend with DSR, where the LB needs to insert the IPv4 Option. We're not testing local-backend here (it's identical to non-DSR), and there's no reply path trough the LB node. Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
35d2935
to
8323e33
Compare
/test |
Looks like #22978 needs some more cycles. So let's split off some non-controversial cleanups & tests to reduce the review later on. I'll tag this PR for backports once #22978 is ready and needs it.
Best reviewed patch-by-patch - #23262 has some more context on the IPv4 option endianness.