Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpf: dsr: restore CB_SRC_LABEL across DSR-INGRESS tail-call #24794

Merged
merged 3 commits into from
Apr 12, 2023

Conversation

julianwiedmann
Copy link
Member

@julianwiedmann julianwiedmann commented Apr 10, 2023

#22978 introduced a new pair of DSR-specific tail-calls that get called from nodeport_lb*(). These tail-calls apply some DSR handling, then call ctx_skip_nodeport_set() and return to the "program start" by tail-calling to CILIUM_CALL_IPV*_FROM_NETDEV.

Such tail-call chains in the nodeport code are responsible for restoring CB_SRC_LABEL, so that it's available again when eg. bpf_host.c:tail_handle_ipv4() wants to read it. But we missed doing so for the new DSR-Ingress tail-calls - and the IPv6 variant currently doesn't offer any free CB slot to transfer the src_label over. So first stop using the CB for transferring DSR information, and then add code to restore the CB_SRC_LABEL.

@julianwiedmann julianwiedmann added kind/bug This is a bug in the Cilium logic. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. release-note/misc This PR makes changes that have no direct user impact. labels Apr 10, 2023
@julianwiedmann
Copy link
Member Author

/test

Shuffle the two DSR-INGRESS tail-calls into the ENABLE_DSR-guarded section,
so that we can add calls to DSR-only code to them in a subsequent patch.

Their callers are already guarded in the same fashion.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
We need to free up the CB_SRC_LABEL slot, but at least for IPv6 all skb->cb
slots are already in use. So stop using the CB for transporting the DSR
info, but just extract it a second time inside the tail-call.

There's opportunities to clean this up more, and also alternative
approaches (eg. use a per-CPU map entry). But let's keep things
intentationally simple for now.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
All nodeport code paths that tail-call back to IPV*_FROM_NETDEV need to
restore the CB_SRC_LABEL with `src_identity`.

We missed doing so for the DSR-Ingress tail-calls.

Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
@julianwiedmann julianwiedmann marked this pull request as ready for review April 10, 2023 13:27
@julianwiedmann julianwiedmann requested a review from a team as a code owner April 10, 2023 13:27
@julianwiedmann
Copy link
Member Author

julianwiedmann commented Apr 10, 2023

/test

Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:

Click to show.

Test Name

K8sDatapathConfig Host firewall With VXLAN and endpoint routes

Failure Output

FAIL: Failed to reach 10.0.0.15:80 from testclient-host-b9zm9

Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/1739/

If it is a flake and a GitHub issue doesn't already exist to track it, comment /mlh new-flake Cilium-PR-K8s-1.26-kernel-net-next so I can create one.

@julianwiedmann
Copy link
Member Author

(CI was green, just needed a trivial rebase)

Copy link
Contributor

@aspsk aspsk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, left a few [minor] questions

int ret, l4_off;
__be16 port;
__be16 port = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do you init port?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed to init the addr to appease the verifier, and figured why not go the whole way. But yeah, looks like it would pass either way.

ret = nodeport_extract_dsr_v6(ctx, ip6, &tuple, l4_off, &addr, &port, &dsr);
if (IS_ERR(ret))
goto drop_err;
if (!dsr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this happen at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, totally unexpected - same as for IPv4 (where I added a comment).

/* Be paranoid about leaking info back to the main path: */
ctx_store_meta(ctx, CB_PORT, 0);
ctx_store_meta(ctx, CB_ADDR_V4, 0);
ret = nodeport_extract_dsr_v4(ctx, ip4, &tuple, l4_off, &addr, &port, &dsr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How expensive is to call this? (The question is - is this ok to call this here in favor of mirroring v4/v6 code, or should we still use the CB to pass info in IPv4 case?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope not too expensive. I think we should (1) keep IPv4/IPv6 in sync, and (2) not use the CB at all for such tail-call chains (given that we're already not fully owning it).

Would rather try to follow-up with by optimizing the first call (in nodeport_lb4()), or passing the data via a per-CPU map.

@julianwiedmann
Copy link
Member Author

net-next flaked with #15455

@julianwiedmann julianwiedmann added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Apr 12, 2023
@dylandreimerink
Copy link
Member

The 1.26 on net-next test fails with flake #15455. Rerunning

@dylandreimerink
Copy link
Member

/test-1.26-net-next

@dylandreimerink dylandreimerink merged commit 2c05517 into cilium:master Apr 12, 2023
42 checks passed
@julianwiedmann julianwiedmann deleted the 1.14-dsr-sec-label branch April 12, 2023 11:43
@julianwiedmann julianwiedmann added the backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. label Apr 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants