Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cilium: improve 16bit ifindex probe on old kernels #32466

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

msune
Copy link
Contributor

@msune msune commented May 10, 2024

Previously, HaveFibIfindex probe checked the presence of commit d1c362e1dd68 in the Linux kernel indirectly, by verifying the presence of the redirect helpers (which were merged alongside).

Since d1c362e1dd68 has been backported to older kernels - but not helpers -, this commit improves HaveFibIfindex detection by running a live probe.

Fixes: #27642

On kernels where `d1c362e1dd68` has been backported, don't store `ifindex` in the CT map

@msune msune requested a review from a team as a code owner May 10, 2024 13:21
@msune msune requested a review from lmb May 10, 2024 13:21
@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 10, 2024
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label May 10, 2024
@msune
Copy link
Contributor Author

msune commented May 10, 2024

RFC

I don't think the unit is enough. I would think adding a check in integration/e2e that checks the flag is correctly detected would be a better approach than a unit (which I am not even sure is actually correct for backports as is). Please advise

@msune msune force-pushed the pr/27642_have_fix_index_probe branch from f8ba39f to 4733a89 Compare May 10, 2024 13:24
@julianwiedmann julianwiedmann added the release-note/minor This PR changes functionality that users may find relevant to operating Cilium. label May 10, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label May 10, 2024
@julianwiedmann julianwiedmann added sig/loader Impacts the loading of BPF programs into the kernel. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels May 10, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot removed dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. labels May 10, 2024
@julianwiedmann
Copy link
Member

/test

pkg/datapath/linux/probes/probes.go Outdated Show resolved Hide resolved
pkg/datapath/linux/probes/probes.go Outdated Show resolved Hide resolved
pkg/datapath/linux/probes/probes.go Show resolved Hide resolved
pkg/datapath/linux/probes/probes.go Show resolved Hide resolved
pkg/datapath/linux/probes/probes.go Outdated Show resolved Hide resolved
pkg/datapath/linux/probes/probes.go Outdated Show resolved Hide resolved

func TestDetectHaveFibIfindex(t *testing.T) {
testutils.PrivilegedTest(t)
testutils.SkipOnOldKernel(t, "5.10", "bpf: Always return target ifindex in bpf_fib_lookup")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? On old kernels it should still work, but return an error saying that the feature is not supported, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Humm.. HaveFibIfindex call will return err = ebpf.ErrNotSupported, so no, the test - as written here - shouldn't succeed for Kernels that don't have the patch.

But as I say in the comment, I see little value on this unit, as you don't have much control on which Kernel the CI is actually running. I think a much better approach would be to run it as part of the e2e or integr, where tests are run against specific kernels, making sure detection (true/false) is correct based on the specific kernel version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I don't actually know how we test these. ci-runtime only runs on whatever CI has.

@msune msune force-pushed the pr/27642_have_fix_index_probe branch 3 times, most recently from a362f74 to 3c8f17e Compare May 14, 2024 07:31
Copy link
Contributor

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple of vet errors due to fmt.Errorf.

pkg/datapath/linux/probes/probes.go Outdated Show resolved Hide resolved
@msune msune force-pushed the pr/27642_have_fix_index_probe branch from 3c8f17e to 0d9575c Compare May 14, 2024 10:18
Copy link
Contributor

@lmb lmb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lmb
Copy link
Contributor

lmb commented May 14, 2024

You might need a

go get golang.org/x/sys/unix@latest
go mod vendor

@julianwiedmann
Copy link
Member

julianwiedmann commented May 14, 2024

I was discussing with Marc a bit on how this might affect downgrades.

Considering the scenario where a 1.16 Cilium would detect HAVE_FIB_IFINDEX on a fixed 5.4 kernel, create new CT entries (without filling the ct_entry->ifindex) ... and then we downgrade into 1.15, which doesn't detect HAVE_FIB_IFINDEX and relies on the non-existent ct_entry->ifindex value. We could solve this by having some compat code in 1.16 that continues to fill the ifindex (without using it in the datapath), to enable the downgrade. And then in 1.17 remove that compat code.

But Marc also made the fair point that we didn't consider this downgrade aspect when merging #27622. Which also produced such CT states, that were not safe for downgrade.

@lmb
Copy link
Contributor

lmb commented May 14, 2024

Based on my earlier work to bump the minimum kernel version we know that at least ubuntu 20.04 lts started on a 5.4 kernel. But the current version in repos is 5.10: https://packages.ubuntu.com/search?suite=focal&searchon=names&keywords=linux-image Probably this would only people that hand roll their kernels / distros and are on an old version?

Previously, `HaveFibIfindex` probe checked the presence of
commit d1c362e1dd68 in the Linux kernel indirectly, by verifying
the presence of the redirect helpers (which were merged alongside).

Since `d1c362e1dd68` has been backported to older kernels - but
not helpers -, this commit improves `HaveFibIfindex` detection
by running a live probe.

Fixes: cilium#27642

Signed-off-by: Marc Suñé <marc.sune@isovalent.com>
@msune msune force-pushed the pr/27642_have_fix_index_probe branch from 0d9575c to ab0a12a Compare May 15, 2024 08:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/community-contribution This was a contribution made by a community member. release-note/minor This PR changes functionality that users may find relevant to operating Cilium. sig/datapath Impacts bpf/ or low-level forwarding details, including map management and monitor messages. sig/loader Impacts the loading of BPF programs into the kernel.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve HAVE_FIB_IFINDEX probe
3 participants