Skip to content

ipsec, auth: Avoid out-of-order events that lead to inconsistent node IDs #27029

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jul 27, 2023

Conversation

pchaigno
Copy link
Member

@pchaigno pchaigno commented Jul 24, 2023

We found out that the logic to allocate and deallocate node IDs had multiple bugs because it wasn't handling out-of-order events properly. In this context, out-of-order events are e.g. when pod objects are received before their corresponding node object. We discussed those issues at length in #26725 (see review) and concluded we should try to avoid having to deal with such out-of-order events altogether.

This pull request therefore removes all handling of node IDs on pod events. Node IDs are now entirely managed from node events. The node IDs are removed from the ipcache (ep->node_id). That simplifies the agent code a lot (all ipcache is uncoupled from node ID logic) and only slightly complicated the datapath logic (one more map lookup).

Summary of commits:

  • Commits 1-4 are preparatory work in the datapath and the agent, without functional changes.
  • Commit 5 moves the node ID allocation to be consistent with the deallocation.
  • Commit 6 introduces a new DROP reason.
  • Commits 7-8 remove the two consumers of ep->node_id in the datapath and replace them with an additional map lookup.
  • Commits 9-10 covers the new logic of the previous commit in BPF unit tests.
  • Commits 11-12 remove the node IDs from the ipcache and clean up related agent logic.
  • Commit 13 fixes a leak of node IDs when the IP addresses of a node are updated.
  • Commit 14 adds a runtime check for the bogus state fixed in the previous commit.
  • Commits 15-16 cover the node ID lifecycle in Golang integration tests.

See commits for details.

This PR is an alternative to the solution tried at #26725.

Fix a bug that could cause packet drops of type XfrmOutPolBlock when IPsec is enabled and node are recycled.
Fix a bug that could cause IPsec-encrypted packets to be sent to the wrong destination node when node churn is high.

@pchaigno pchaigno added release-note/bug This PR fixes an issue in a previous release of Cilium. area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. needs-backport/1.11 feature/authentication labels Jul 24, 2023
@pchaigno pchaigno requested a review from jrfastab July 24, 2023 15:10
@pchaigno pchaigno force-pushed the avoid-ooo-events-for-node-ids branch 3 times, most recently from f1e103c to faeef2b Compare July 24, 2023 16:38
@jrfastab

This comment was marked as resolved.

@jrfastab
Copy link
Contributor

actually now that it is internal to datapath/node can we get rid of all the exported GetNodeIP, GetNodeID. But DumpNodeIDs is used from cmd so might still need that.

And one small question. On restart can we miss a NodeDelete? If that happens do we need some logic to correct or should the normal nodeDelete logic eventually get called and nodeID map will also be fixed up (I think this is true?).

Copy link
Contributor

@jrfastab jrfastab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing!

@jrfastab
Copy link
Contributor

I agree with the overall assessment trying to save a map lookup in the datapath introduces too much complexity in user space to warrant the effort/complexity/risk and anyways this is going to an ipsec stack that is going to encrypt the entire thing anyways so a single map lookup is optimizing something with no/little gain.

@pchaigno pchaigno force-pushed the avoid-ooo-events-for-node-ids branch from faeef2b to a388830 Compare July 24, 2023 21:51
@pchaigno pchaigno force-pushed the avoid-ooo-events-for-node-ids branch 8 times, most recently from 1516e78 to 654ad8c Compare July 26, 2023 09:00
@pchaigno
Copy link
Member Author

actually now that it is internal to datapath/node can we get rid of all the exported GetNodeIP, GetNodeID. But DumpNodeIDs is used from cmd so might still need that.

The Mutual Auth feature is also still using GetNodeID & co. It's probably doable to replace that, but I'd rather avoid touching Mutual Auth too much in this PR. It's already quite big in terms of commits.

And one small question. On restart can we miss a NodeDelete? If that happens do we need some logic to correct or should the normal nodeDelete logic eventually get called and nodeID map will also be fixed up (I think this is true?).

AFAIK, yes, the NodeDelete logic gets called on restart in that case. We should manually test it though.

@pchaigno pchaigno marked this pull request as ready for review July 26, 2023 10:02
@pchaigno pchaigno requested a review from a team as a code owner July 26, 2023 10:02
mhofstetter added a commit to mhofstetter/cilium that referenced this pull request Jul 27, 2023
After removing the node IDS from the IPCache with cilium#27029, the auth
module should directly depend on the NodeIDHandler instead of the
IPCache.

This commit refactors the dependency from the IPCache to the
NodeIDHandler.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
dylandreimerink pushed a commit that referenced this pull request Jul 28, 2023
After removing the node IDS from the IPCache with #27029, the auth
module should directly depend on the NodeIDHandler instead of the
IPCache.

This commit refactors the dependency from the IPCache to the
NodeIDHandler.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
@pchaigno pchaigno added backport-pending/1.11 and removed needs-backport/1.11 backport/author The backport will be carried out by the author of the PR. labels Jul 31, 2023
@margamanterola margamanterola added backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. and removed backport-pending/1.13 labels Aug 1, 2023
@pchaigno pchaigno linked an issue Aug 2, 2023 that may be closed by this pull request
2 tasks
@margamanterola margamanterola added backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. and removed backport-pending/1.11 labels Aug 2, 2023
@sayboras sayboras mentioned this pull request Aug 3, 2023
11 tasks
@sayboras sayboras added backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. and removed needs-backport/1.14 backport-pending/1.14 The backport for Cilium 1.14.x for this PR is in progress. labels Aug 3, 2023
ldelossa pushed a commit to ldelossa/cilium that referenced this pull request Sep 27, 2023
[ upstream commit eb3c1d8 ]

After removing the node IDS from the IPCache with cilium#27029, the auth
module should directly depend on the NodeIDHandler instead of the
IPCache.

This commit refactors the dependency from the IPCache to the
NodeIDHandler.

Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com>
Signed-off-by: Julian Wiedmann <jwi@isovalent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/encryption Impacts encryption support such as IPSec, WireGuard, or kTLS. backport-done/1.11 The backport for Cilium 1.11.x for this PR is done. backport-done/1.12 The backport for Cilium 1.12.x for this PR is done. backport-done/1.13 The backport for Cilium 1.13.x for this PR is done. backport-done/1.14 The backport for Cilium 1.14.x for this PR is done. feature/authentication release-note/bug This PR fixes an issue in a previous release of Cilium.
Projects
No open projects
Status: Released
Status: Released
Development

Successfully merging this pull request may close these issues.

Ever increasing Xfrm error count when ipsec is enabled
7 participants