Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: make follower commit index advancement safe #122100

Open
pav-kv opened this issue Apr 10, 2024 · 4 comments · May be fixed by #124648
Open

raft: make follower commit index advancement safe #122100

pav-kv opened this issue Apr 10, 2024 · 4 comments · May be fixed by #124648
Assignees
Labels
A-kv-replication Relating to Raft, consensus, and coordination. A-leader-leases Related to the introduction of leader leases C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication KV Replication Team

Comments

@pav-kv
Copy link
Collaborator

pav-kv commented Apr 10, 2024

A follower panics if commit index advances above its local log last entry index.

For this reason, the leader caps the commit index in heartbeat messages at Match index of the follower log. This may lead to delays in commit index advancement on the follower: etcd-io/raft#138.

We should make the follower's commit index advancement more flexible, and safer, and allow the leader to attach the latest commit index without capping it. The follower would then cap it locally.

When a leader@term sends a commit index, it means all entries <= this index are committed. Generally, the follower's log shares some prefix with the leader's log, and can commit up to the index at which it "forks" from the leader's log. A simple way for the follower to check that its log is fully consistent with the leader's log is to ensure that it has already accepted at least one append from this leader. After that, the follower can safely advance the commit index to min(commit, lastIndex).

Benefits of doing this:

  • Faster commit index convergence.
  • Safer code.
  • Commit index propagation is decoupled from replication flow (Match index), and possibly from heartbeats.

Issue etcd-io/raft#138 describes steps to achieve this. It is probably necessary to hide this change behind a version gate, so that in mixed-version clusters followers running the old code don't get a high commit index and panic.

Epic: CRDB-37516

@pav-kv pav-kv added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-replication Relating to Raft, consensus, and coordination. T-kv-replication KV Replication Team labels Apr 10, 2024
Copy link

blathers-crl bot commented Apr 10, 2024

cc @cockroachdb/replication

lyang24 added a commit to lyang24/cockroach that referenced this issue Apr 10, 2024
We need the ability to safely ship backward incompatible changes in
the raft module. This commit plumbs version handle in to raft package
to enable version gate infrastructure in raft.

Touches: cockroachdb#122100

Release note: None
@sumeerbhola
Copy link
Collaborator

Will this potentially also help with raft regression panics such as:

F240416 22:22:39.291114 177248 go.etcd.io/raft/v3/log.go:322 ⋮ [T1,n391,s2703,r2145884/19:‹/Table/219/1/"5b{2ff6…-30aa…}›] 423 tocommit(413827) is out of range [lastIndex(413826)]. Was the raft log corrupted, truncated, or lost?

Related discussion in https://cockroachlabs.slack.com/archives/C06UNJ6PE3U/p1713827710088409?thread_ts=1713794326.949309&cid=C06UNJ6PE3U

@pav-kv
Copy link
Collaborator Author

pav-kv commented May 8, 2024

@sumeerbhola Yes, it will make the follower commit index update to a safe value, instead of panicking. The loss of durability could manifest in some other place though.

@pav-kv
Copy link
Collaborator Author

pav-kv commented May 16, 2024

@sumeerbhola After some discussion here and internally, there is a concern that, despite removing this panic, the underlying problem (loss of durability on a follower) jeopardizes safety and can manifest in another way.

Is the short term, it seems safer to still panic if the follower's log regresses. There is a way to do this without taking a dependency on the commit index: #124225.

In the long term, we should improve raft's resilience to durability regressions. One systematic way of doing this is described in #124278.

craig bot pushed a commit that referenced this issue May 23, 2024
122690: raft: advance commit index safely r=lyang24 a=lyang24

This change makes the commit index advancement in handleHeartbeat safe. Previously, a follower would attempt to update the commit index to whichever was sent in the MsgHeartbeat message. Out-of-bound indices would crash the node.

It is always safe to advance a commit index if the follower's log is "in sync" with the leader, i.e. when its log is guaranteed to be a prefix of the leader's log. This is always true if the term of last entry in the log matches the leader's term, otherwise this guarantee is established when the first MsgApp append message from the leader succeeds.

At the moment, the leader will never send a commit index that exceeds the follower's log size. However, this may change in future. This change is a defence-in-depth.

The newly added accTerm field will be used for other safety checks in the future, for example to establish that overriding a suffix of entries in raftLog is safe.

Part of #122100

Co-authored-by: lyang24 <lanqingy@usc.edu>
@pav-kv pav-kv linked a pull request Jun 5, 2024 that will close this issue
@nvanbenschoten nvanbenschoten added the A-leader-leases Related to the introduction of leader leases label Jun 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-replication Relating to Raft, consensus, and coordination. A-leader-leases Related to the introduction of leader leases C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication KV Replication Team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants