raft: introduce persistent leader term in raft log #122446

Open
pav-kv opened this issue Apr 16, 2024 · 6 comments
Labels
A-kv-replication Relating to Raft, consensus, and coordination. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-kv-replication KV Replication Team

Comments

@pav-kv
Collaborator

pav-kv commented Apr 16, 2024

The raft log currently does not "remember" the term of the last leader that appended entries to it. The state of a raft instance currently contains the Term of its latest vote, and the candidate it voted for may or may not be the leader. As a result, the content of the log is not necessarily a prefix of the log of this Term's leader.

The impact of this manifests in multiple ways:

We should introduce a "leader term" field into the state (both the HardState and the in-memory state of the raft log), with the following invariant:

Log.Entry[last].Term <= LeaderTerm <= Term
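A minimal sketch of how this invariant could be checked in code. The `logState` type and `checkInvariant` function are hypothetical names used for illustration only; they are not part of the raft package.

```go
package main

import "fmt"

// logState is a hypothetical, simplified view of the state discussed above.
type logState struct {
	Term          uint64 // current term, as stored in HardState.Term
	LeaderTerm    uint64 // proposed field: term of the last leader whose append was accepted
	LastEntryTerm uint64 // Log.Entry[last].Term
}

// checkInvariant reports an error if
// Log.Entry[last].Term <= LeaderTerm <= Term does not hold.
func checkInvariant(s logState) error {
	if s.LastEntryTerm > s.LeaderTerm {
		return fmt.Errorf("last entry term %d > leader term %d", s.LastEntryTerm, s.LeaderTerm)
	}
	if s.LeaderTerm > s.Term {
		return fmt.Errorf("leader term %d > term %d", s.LeaderTerm, s.Term)
	}
	return nil
}
```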

The LeaderTerm should be updated every time the log accepts an append from a leader. The "leader term" can be used for safety checks on the follower, before advancing the commit index. It can also be used for a simpler async log protocol.
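The update rule and the follower-side safety check could look roughly like the sketch below. Everything here is an assumption made for illustration: the `follower` and `entry` types, the method names, and in particular the exact form of the commit check (the issue does not specify it). Log matching and truncation are omitted.

```go
package main

// entry is a hypothetical, simplified log entry.
type entry struct {
	Index, Term uint64
}

// follower is a hypothetical, simplified follower state.
type follower struct {
	Term       uint64  // current term
	LeaderTerm uint64  // term of the last leader whose append was accepted
	Commit     uint64  // commit index
	Log        []entry // entries, assumed contiguous
}

// lastIndex returns the index of the last log entry, or 0 for an empty log.
func (f *follower) lastIndex() uint64 {
	if len(f.Log) == 0 {
		return 0
	}
	return f.Log[len(f.Log)-1].Index
}

// acceptAppend handles an append from the leader of leaderTerm and records
// that term as LeaderTerm. Appends from stale leaders are rejected.
func (f *follower) acceptAppend(leaderTerm uint64, entries []entry) bool {
	if leaderTerm < f.Term {
		return false
	}
	f.Term = leaderTerm
	f.Log = append(f.Log, entries...)
	f.LeaderTerm = leaderTerm
	return true
}

// maybeCommit advances the commit index only if the log was last written by
// the leader of leaderTerm (assumed form of the safety check).
func (f *follower) maybeCommit(leaderTerm, commit uint64) bool {
	if leaderTerm != f.LeaderTerm {
		return false // log may not be a prefix of this leader's log
	}
	if last := f.lastIndex(); commit > last {
		commit = last
	}
	if commit <= f.Commit {
		return false
	}
	f.Commit = commit
	return true
}
```

In this sketch, a follower that has only voted in a new term (so `Term` moved ahead of `LeaderTerm`) refuses to advance its commit index until it accepts an append from that term's leader.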

Ultimately, the LeaderTerm is the missing piece of state that makes the Raft log equivalent to a Paxos acceptor (TODO: link to the doc). Specifically, the LeaderTerm of the log corresponds to the "max accepted proposal ID" in Paxos.

The introduction of LeaderTerm can be done as:

  1. A start-up migration that initializes LeaderTerm = Log.Entry[last].Term.
  2. Followed by running the code that maintains this field.
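The start-up migration in step 1 might be sketched as follows. The `hardState` type and `migrateLeaderTerm` function are hypothetical, and treating `LeaderTerm == 0` as "not yet initialized" is an assumption of this sketch, not something the issue specifies.

```go
package main

// hardState is a hypothetical, simplified persistent state.
type hardState struct {
	Term       uint64
	LeaderTerm uint64 // 0 is assumed to mean "not yet initialized"
}

// migrateLeaderTerm initializes LeaderTerm to the term of the last log entry
// if it has not been set yet. It is idempotent: an already initialized
// LeaderTerm is left untouched.
func migrateLeaderTerm(hs hardState, lastEntryTerm uint64) hardState {
	if hs.LeaderTerm == 0 {
		hs.LeaderTerm = lastEntryTerm
	}
	return hs
}
```

Since the last entry's term never exceeds Term, the invariant holds immediately after the migration, and the regular code path can maintain the field from then on.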

Jira issue: CRDB-37894

@pav-kv pav-kv added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-kv-replication Relating to Raft, consensus, and coordination. T-kv-replication KV Replication Team labels Apr 16, 2024

blathers-crl bot commented Apr 16, 2024

cc @cockroachdb/replication

@nvanbenschoten
Member

> We should introduce a "leader term" field into the state (both the HardState and the in-memory state of the raft log)

We are also planning to persist Lead into HardState in the near future.

When we do so, we should be sure to not create confusion by also introducing a field called LeaderTerm, which is neither guaranteed to be HardState.Lead's term nor HardState.Term's leader.

@pav-kv
Collaborator Author

pav-kv commented May 14, 2024

Naming-wise, I think we would be good with something like AccTerm / AcceptedTerm. This would also align with Paxos terminology. Or LogTerm, meaning that the state of the log is consistent with the leader at this term.

@pav-kv
Collaborator Author

pav-kv commented May 14, 2024

@nvanbenschoten The Lead field will be guaranteed to match the Term though, right? We'll either have {Term=t, Lead=0} meaning that we don't know the leader yet, or {Term=t, Lead=n} meaning that we've learned the current-term leader. In both cases, AccTerm <= Term, and only reflects the state of the log.

@lyang24
Collaborator

lyang24 commented May 20, 2024

Do we need the leader term in both the soft state and the hard state?

@nvanbenschoten
Member

> The Lead field will be guaranteed to match the Term though, right? We'll either have {Term=t, Lead=0} meaning that we don't know the leader yet, or {Term=t, Lead=n} meaning that we've learned the current-term leader.

Yes, that is correct.
