Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lagging follower commit after message drops #138

Open
3 tasks
pav-kv opened this issue Jan 25, 2024 · 2 comments · May be fixed by #140
Open
3 tasks

Lagging follower commit after message drops #138

pav-kv opened this issue Jan 25, 2024 · 2 comments · May be fixed by #140
Assignees
Labels
enhancement New feature or request

Comments

@pav-kv
Copy link
Contributor

pav-kv commented Jan 25, 2024

The test added in #137 demonstrates that a short network blip / message drop can lead to delaying the follower to learn that entries in its log are committed up to HeartbeatInterval + 3/2 * RTT. This is suboptimal, and can be reduced to HeartbeatInterval + 1/2 * RTT.

The reason why it takes an extra roundtrip is that the leader cuts the commit index sent to the follower at Progress.Match, i.e. it never sends a commit index exceeding the known match size of the follower's log. This is to ensure that the follower does not crash at receiving a commit index greater than its log size.

However, the follower can safely process out-of-bounds commit indices by simply (with a caveat: see below) cutting them at the log size on its end. In fact, the follower already does so for the commit indices it receives via MsgApp messages. When sending MsgApp, the leader does not cut the commit index, correspondingly.

Fix

Fixing this is a matter of:

  1. changing one line at the leader send side, and
  2. at the follower receive side, ensuring that the follower's log is a prefix of the leader's.

(2) is true after the follower received at least one successful MsgApp from this leader. From that point, the follower's log is guaranteed to be a prefix of the leader's log, and it is safe to assume that any index <= Commit is committed.

Issue

However, simply doing (1) is unsafe in mixed-version clusters (during a rolling upgrade). If there are followers running old code, they will crash upon seeing a high commit index. To fix the problem completely, there must be a safe migration. Only after all the nodes are running the new code, it is safe to do (1).

Action Items

Alternative Fix

The same delay reduction can be achieved if the leader sends an empty MsgApp after the HeartbeatInterval. We already have periodic empty MsgApp sends in a throttled state, although they are currently hooked in to MsgHeartbeatResp messages (to ensure MsgApp flow is conditional to some connectivity with the follower), which in this case will result in the same +1 RTT delay.

@pav-kv pav-kv added enhancement New feature or request good first issue Good for newcomers labels Jan 25, 2024
@pav-kv pav-kv removed the good first issue Good for newcomers label Jan 26, 2024
@pav-kv pav-kv linked a pull request Jan 26, 2024 that will close this issue
@pav-kv pav-kv self-assigned this Jan 26, 2024
@erikgrinaker
Copy link
Contributor

erikgrinaker commented Jan 30, 2024

Isn't this check in place to ensure that followers' logs don't regress? Do we have any other safeguards for that? We don't check that during MsgApp because the follower's log hasn't yet reached the commit index, but once a follower's log has reached the commit index (as recorded via Match) it should not regress because it may be a necessary part of the quorum establishing that commit index.

@pav-kv
Copy link
Contributor Author

pav-kv commented Jan 30, 2024

Isn't this check in place to ensure that followers' logs don't regress?

Yes. Currently, we rely on the fact that the follower's log is in sync with the leader's up to Match, so we can commit up to that. If we send a Commit index > Match, there is a chance that by the time the message arrives the follower either doesn't have this index, or still has some entries from previous terms that are inconsistent with the leader's log.

My proposal is: teach the follower to understand that it matches the leader's log. Then this safety check can be done at the follower end on receiving Commit index, rather than on the leader when sending it. The advantage of this: the follower knowns that it matches the leader 1/2 RTT earlier than the leader, so there is a potential to make follower Commit index progress faster by 1 RTT in some cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants