[specs][doc] FizzBee spec for Venice's LeaderFollower protocol #958

Open · wants to merge 3 commits into base: main

Conversation

@jayaprabhakar commented Apr 23, 2024

Summary

FizzBee spec for Venice's LeaderFollower protocol, a one-to-one translation from the TLA+ syntax to the FizzBee syntax.

How was this PR tested?

Manually ran it with the fizz model checker:

fizz specs/fizzbee/LeaderFollower/VeniceLeaderFollower.fizz 

Does this PR introduce any user-facing changes?

  • No.

This is a literal translation of TLA+ syntax to FizzBee syntax.
@ZacAttack (Contributor)

Looks great! Could I also trouble you to include a readme file with the PR for how to run the spec/install fizz?

@jayaprabhakar (Author)

Sure. Added the README file pointing to the instructions at https://github.com/fizzbee-io/fizzbee.

At present there is no precompiled binary; I have included the full instructions to compile from source, which I have tested on a MacBook and on Ubuntu on EC2.

@FelixGV (Contributor) left a comment

Thanks for sharing this! It is my first time reading FizzBee (and I am not that familiar with TLA+ either, so please bear with me...). I left a few comments, some of which may be for Zac, others which may be about how to model a given situation in FizzBee itself... I'd be interested to hear your thoughts on these. Thanks again!

Comment on lines 82 to 86
if nodes[id]['state'] == 'LEADER':
    if nodes[id]['vtOffset'] >= len(version_topic):
        real_time_consume(id)
    else:
        version_topic_consume(id)
@FelixGV (Contributor)

This seems to mean: "if the leader's VT offset is below the end offset of the VT, then consume the VT, else consume the RT"

This makes sense, I suppose, though in practice (within the Venice code) this particular decision only happens as part of the transition from follower to leader, and once it is determined ("somehow") that the leader can begin consuming from the RT, it will not go back on this decision (unless it is demoted to follower). In this spec, however, I don't see this being a "sticky" decision; it seems to be evaluated every time.

I guess the question is not necessarily for @jayaprabhakar but perhaps for @ZacAttack, since Jay may be just translating what is spelled out in the TLA+ spec. Zac, does the above make sense? Is the spec intentionally modeling a simplification of the real logic, or does this somehow map precisely to the system's behavior, and I'm just missing some detail?

@ZacAttack (Contributor)

The above makes sense. The TLA+ spec as written is currently evaluating whether or not it's caught up based on this, and it was written that way at the time as a simplification of the Venice code.

It might be more interesting to instead model it as a state. Like, the state of a node is either caught up or not caught up, and that state flips only after it's read through the VT to topic switch.
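
A minimal plain-Python sketch of that suggestion (not exact FizzBee syntax; the caught_up field and the leader_step / version_topic_consume helpers are hypothetical names layered on the spec's dict layout):

def real_time_consume(node):
    pass  # stand-in for the spec's real-time consumption action

def version_topic_consume(node, version_topic):
    key, value, _offset = version_topic[node['vtOffset']]
    node['persistedRecords'][key] = value
    node['vtOffset'] += 1
    # Flip the flag once, when the VT is fully drained; only a
    # demotion back to follower would reset it.
    if node['vtOffset'] >= len(version_topic):
        node['caught_up'] = True

def leader_step(node, version_topic):
    if node['caught_up']:  # sticky: not re-derived from offsets each time
        real_time_consume(node)
    else:
        version_topic_consume(node, version_topic)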

@jayaprabhakar (Author)

Thanks. I noticed this. In the current spec, the leader's rtOffset (nodes[nodeId]['rtOffset']) is updated but not its vtOffset (nodes[nodeId]['vtOffset']). That would imply that when a leader is demoted, the newly demoted node would take a long time to catch up. So the two ways to avoid a slow and unnecessary catch-up were either to evaluate it continuously, like the current spec does, or to use a local state variable. I wasn't sure what the design decision was.

So I decided to translate the TLA+ spec as it is.

From the discussion, I assume we just need a state variable to mark the follower as caught up.
(FYI, changing it by updating vtOffset + local state reduces the possible states and the model checking time significantly.)

@jayaprabhakar (Author)

On the other side, when the newly promoted leader catches up with the version topic, it still starts consuming the real-time topic from where that node left off when it was last demoted, instead of picking up the offset from the latest version topic message.

That is clearly impractical. Can I also change that to use the offset from the latest version topic?

@ZacAttack (Contributor)

Yeah, a state variable is good enough. The way it works in production is that the follower just keeps a high watermark based on what it's consumed from the VT. So the offset from the latest version topic is, I think, good.
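
As a rough illustration of that (plain Python, not FizzBee; rt_resume_offset is a hypothetical helper), the promotion path could derive the RT resume point from the RT offset embedded in the last VT record rather than from the node's own stale rtOffset:

def rt_resume_offset(node, version_topic):
    if version_topic:
        _key, _value, last_rt_offset = version_topic[-1]
        return last_rt_offset + 1  # first RT record not yet reflected in the VT
    return node['rtOffset']        # empty VT: fall back to the local offset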

Comment on lines 108 to 112
nodes[nodeId]['rtOffset'] += 1
nodes[nodeId]['persistedRecords'][key] = value

# Append the consumed event to the version topic
version_topic.append((key, value, offset))
@FelixGV (Contributor)

This may be a minor detail, but in today's implementation the order of operations is: VT append, local persistence, update RT offset. The reason we do it this way is so that we have idempotency in case we crash at any point in between the operations (and so, perhaps this function should not have the atomic keyword?).

We are actually considering changing the ordering of these operations (and it might be interesting to model it both ways), but that change (#910) is not integrated (or even necessarily fully agreed upon...) yet.
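
For reference, a plain-Python sketch of today's ordering as described above (illustrative only; the inline comments mark where a crash could interleave once the atomic keyword is dropped):

def real_time_consume(node, real_time_topic, version_topic):
    key, value = real_time_topic[node['rtOffset']]
    version_topic.append((key, value, node['rtOffset']))  # 1. VT append
    # <- a crash here leaves the VT ahead; replay must be idempotent
    node['persistedRecords'][key] = value                 # 2. local persistence
    # <- a crash here leaves persistence ahead of the checkpoint
    node['rtOffset'] += 1                                 # 3. update RT offset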

@ZacAttack (Contributor) commented May 18, 2024

So actually, you touch on a really interesting facet that probably tells us we should consider either a refinement of the leader-follower spec or tweaking it a bit. The way TLA+ works is that you define states. Currently, the model is very simplistic. It defines the leader actions of consuming from RT, updating local state, and producing to VT as a single state. If we wanted to vet the ordering of these actions, and make sure the order is safe in the face of calamity, we should probably model all of them as separate states in the spec.

@jayaprabhakar (Author)

I can change this, but to model it correctly I need one additional piece of information: how do you handle duplicate message publication to the topic?
Kafka doesn't have built-in deduplication like that of SQS or other queueing services, but I know Kafka Streams and client-side libraries handle it in a way that this never becomes an issue.
Let me know how I should model that. Can I assume the deduplication is somehow handled, without actually implementing it?

The scenario:

  1. RealTimeConsume
     i. VT append (k1, v1, 0)
     ii. Crash
  2. RealTimeConsume
     i. VT append (k1, v1, 0)
     ii. Crash

This can go on infinitely, leading to an infinite state space. So I could quickly add a way to dedupe: if an exact duplicate message is sent, the append would ignore it and we would proceed with the remaining steps.

Correct me if I missed something.

@ZacAttack (Contributor)

So in Venice, the way we essentially deal with this is write collision resolution (last writer wins). So I think your suggestion of having the append ignore exact duplicates is the right way to go.
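
A sketch of that append-ignore behavior (plain Python; vt_append_dedup is a hypothetical name). Dropping an exact duplicate of the tail record is enough to break the crash-replay loop in the scenario above, since the replay re-sends the same record immediately:

def vt_append_dedup(version_topic, record):
    if version_topic and version_topic[-1] == record:
        return  # exact duplicate of the last append: ignore it
    version_topic.append(record)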

Comment on lines 121 to 122
nodes[nodeId]['vtOffset'] += 1
nodes[nodeId]['persistedRecords'][key] = value
@FelixGV (Contributor)

In a similar vein to the above, we do the persistence first and the VT offset increment (i.e. checkpointing) after, so that we tolerate crashing in between the two operations (and we may want to remove atomic...).

@ZacAttack (Contributor)

Same as the above response: because the model here is very simplistic (high level?), the line ordering doesn't matter too much. It'd be better to split each of these into their own states.
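
A plain-Python sketch of the follower ordering described above, with the two writes as separate steps (illustrative; follower_consume is a hypothetical name):

def follower_consume(node, version_topic):
    key, value, _offset = version_topic[node['vtOffset']]
    node['persistedRecords'][key] = value  # 1. persist locally first
    # <- a crash here re-reads and re-persists the same record on restart,
    #    which is safe under last-writer-wins upserts
    node['vtOffset'] += 1                  # 2. checkpoint afterwards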

1. Maintain a bool variable newly_promoted to indicate a new leader
   that has not caught up on previous version topics.
2. Continue to update vtOffset and rtOffset for each node, irrespective
   of whether the node is a leader or follower.
3. Make the leader and follower consumption serial instead of atomic
   (see the sketch after this list):
   - The leader will first post to version_topic, then update the local datastore, then update the offsets.
   - Similarly, the follower will first update its local datastore and then update the offsets.
   The system can crash at any of these points, and the design is verified to work even if the system crashes in between steps.
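
A rough plain-Python rendering of the revised leader action described in the list above (illustrative, not the actual FizzBee action; leader_consume is a hypothetical name):

def leader_consume(node, real_time_topic, version_topic):
    if node['newly_promoted']:
        if node['vtOffset'] < len(version_topic):
            key, value, _off = version_topic[node['vtOffset']]
            node['persistedRecords'][key] = value  # drain the VT first
            node['vtOffset'] += 1
            return
        node['newly_promoted'] = False             # caught up; flips only once
    key, value = real_time_topic[node['rtOffset']]
    version_topic.append((key, value, node['rtOffset']))  # 1. post to version_topic
    node['persistedRecords'][key] = value                 # 2. update local datastore
    node['rtOffset'] += 1                                 # 3. update the offsets
    node['vtOffset'] = len(version_topic)                 #    (own append is consumed)

In the serial (non-atomic) form, each of the numbered steps is a separate yield point, so the model checker can crash the node between any two of them.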
@jayaprabhakar (Author)

@FelixGV @ZacAttack I have made the changes you asked for in this commit: c721c5e


Please take a look and let me know.
