Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data structure discrepancy on deferred attestation #7805

Closed
hcnam opened this issue Dec 8, 2023 · 6 comments
Closed

Data structure discrepancy on deferred attestation #7805

hcnam opened this issue Dec 8, 2023 · 6 comments

Comments

@hcnam
Copy link

hcnam commented Dec 8, 2023

Description

We’ve checked several implementations, but they adopt different data structures for deferring future attestations: Lighthouse and Nimbus are using queue, and Prysm and Teku using map to save early received attestations (Prysm uses non-deterministic map, but Teku uses deterministic map structure).
Due to the above discrepancy, if a slashable pair of attestation (like double voting two different block that shares the same parent block) enter onAttestation, Prysm makes a non-deterministic fork choice result. See scenario below.

Steps to Reproduce

Scenario
Assumption: No attestation except Attestation X and Y, or B and C has same vote. attester_slashing was not received yet.
Steps:

 N      N+1     N+2  
[A]  –  [B]  –  [ ] 
     \  [ ]  –  [C] 
 
Slot N   : Block A
Slot N+1 : Block B (parent: A)
Slot N+2 : Block C (parent also A)
           Receive Attestation_0 from validator X of slot N+3 (attesting block B)
           Receive Attestation_1 from validator X of slot N+3 (attesting block C)
Slot N+3 : Check head

Expect: head is B due to other use queue for deferred attestation (later ignored)
Result: Teku deterministically makes the same decisions while Prysm nondeterministically makes different decisions (Note that the results from Teku and Prysm may be different from Lighthouse and Nimbus).

References:
Lighthouse: queued_attestations
Teku: deferredAttestations, futureAttestations
Prysm: processAttestations, AttCaches
Nimbus: queuedAttestations

@rolfyone
Copy link
Contributor

I'm not sure how to progress this one, Is the assertion here that we're making an incorrect decision? Any future / deferred attestations should be not processed and how they're queue'd shouldn't matter?
Deferred attestations for example are typically deferred because we don't have the block it's referring to iirc, and in that case they can't count towards fork choice...
I'm not sure if I'm missing something, so I've put my line of thinking here in the hope that maybe there's a clarification on what we want out of this issue before I categorise how to proceed...

@hcnam
Copy link
Author

hcnam commented Dec 11, 2023

Before starting, I'm not saying that Teku is wrong. Teku is doing well under the consensus specification.
It's just that the current specification doesn't specify how to defer fork choice when handling with future attestations that vote on a known block (attestation for unknown block will ignored on assertion), which seems to have led to discrepancy between different clients in implementing it.
Therefore, this question relates to a mechanism to defer attestations from the future that target already known blocks. As I remember, in consensus-spec, validate_on_attestation asks to delay consideration for future attestations until their slot is in the past, but not mentioning how.

For example, in the above scenario, validator X is making double-voting (yes, this is slashable) attestation and releasing it before the proper slot. Therefore, both double-voting attestations are considered attestations from the future.
Now, the implementations that are using queue (like Lighthouse or Nimbus) will store those future attestations on queue, and try to handle it when the proper slot has arrived with the arrival order. That is, if the arrival order of attestation is [att_0, att_1] then, the client will process att_0 properly and ignore att_1 because it is duplicated.

However, map -based implementations (like Teku or Prysm) will make different decisions (because, as you know, the map is based on key-value unlike the queue, which doesn't guarantee the order). Therefore, it is not 100% deterministic whether the processing order will be [att_0, att_1] or [att_1, att_0] unlike queue-based implementations.

This is a minor and non-urgent discrepancy. Because, as you know, in the real world this may not be a big problem due to there being more than 900K validators and a lot of slashers are watching it. Also, the order of arrival of the attestations will be unpredictable (except when packet orders are manipulated by a larger-scale network attacker).
But, I want to suggest that the possibility exists that such a minor discrepancy like this may lead to real problems in the future.

@michaelsproul
Copy link

michaelsproul commented Dec 12, 2023

@hcnam This is an interesting idea! I think your issue probably belongs on the consensus-specs, because as you said the queueing order isn't defined by the specification.

My intuition is that it's kind of OK not to specify the queueing order because attestation arrival is unpredictable anyway. The P2P gossip conditions don't mandate propagation of early attestations, so an attacker could quite easily queue their different attestations on different nodes even without the different queue structures (e.g. gossip just att1 to some nodes, and gossip just att2 to others). When the attestation slot(s) arrive and they are processed, some clients will have fork choice output based off att1 and others will have output based off att2. This is called "view splitting", and in our case should eventually be reconciled by a block that includes one of the attestations, causing the other side to detect the slashing (and broadcast it).

There are some proposals floating around for making fork choice more resilient to these type of split views, e.g. https://ethresear.ch/t/view-merge-as-a-replacement-for-proposer-boost/13739. You might like to get involved and contribute to that effort? IMO it would be great to prove some resilience properties about the combination of the P2P conditions & the fork choice spec.

@michaelsproul
Copy link

michaelsproul commented Dec 12, 2023

To add some nuance to the above, there are a few complicating factors to the gossip-based view split:

  1. Re-queueing future attestations is optional, and some implementations (including Lighthouse) choose to just drop them: https://github.com/sigp/lighthouse/blob/441fc1691b69f9edc4bbdc6665f3efab16265c9b/beacon_node/network/src/network_beacon_processor/gossip_methods.rs#L1603-L1628.
  2. If we (Lighthouse) were to requeue future attestations, as we do for attestations to unknown blocks, then we would likely delay marking them as IGNOREd on gossipsub until they were re-processed. If they were processed successfully we would ACCEPT them, which would result in them being gossiped to our peers. i.e. the network wouldn't need to wait for the attestation to be included in a block before reconciling the split view.

@rolfyone
Copy link
Contributor

If we're moving this to consensus-specs shall I close this issue here? Not much we can really do to progress this directly...

@hcnam
Copy link
Author

hcnam commented Dec 19, 2023

I've moved this issue to consensus-specs/issues/3498

@hcnam hcnam closed this as completed Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants