Client: Beacon Sync Backfilling Failure #3289

Open
holgerd77 opened this issue Feb 20, 2024 · 1 comment
@holgerd77
Member

The latest attempts to sync our client on the Holesky network have shown that we still have various caveats and stability issues in the code, particularly around the robustness of the sync process: ideally it should not break in the first place, and if it does, it should be able to recover. A third point might be to give more explicit guidance in the log messages on what to do in certain cases, e.g. to suggest restarting the client in defined cases where a restart allows recovery.

The following is a log excerpt from an EthereumJS/Lighthouse sync:

[Screenshot of the log output]

As one can see, backfilling stops after such an error and the sync stalls.

I strongly assume this is not deterministically tied to specific slot/block numbers and just occurs at some point. We should nevertheless try to trace what is happening internally in such a case and see if there is a fix.

I ran the clients with the following commands:

EthereumJS (master)

npm run client:start -- --network=holesky --rpc --rpcEngine

Lighthouse (v4.6.0)

lighthouse bn --network=holesky --execution-endpoint=http://localhost:8551 --execution-jwt=[ PATH_TO_JWT_SECRET_FROM_ETHEREUMJS ] --checkpoint-sync-url=https://holesky.checkpoint.sigp.io --http

It might be possible to address this by digging into the internals behind this error message, adding some preparatory log output or console.log statements so the failure can be understood if it occurs again, and then just starting both clients and "waiting for it to fail" (while doing useful things as the main task). 😂
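A rough sketch of what such preparatory logging could look like, assuming the failing step can be wrapped at its call site. `describeFailure`, `withFailureContext` and the context keys are hypothetical names made up for illustration; they are not part of the EthereumJS client API.

```typescript
// Hypothetical helpers for illustration only, not EthereumJS client code.
type FailureContext = Record<string, string | number | undefined>

// Turn a thrown value plus some context into a single greppable log line.
function describeFailure(label: string, err: unknown, context: FailureContext): string {
  const message = err instanceof Error ? err.message : String(err)
  const details = Object.keys(context)
    .map((key) => `${key}=${String(context[key])}`)
    .join(' ')
  return `[${label}] failed: ${message} | ${details}`
}

// Run an async step; on failure, log the context and stack, then rethrow
// so the surrounding behavior stays unchanged.
async function withFailureContext<T>(
  label: string,
  getContext: () => FailureContext,
  step: () => Promise<T>,
): Promise<T> {
  try {
    return await step()
  } catch (err) {
    console.log(describeFailure(label, err, getContext()))
    if (err instanceof Error && err.stack !== undefined) console.log(err.stack)
    throw err
  }
}
```

The context callback is evaluated only when the step throws, so the state captured is the state at the time of failure, and the happy path pays no extra cost.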

I have not yet tested whether this recovers after a restart; I will do so after writing this down. If it does, this would be a case for at least adding a note to that effect to the error log output, since otherwise one would not even think of trying.
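A minimal sketch of such a note, assuming the error path knows whether a restart is expected to help (which is still untested here). `RESTART_HINT` and `withRestartHint` are hypothetical names, not existing client code.

```typescript
// Hypothetical sketch: none of these names exist in the EthereumJS client.
const RESTART_HINT = 'Restarting the client may allow the sync to recover.'

// Append the hint only for failures known to be recoverable by a restart,
// so other error messages are left untouched.
function withRestartHint(message: string, recoverableByRestart: boolean): string {
  return recoverableByRestart ? `${message} (${RESTART_HINT})` : message
}
```

Keeping the hint in one constant means the wording can be adjusted in a single place once it is clear in which cases a restart actually helps.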
