What CL behaviour would trigger beacon syncer reorging? #27962

Closed
michaelsproul opened this issue Aug 22, 2023 · 3 comments

Comments

@michaelsproul

michaelsproul commented Aug 22, 2023

We have a report from a Lighthouse user that their Geth node is failing to sync with "beacon syncer reorging" errors, despite no apparent re-orgs on the CL side.

I only have a short excerpt of logs so far:

Aug 21 19:21:29 NUC-1 lighthouse[37902]: Aug 22 02:21:29.001 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.006 INFO Syncing                                 est_time: 10 mins, speed: 0.12 slots/sec, distance: 76 slots (15 mins), peers: 108, service: slot_notifier
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.007 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.441 WARN Execution engine call failed            error: ServerMessage { code: -32000, message: "beacon syncer reorging" }, service: exec
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.441 WARN Error whilst processing payload status  error: Api { error: ServerMessage { code: -32000, message: "beacon syncer reorging" } }, service: exec
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.442 CRIT Failed to update execution head         error: ExecutionForkChoiceUpdateFailed(EngineError(Api { error: ServerMessage { code: -32000, message: "beacon syncer reorging" } })), service: beacon
Aug 21 19:21:53 NUC-1 lighthouse[37902]: Aug 22 02:21:53.002 INFO Syncing                                 est_time: 10 mins, speed: 0.12 slots/sec, distance: 77 slots (15 mins), peers: 102, service: slot_notifier
Aug 21 19:21:53 NUC-1 lighthouse[37902]: Aug 22 02:21:53.003 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:22:05 NUC-1 lighthouse[37902]: Aug 22 02:22:05.006 INFO Syncing                                 est_time: 5 mins, speed: 0.23 slots/sec, distance: 73 slots (14 mins), peers: 107, service: slot_notifier
Aug 21 19:22:05 NUC-1 lighthouse[37902]: Aug 22 02:22:05.008 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.001 INFO Syncing                                 est_time: 4 mins, speed: 0.25 slots/sec, distance: 69 slots (13 mins), peers: 110, service: slot_notifier
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.011 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.550 WARN Execution endpoint is not synced        last_seen_block_unix_timestamp: 0, endpoint: http://127.0.0.1:8551/, auth=true, service: deposit_contract_rpc
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.551 ERRO Error updating deposit contract cache   error: Failed to get remote head and new block ranges: EndpointError(FarBehind), retry_millis: 60000, service: deposit_contract_rpc

The user seems to be running the latest versions of Lighthouse (v4.3.0) and Geth (v1.12.2), and hasn't missed a hard-fork as far as I know. Their Lighthouse node appears to be only a short distance from the head.

Is the error from Geth indicating that the CL is providing blocks from different chains that conflict with Geth's view of finalization? Something else?

@michaelsproul
Author

cc @holiman 🙏 (sorry to bug you, but I've enjoyed our previous discussions of sync intricacies 😁 )

@holiman
Contributor

holiman commented Aug 22, 2023

The "beacon syncer reorging" error was introduced fairly recently, in #27397.

When a reorg happens, the beacon/skeleton syncer needs to stop, swap out its internal state to the new head, and restart. If the downloader is busy importing blocks in the meantime, it will prevent the beacon syncer from completing its restart loop until all queued blocks are consumed.

In other words, we want to stop and restart our "skeleton syncer". While it is in the process of shutting down, we cannot bother with new CL updates, because that would only prolong the shutdown and lead to memory problems. So we just drop them on the floor and return the error "beacon syncer reorging".
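
As a rough illustration of the behaviour described above (not Geth's actual code), here is a minimal Go sketch of a syncer that rejects head updates while it is restarting. The names `toySyncer`, `Update`, `BeginReorg`, `FinishReorg`, and `errSyncReorging` are all hypothetical; only the error text is taken from the logs in this issue.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errSyncReorging reuses the error text seen in the logs. The variable
// name is hypothetical and not taken from the Geth source.
var errSyncReorging = errors.New("beacon syncer reorging")

// toySyncer is a hypothetical stand-in for the skeleton syncer. It only
// tracks a head number and whether a restart is in progress.
type toySyncer struct {
	mu       sync.Mutex
	reorging bool
	head     uint64
}

// Update models a head announcement from the CL. While a restart is in
// progress the update is dropped and an error is returned instead.
func (s *toySyncer) Update(head uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.reorging {
		return errSyncReorging
	}
	s.head = head
	return nil
}

// BeginReorg marks the start of the stop/restart cycle.
func (s *toySyncer) BeginReorg() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.reorging = true
}

// FinishReorg swaps in the new head and accepts updates again.
func (s *toySyncer) FinishReorg(newHead uint64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.head = newHead
	s.reorging = false
}

func main() {
	s := &toySyncer{}
	fmt.Println(s.Update(100)) // accepted
	s.BeginReorg()
	fmt.Println(s.Update(101)) // dropped: restart in progress
	s.FinishReorg(101)
	fmt.Println(s.Update(102)) // accepted again
}
```

Under this model the error is transient from the CL's point of view: updates sent after the restart completes are accepted again.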

@michaelsproul
Author

Hmm ok, so it seems like the logs above were captured shortly before a full recovery. I'll try to check back in with this user to see if their node synced back up.

Thanks
