What CL behaviour would trigger `beacon syncer reorging`? #27962

michaelsproul · 2023-08-22T02:38:34Z

We have a report from a Lighthouse user that their Geth node is failing to sync with "beacon syncer reorging" errors, despite no apparent re-orgs on the CL side.

I only have a short excerpt of logs so far:

Aug 21 19:21:29 NUC-1 lighthouse[37902]: Aug 22 02:21:29.001 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.006 INFO Syncing                                 est_time: 10 mins, speed: 0.12 slots/sec, distance: 76 slots (15 mins), peers: 108, service: slot_notifier
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.007 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.441 WARN Execution engine call failed            error: ServerMessage { code: -32000, message: "beacon syncer reorging" }, service: exec
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.441 WARN Error whilst processing payload status  error: Api { error: ServerMessage { code: -32000, message: "beacon syncer reorging" } }, service: exec
Aug 21 19:21:41 NUC-1 lighthouse[37902]: Aug 22 02:21:41.442 CRIT Failed to update execution head         error: ExecutionForkChoiceUpdateFailed(EngineError(Api { error: ServerMessage { code: -32000, message: "beacon syncer reorging" } })), service: beacon
Aug 21 19:21:53 NUC-1 lighthouse[37902]: Aug 22 02:21:53.002 INFO Syncing                                 est_time: 10 mins, speed: 0.12 slots/sec, distance: 77 slots (15 mins), peers: 102, service: slot_notifier
Aug 21 19:21:53 NUC-1 lighthouse[37902]: Aug 22 02:21:53.003 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:22:05 NUC-1 lighthouse[37902]: Aug 22 02:22:05.006 INFO Syncing                                 est_time: 5 mins, speed: 0.23 slots/sec, distance: 73 slots (14 mins), peers: 107, service: slot_notifier
Aug 21 19:22:05 NUC-1 lighthouse[37902]: Aug 22 02:22:05.008 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.001 INFO Syncing                                 est_time: 4 mins, speed: 0.25 slots/sec, distance: 69 slots (13 mins), peers: 110, service: slot_notifier
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.011 WARN Syncing deposit contract block cache    est_blocks_remaining: initializing deposits, service: slot_notifier
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.550 WARN Execution endpoint is not synced        last_seen_block_unix_timestamp: 0, endpoint: http://127.0.0.1:8551/, auth=true, service: deposit_contract_rpc
Aug 21 19:22:17 NUC-1 lighthouse[37902]: Aug 22 02:22:17.551 ERRO Error updating deposit contract cache   error: Failed to get remote head and new block ranges: EndpointError(FarBehind), retry_millis: 60000, service: deposit_contract_rpc

The user seems to be running the latest versions of Lighthouse (v4.3.0) and Geth (v1.12.2), and hasn't missed a hard-fork as far as I know. Their Lighthouse node appears to be only a short distance from the head.

Is the error from Geth indicating that the CL is providing blocks from different chains that conflict with Geth's view of finalization? Something else?


michaelsproul · 2023-08-22T02:39:49Z

cc @holiman 🙏 (sorry to bug you, but I've enjoyed our previous discussions of sync intricacies 😁 )

holiman · 2023-08-22T07:23:57Z

The `beacon syncer reorging` error was introduced fairly recently, in #27397.

When a reorg happens, the beacon/skeleton syncer needs to stop, swap out its internal state to the new head, and restart. If the downloader is busy importing blocks in the meantime, it will prevent the beacon syncer from completing its restart loop until all queued blocks are consumed.

In other words, we want to stop/restart our "skeleton syncer". While we are in the process of shutting that down, we cannot bother with new CL updates, because that would only prolong the shutdown and lead to memory problems. So we just drop them on the floor and issue the message `beacon syncer reorging`.

michaelsproul · 2023-08-22T09:00:19Z

Hmm ok, it seems like the logs above should be shortly prior to a full recovery then. I'll try to check back in with this user to see if their node synced back up.

Thanks

michaelsproul added the type:docs label Aug 22, 2023

MariusVanDerWijden closed this as completed Aug 22, 2023
