Use persistent data structures in fork choice #2059

michaelsproul · 2020-12-07T07:28:48Z

Issue Addressed

Closes #2028

Proposed Changes

This PR fixes issues with fork choice becoming out of sync with the database after failed writes (the root cause of #2028). It does this using quite a drastic approach: rather than mutating fork choice directly, we take a clone of it, mutate the clone, and then atomically update the original once the database write has succeeded. In order to make the clone cheap(er), we use persistent data structures from the im crate.

In order to make this work, I've added SSZ implementations for Arc and Vector.
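As a rough illustration of the clone, mutate, then commit-on-success flow described above, here is a minimal sketch. All names (`ForkChoice`, `Database`, `import_block`) are hypothetical stand-ins rather than Lighthouse's actual types, and a plain `std::collections::HashMap` is used in place of a persistent map from `im`, so the clone here is a full copy rather than a cheap structural share. The real code would also need a lock around the final swap; the sketch sidesteps that by taking `&mut`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for fork choice state: block root -> weight.
#[derive(Clone, Debug, PartialEq)]
pub struct ForkChoice {
    pub weights: HashMap<u64, u64>,
}

/// Hypothetical stand-in for the on-disk store.
/// `fail_writes` simulates a failing database write.
pub struct Database {
    pub fail_writes: bool,
    pub persisted: Option<ForkChoice>,
}

impl Database {
    pub fn write(&mut self, fc: &ForkChoice) -> Result<(), String> {
        if self.fail_writes {
            return Err("simulated disk write failure".to_string());
        }
        self.persisted = Some(fc.clone());
        Ok(())
    }
}

/// Apply a block to a clone of fork choice, and only replace the
/// original once the database write has succeeded.
pub fn import_block(
    fork_choice: &mut ForkChoice,
    db: &mut Database,
    block_root: u64,
    weight: u64,
) -> Result<(), String> {
    // Work on a copy so a failure leaves the original untouched.
    let mut candidate = fork_choice.clone();
    candidate.weights.insert(block_root, weight);

    // If this fails we return early and `candidate` is simply dropped.
    db.write(&candidate)?;

    // The write succeeded, so publish the new state.
    *fork_choice = candidate;
    Ok(())
}
```

With a persistent map, the `clone()` call would share most of its structure with the original instead of copying every entry, which is the motivation for using `im` in this PR.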

Additional Info

I'm still assessing the performance and memory impact of this change. We should definitely test it for a substantial period of time on our Pyrmont nodes to ensure the performance penalty is tolerable.

michaelsproul · 2020-12-07T11:03:41Z

Bugger, looks like cargo audit is failing for good reason 😭

bodil/im-rs#153

I'll investigate further tomorrow

michaelsproul · 2020-12-07T11:29:11Z

Alternative solutions to consider, in roughly improving order:

1. Commit to disk before doing any of the fork choice checks or modifications (messy?)
2. Clone fork choice before each mutation without making it persistent (slow?)
3. Rollback the in-memory fork-choice to the version from disk if the write fails (maybe OK?)

Or, run on a forked + fixed version of im...

michaelsproul · 2020-12-07T22:37:20Z

It seems to be about 2x slower in the average case too, which is pretty bad. The graph below shows a Pyrmont node syncing with this branch (13:00-16:45), and keeping up with the head. At 18:30 I switched to v1.0.3 for comparison, and you can see average fork choice time dropped from around 40ms to 20ms. At 22:30 I switched back to this branch, and you can see the runtime doubled again. The peaks are also a bit higher.

I'm going to shelve this approach for now and investigate alternatives.

## Issue Addressed

Closes #2028
Replaces #2059

## Proposed Changes

If writing to the database fails while importing a block, revert fork choice to the last version stored on disk. This prevents fork choice from being ahead of the blocks on disk. Having fork choice ahead is particularly bad if it is later successfully written to disk, because it renders the database corrupt (see #2028).

## Additional Info

* This mitigation might fail if the head and fork choice haven't been persisted yet, which can only happen at first startup (see #2067).
* This relies on it being OK for the head tracker to be ahead of fork choice. I figure this is tolerable because blocks only get added to the head tracker after successfully being written to disk _and_ to fork choice, so even if fork choice reverts a little bit, when the pruning algorithm runs, those blocks will still be on disk and OK to prune. The pruning algorithm also doesn't rely on heads being unique; technically it's OK for multiple blocks from the same linear chain segment to be present in the head tracker. This raises the question of #1785 (i.e. things would be simpler with the head tracker out of the way). Alternatively, this PR could just revert the head tracker as well (I'll look into this tomorrow).
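The revert-on-failure idea can be sketched roughly as follows. This is not the code from #2068: `ForkChoiceState`, `Store`, and `import_block_with_revert` are hypothetical names, and the "disk" is an in-memory struct with a flag to simulate a failed write. The point is only to show fork choice being mutated first and then reset to the last persisted copy when the block write fails.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for fork choice: the set of block roots it knows.
#[derive(Clone, Debug, PartialEq)]
pub struct ForkChoiceState {
    pub known_blocks: HashSet<u64>,
}

/// Hypothetical stand-in for the database.
pub struct Store {
    /// When true, `put_block` fails (simulates a disk error).
    pub fail_block_writes: bool,
    pub blocks: HashSet<u64>,
    /// Last fork choice snapshot that reached disk.
    pub persisted_fork_choice: ForkChoiceState,
}

impl Store {
    pub fn put_block(&mut self, root: u64) -> Result<(), String> {
        if self.fail_block_writes {
            return Err("simulated disk write failure".to_string());
        }
        self.blocks.insert(root);
        Ok(())
    }

    pub fn persist_fork_choice(&mut self, fc: &ForkChoiceState) {
        self.persisted_fork_choice = fc.clone();
    }
}

/// Mutate fork choice in memory, then write the block. If the write
/// fails, reset fork choice to the last version stored on disk so it
/// never ends up ahead of the blocks in the database.
pub fn import_block_with_revert(
    fc: &mut ForkChoiceState,
    store: &mut Store,
    root: u64,
) -> Result<(), String> {
    fc.known_blocks.insert(root);

    if let Err(e) = store.put_block(root) {
        // Roll back to the on-disk snapshot.
        *fc = store.persisted_fork_choice.clone();
        return Err(e);
    }
    Ok(())
}
```

Note that the rollback target is whatever was last persisted, so in-memory changes made since that snapshot are discarded along with the failed block. That is the "reverts a little bit" behaviour the second bullet above refers to.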

Use persistent data structures in fork choice

642d904

michaelsproul added the work-in-progress, A0, and t Consensus & Verification labels Dec 7, 2020

michaelsproul closed this Dec 7, 2020

michaelsproul mentioned this pull request Dec 8, 2020

[Merged by Bors] - Revert fork choice if disk write fails #2068

Closed

michaelsproul deleted the persistent-fork-choice branch February 15, 2021 21:58

michaelsproul added the consensus label Nov 9, 2022
