core/state/snapshot: ensure Cap retains a min number of layers #22331
This is a relatively tiny change with non-obvious refactors.
Fixes an issue where the snapshot-based state pruner fails due to not having enough diff layers.
The snapshots have a `Cap` functionality, which takes the snapshot tree (disk layer + diff layer tree on top) and limits it to a requested number of layers, merging anything above the limit downwards. The goal of `Cap` is to ensure that we always have the state of the 128 most recent blocks available in memory, while anything above that can be pushed to disk. It turns out the definition of "retain 128 layers" is not that obvious.

Before diving into the problem, an important detail is that although snapshots have 2 types of layers (`diskLayer` backed by the database and `diffLayer` backed by memory), the code actually differentiates the bottom-most `diffLayer` from the ones on top. In theory, every time we have more than 128 diff layers, we could push the bottom ones to disk. This, however, would thrash the database, since each new block would push some old diff to disk. In practice, whenever the layer cap is exceeded, we accumulate the writes into the bottom-most diff layer. Only when this bottom diff layer (the accumulator) exceeds a threshold of 4MB is it pushed to disk.
The catch-22 is that we need to merge data into the bottom diff layer (the accumulator) first, and only then will we know if it overflows. This means that when we request 128 layers from `Cap`, it does not know in advance whether the accumulator will survive, and as such is unable to retain exactly 128 layers. The previous behavior of `cap` (lower case) was to retain at most 128, but if the accumulator overflowed, it retained only 127.

This is an issue from Geth's perspective, because Geth assumes that there are always 128 recent snapshots available. This invariant is not necessarily broken after a flatten:
`cap({stale disk layer, accumulator layer, diff layers}) == {recent disk layer, diff layers}`; however, seeing just a `{disk layer, diff layers}` set without context, we can't tell whether the disk layer is stale or not (it is the parent of the first diff; we just don't know if the first diff is an accumulator covering a range of blocks or a single block).

Long story short, if Geth needs a guarantee that there are always 128 recent states tracked by the snapshotter, we can't count the accumulator layer toward those 128, because it may or may not survive a `Cap` operation. The naive solution would be to update Geth to maintain 129 layers, but that seems like fixing the symptom instead of the problem.

This PR solves the actual problem by changing `Cap` so that instead of including the accumulator layer (which may or may not survive the cap) in the layer count, it excludes it. The result is that instead of retaining 128 or 127 diff layers, it will now retain 129 by default and occasionally 128.

An interesting aspect is the corner-case handling when requesting 0 or 1 layers. Previously,
`0` flattened everything into the disk layer and `1` flattened everything into the accumulator, potentially ending up with `0` either way if the accumulator became too large. The PR also changes this so that `0` retains the same meaning (otherwise there's no way to flatten everything, and the effect is mostly the same: we get the same root availability), but `1` now means 1 diff layer plus an optional accumulator. It becomes impossible to issue a call that merges everything into the accumulator. I think that's acceptable; these special cases were mostly used by the tests, and we just need to ensure all the strange corner cases are covered.

With these ever-so-slight semantic modifications, the tests needed some extensions and heavier changes to make sure they still behave the way they were intended to.