core, eth, trie: filter out boundary nodes and remove dangling nodes in stacktrie #28327
Conversation
If you first make a standalone PR containing only the
force-pushed from 42b088d to 5f68032
trie/stacktrie.go
Outdated
NoLeftBoundary  bool // Flag whether the nodes on the left boundary are skipped for committing
NoRightBoundary bool // Flag whether the nodes on the right boundary are skipped for committing
Can't we have
leftBoundary []byte // left boundary (may be nil). Must be hex-encoded path to left boundary, if set.
rightBoundary []byte // right boundary (may be nil). Must be hex-encoded path to right boundary, if set.
And also have a SetLeftBoundary(path []byte) which we can invoke when we insert the first item?
In this approach, the left boundary should always be the first entry; namely, the first entry along with its path should be filtered out as a boundary.
We can't use Origin as the boundary while also keeping the first entry, because there is no guarantee that the first entry will be at the same position as the one in the full trie.
The same logic applies to the right boundary: the last entry should also be excluded.
I will try to respond with a more logical explanation.
No, that's fine, I understand the problem and what you are saying. But I don't see why we would need a NoLeftBoundary boolean: we could just let a nil border signify NoLeftBoundary.
So in my suggestion: we would explicitly set the left boundary to the first element we feed it (or, if initiated with a proof, we'd set it to the origin). And vice versa with the right boundary.
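To make the suggestion concrete, here is a minimal sketch of that shape (names are illustrative, not the actual go-ethereum API); a nil boundary simply means nothing is skipped on that side:
import "bytes"

type boundedStackTrie struct {
	left  []byte // hex path of the left boundary; nil means no left boundary
	right []byte // hex path of the right boundary; nil means no right boundary
}

// SetLeftBoundary records the path of the first inserted element (or the
// range origin, when the trie is seeded from a proof).
func (t *boundedStackTrie) SetLeftBoundary(path []byte) {
	t.left = append([]byte{}, path...)
}

// onBoundary reports whether a node being committed lies on either boundary,
// i.e. whether its path is a prefix of a boundary key path.
func (t *boundedStackTrie) onBoundary(path []byte) bool {
	return (t.left != nil && bytes.HasPrefix(t.left, path)) ||
		(t.right != nil && bytes.HasPrefix(t.right, path))
}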
sounds good to me.
Heh, just as I changed my mind and made a PR (#28361) with lastPath in the stacktrie, thus necessitating having the booleans in the options :)
force-pushed from 3915540 to cd40e91
trie/stacktrie.go
Outdated
}
t.last = append([]byte{}, k...)
It would be neat to not have to copy this (since it's an extra copy for every inserted item). I think it's not needed: the k is freshly allocated in this scope (no external access), and even though an stNode may have a reference to it, I don't think it will ever be modified.
Alternatively, we could have t.last be initialized as e.g. a 32-byte slice, and then we could do t.last = append(t.last[:0], k...) or something to that effect.
I am not 100% confident about it, because we do hexToCompactInPlace for extension and leaf nodes, which mutates the key slice directly. And for a leaf node, it will directly keep the passed key slice when we construct it:
case emptyNode: /* Empty */
st.typ = leafNode
st.key = key
st.val = value
But I can definitely go with the second approach, getting rid of the slice allocation.
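For illustration, a minimal sketch of that second approach (type and field names assumed): the buffer is allocated once and reused, so tracking the last key costs no per-insert allocation while still owning its own backing array:
type lastTracker struct {
	last []byte
}

func newLastTracker() *lastTracker {
	// allocate once; 65 covers a 32-byte key as hex nibbles plus terminator
	return &lastTracker{last: make([]byte, 0, 65)}
}

// record copies k into the reused buffer. Copying matters: k may later be
// mutated (e.g. by in-place key compaction), but t.last owns its own array.
func (t *lastTracker) record(k []byte) {
	t.last = append(t.last[:0], k...)
}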
Personally I don't like the hexToCompactInPlace optimization, because it might introduce some nasty issues which are pretty hard to debug.
I agree!
force-pushed from 005d8d0 to 9454e72
On the whole, this looks pretty good to me.
eth/protocols/snap/sync.go
Outdated
options = options.WithCleaner(func(path []byte) {
	s.cleanPath(subtask.genBatch, owner, path)
})
options = options.SkipBoundary(true, true, func(path []byte, hash common.Hash, blob []byte) {
The cleaner should only be used in path mode, but the SkipBoundary might as well be used regardless, no?
Theoretically SkipBoundary can be used in hash mode too. But I won't do it now, to avoid any unexpected missing-node issues.
I don't think so, if I understand correctly. If we just receive a part of the storage slots, we will switch to chunk mode anyway. Because we have no idea how many slots are left, we just blindly create 16 chunks to fetch concurrently.
if
We get a new range starting at the last key, ending in
Ah interesting, I missed that mechanism, thanks for pointing it out.
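For context, a rough sketch (pure Go, not geth's actual code) of the blind 16-way split mentioned above: the keyspace remaining after the last received key is divided into equal chunks that can be fetched concurrently.
import "math/big"

// chunkStarts splits the 256-bit keyspace remaining after `last` into 16
// equally sized chunks and returns the starting key of each chunk.
func chunkStarts(last *big.Int) []*big.Int {
	max := new(big.Int).Lsh(big.NewInt(1), 256) // 2^256, end of the keyspace
	space := new(big.Int).Sub(max, last)        // keys still uncovered
	step := new(big.Int).Div(space, big.NewInt(16))
	starts := make([]*big.Int, 16)
	next := new(big.Int).Add(last, big.NewInt(1)) // resume right after last
	for i := range starts {
		starts[i] = new(big.Int).Set(next)
		next.Add(next, step)
	}
	return starts
}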
LGTM
if owner != (common.Hash{}) && rawdb.ExistsStorageTrieNode(s.db, owner, path) {
	rawdb.DeleteStorageTrieNode(batch, owner, path)
	deletionGauge.Inc(1)
}
Out of curiosity, would a blind delete be more expensive than the current check-and-delete?
I think check-and-delete is more expensive. However, the overhead is acceptable, especially once we have the pebble fix (starting to use the bloom filter).
I don't have a strong opinion, but with the current approach we can expose more information via metrics (e.g. how many dangling nodes we actually detect).
It's a good question, and definitely not a straightforward one to answer. A blind delete would put a bunch of tombstones in level 0, so it's definitely not a given that it would be faster; and if it is, it might make other parts slower due to the tombstone processing during e.g. compaction.
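For contrast, a hedged sketch of the two strategies being discussed: the check-and-delete form mirrors this PR's diff above, while the blind-delete form is illustrative.
// check-and-delete: pays one read per path, but only writes a tombstone
// (and bumps the metric) when a stale node is actually present
if rawdb.ExistsStorageTrieNode(s.db, owner, path) {
	rawdb.DeleteStorageTrieNode(batch, owner, path)
	deletionGauge.Inc(1)
}

// blind delete: skips the read, but unconditionally adds a tombstone to
// level 0, which can slow down later reads and compactions
rawdb.DeleteStorageTrieNode(batch, owner, path)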
Co-authored-by: Martin Holst Swende <martin@swende.se>
force-pushed from ae07a53 to 573676a
SGTM
…in stacktrie (ethereum#28327)
* core, eth, trie: filter out boundary nodes in stacktrie
* eth/protocol/snap: add comments
* Update trie/stacktrie.go
Co-authored-by: Martin Holst Swende <martin@swende.se>
* eth, trie: remove onBoundary callback
* eth/protocols/snap: keep complete boundary nodes
* eth/protocols/snap: skip healing if the storage trie is already complete
* eth, trie: add more metrics
* eth, trie: address comment
---------
Co-authored-by: Martin Holst Swende <martin@swende.se>
…g nodes in stacktrie (ethereum#28327)" This reverts commit 6fe5774.
This pull request implements two optional features for stacktrie:
* filter out the boundary nodes during commit;
* remove dangling nodes that fall within the path space claimed by committed nodes.
These two features can be used to enhance snap sync in path mode; a usage sketch follows.
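A hedged usage sketch, assembled from the snippet shown earlier in this conversation (WithCleaner and SkipBoundary come from this PR's diff; the constructor name and surrounding shape are assumptions):
options := trie.NewStackTrieOptions() // assumed constructor
options = options.WithCleaner(func(path []byte) {
	// called for the path space claimed by each committed extension node,
	// so dangling nodes from earlier sync cycles can be removed
})
options = options.SkipBoundary(true, true, func(path []byte, hash common.Hash, blob []byte) {
	// left/right boundary nodes are withheld from committing and reported here
})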
Why filter out boundary nodes?
In snap sync, a large trie is split into several chunks (usually 16) which are synced concurrently. For each chunk, there is a corresponding stack trie that accepts the states within that range and builds the Merkle trie nodes on top.
However, because the states ahead of and behind the range are missing, the generated Merkle trie nodes on the boundary are incomplete: they do not match the nodes in the full trie.
In the example above, the trie is divided into three ranges. The partial tries of these ranges will generate incorrect boundary nodes. For example, node A was originally at a deeper position but has now moved to a higher level; the root node is also incorrect due to missing children.
How to detect boundary nodes?
All nodes along the path of the 'first-inserted-state' node key are considered left-boundary nodes.
All nodes along the path of the 'last-inserted-state' node key are considered right-boundary nodes.
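A minimal sketch of this rule (the helper name is illustrative): a committed node is a boundary node exactly when its path is a prefix of the hex path of the first or last inserted key.
import "bytes"

// isBoundary reports whether the node at nodePath lies on the path from the
// root to the first or last inserted state, i.e. on a range boundary.
func isBoundary(nodePath, firstKey, lastKey []byte) bool {
	return bytes.HasPrefix(firstKey, nodePath) || bytes.HasPrefix(lastKey, nodePath)
}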
How can we guarantee that nodes except the left and right boundaries are all correct?
The states that fall within the range are complete. The first and last states uniquely determine the position of the subtrie which contains all the states in the middle. In other words, the subtrie of the internal states (excluding the first and last) is fully consistent with the corresponding subtrie in the full trie, and the position of this subtrie is uniquely determined by the boundary states.
Therefore, we can conclude that all nodes except the left and right boundaries are correct.
Why remove dangling nodes that fall within the range covered by an extension node?
The snap syncer might sync a storage trie multiple times due to sync-cycle termination and resumption, and the sync target can change because of pivot movement. This can produce a scenario that disrupts state healing.
In cycle 1, the storage trie appears as follows. The snap syncer successfully retrieved the entire storage trie and stored the nodes in the database.
However, cycle 1 was terminated due to a pivot movement at that time, with the storage trie of another account ahead still incomplete. Consequently, this account range will need to be reprocessed in the next cycle and, if the storage trie has changed, resynced.
In cycle 2, the shape of the storage trie changes. Specifically, the short node at [0, 1, 2, 3, 5] is modified and a few node branches are removed, causing nodes M-3 and M-4 to become dangling.
When the snap sync is completed, the state healer begins filling in the missing nodes. If, at that time, the storage reverts to the state it was in during cycle 1, the state healer will stop at M-3. This is because the state healer operates under the assumption that if a node exists, the entire sub-trie should also be present. However, in this case, M-3 is merely a dangling node, and the corresponding sub-trie is incomplete. For instance, the node at path [0, 1, 2, 3, 5] is N-4, which does not match M-3.
Thus, we can conclude that whenever we commit nodes into the database, we must ensure that the entire path space is uniquely occupied, removing all dangling nodes within that path space.
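A hedged sketch of that rule (names are illustrative, not the exact geth code): when an extension node at `path` with key `key` is committed, it alone must own every position inside the path space it covers, so anything stored at an intermediate position is a leftover from a previous cycle and gets deleted.
// cleanDangling removes stale nodes inside the path space covered by a
// freshly committed extension node. exists/del abstract the database access.
func cleanDangling(path, key []byte, exists func([]byte) bool, del func([]byte)) {
	for i := 1; i < len(key); i++ {
		// every strictly shorter position under the extension is covered
		// solely by the extension itself; a node found there is dangling
		sub := append(append([]byte{}, path...), key[:i]...)
		if exists(sub) {
			del(sub)
		}
	}
}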
Overhead analysis
In order to detect dangling nodes, many database reads must be performed. Statistics from a live sync show roughly 950K database reads in total (with nothing detected, which is expected, as dangling nodes are very rare).
The overhead is not trivial, but it is also not significant. We have to live with it to avoid a corruption which, although very unlikely, can occur.