Ordered trie for computing roots #744

Merged
merged 3 commits into from
Oct 8, 2024

arnetheduck
Member

@arnetheduck arnetheduck commented Oct 7, 2024

Root encoding is on the hot path for block verification both in the consensus (when syncing) and execution clients and oddly constitutes a significant part of resource usage even though it is not that much work.

While the trie code is capable of producing a transaction root and similar feats, it turns out to be quite inefficient, even for small workloads.

This PR brings in a helper for the specific use case of building tries of lists of values whose key is the RLP-encoded index of the item.

As it happens, such keys follow a particular structure where items end up "almost" sorted, with the exception of the item at index 0, which gets encoded as [0x80], i.e. the empty byte string, and therefore sorts after index 127 (0x7f) rather than first.
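The key ordering described above can be checked with a small sketch. It is written in Python for illustration only (the PR itself is not in Python), and `rlp_encode_uint` and `key_order` are hypothetical helper names, not functions from the library:

```python
def rlp_encode_uint(n: int) -> bytes:
    """RLP-encode a non-negative integer as a byte string (illustrative helper)."""
    if n == 0:
        return b"\x80"  # zero is encoded as the empty byte string
    payload = n.to_bytes((n.bit_length() + 7) // 8, "big")
    if len(payload) == 1 and payload[0] < 0x80:
        return payload  # a single byte below 0x80 encodes as itself
    # Short-string form: 0x80 + length, then the payload.
    # Valid for payloads under 56 bytes, which covers any practical index.
    return bytes([0x80 + len(payload)]) + payload


def key_order(count: int) -> list:
    """Return item indexes in the order their RLP-encoded keys sort."""
    return sorted(range(count), key=rlp_encode_uint)


order = key_order(200)
print(order[:3], order[126:130])
```

Under these assumptions, indexes 1 to 127 come first, index 0 follows them, and the remaining indexes stay in ascending order.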

Armed with this knowledge and the understanding that inserting ordered items into a trie can easily be done with a simple recursion, this PR brings a ~100x improvement in CPU usage (360 ms vs 33 s) and a ~50x reduction in memory usage (70 MB vs >3 GB!) for the simple test of encoding 1,000,000 keys.
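To illustrate why sorted input makes the recursion simple, here is a minimal Python sketch that builds the node structure of a trie from keys that are already sorted. It is not the PR's implementation: it computes no hashes, keeps everything in memory, and assumes that no key is a prefix of another (which holds for RLP-encoded indexes). The names `to_nibbles` and `build` are hypothetical.

```python
def to_nibbles(key: bytes) -> list:
    """Split each byte into its high and low 4-bit nibble."""
    out = []
    for b in key:
        out.extend((b >> 4, b & 0x0F))
    return out


def build(items, depth=0):
    """Build a nested trie from (nibbles, value) pairs sorted by nibbles.

    All items are assumed to share nibbles[:depth], and no key may be a
    prefix of another key.
    """
    if len(items) == 1:
        nibbles, value = items[0]
        return ("leaf", tuple(nibbles[depth:]), value)
    # Because the items are sorted, the common prefix of the whole group
    # is the common prefix of its first and last key.
    first, last = items[0][0], items[-1][0]
    shared = depth
    while shared < min(len(first), len(last)) and first[shared] == last[shared]:
        shared += 1
    if shared > depth:
        return ("extension", tuple(first[depth:shared]), build(items, shared))
    # Sorted input means every child of a branch is a contiguous run.
    children = [None] * 16
    start = 0
    while start < len(items):
        nibble = items[start][0][depth]
        end = start
        while end < len(items) and items[end][0][depth] == nibble:
            end += 1
        children[nibble] = build(items[start:end], depth + 1)
        start = end
    return ("branch", children)


# RLP-encoded keys for indexes 0, 1 and 2.
keys = {0: b"\x80", 1: b"\x01", 2: b"\x02"}
items = sorted((to_nibbles(k), b"item%d" % i) for i, k in keys.items())
root = build(items)
```

Each recursive call only ever looks at one contiguous slice of the sorted input, which is what allows a real implementation to process items as they arrive instead of holding a full trie.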

In part, the memory usage reduction is due to a trick where the hash of each item is computed as the item is being added, so the value itself does not need to be stored.

Further reductions are possible, such as maintaining a hasher per level instead of storing hash values, and using a direct-to-hash RLP encoder.

@arnetheduck arnetheduck merged commit 00c91a1 into master Oct 8, 2024
18 checks passed
@arnetheduck arnetheduck deleted the ordered-trie branch October 8, 2024 18:03
@etan-status
Contributor

This does not work well when there are gaps among the keys, e.g. due to [] values, which are not supported inside an MPT: setting a key to [] is equivalent to deleting it, so the key does not exist at all rather than being set to an empty value.

See ethereum/consensus-specs#3885 which now finally disallows this for transactions, but there may still be other uses for having gaps in the keys.

If the gap case is intentionally unsupported, there should be an assert, comment, or log to make this explicit. The old logic (before the ordered trie) worked correctly when gaps were present.
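As a rough illustration of the kind of explicit check suggested here, the following Python sketch rejects empty values before they reach the trie builder. `check_no_gaps` is a hypothetical name, and whether the library should raise, assert, or log is left open:

```python
def check_no_gaps(values):
    """Raise if any value is empty, since an empty value would leave a gap in the keys."""
    for index, value in enumerate(values):
        if len(value) == 0:
            raise ValueError(
                f"empty value at index {index}: key gaps are not supported"
            )
    return values


try:
    check_no_gaps([b"tx0", b"", b"tx2"])
except ValueError as err:
    message = str(err)
    print(message)
```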
