core/state/snapshot: reuse memory data instead of hitting disk when generating #22667
Conversation
If I understand this PR correctly, you are adding a new cache layer into the iterator, so that while traversing the trie, if anything is in memory already (in your bastard-trie) then we just read that, but otherwise consider it the same as if it were on disk already?
That is correct. We just resolve from a primary source (which is fully in memory), and if that fails, resolve from the secondary source (the disk trie).
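To make that two-tier lookup concrete, here is a minimal sketch of the resolution order described above. The nodeReader interface and fallbackReader type are illustrative assumptions, not the PR's actual types (in the PR the change lives inside the trie node iterator).

```go
package resolver

// nodeReader abstracts anything that can return a raw trie node by hash.
// Both the interface and fallbackReader below are illustrative stand-ins,
// not go-ethereum's actual types.
type nodeReader interface {
	Node(hash [32]byte) ([]byte, error)
}

// fallbackReader resolves from a fully in-memory primary source first and
// only falls back to the disk-backed secondary source on a miss.
type fallbackReader struct {
	primary   nodeReader // in-memory trie nodes rebuilt from snapshot data
	secondary nodeReader // disk-backed trie database
}

func (r *fallbackReader) Node(hash [32]byte) ([]byte, error) {
	if blob, err := r.primary.Node(hash); err == nil && len(blob) > 0 {
		return blob, nil // the vast majority of lookups are served from memory
	}
	return r.secondary.Node(hash) // rare miss: hit the disk trie
}
```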
I've pushed the minor change we discussed on top: using an ethdb.KeyValueStore instead of a trie.Database, since the former is easy to construct from plain key/value data (potentially useful elsewhere too), whereas the latter can't be built from just a random soup of trie nodes.
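As an illustration of why the key/value store is the lighter dependency here, the helper below (a sketch, not part of the PR; the newNodeSoup name is made up) fills a throwaway in-memory ethdb.KeyValueStore with an arbitrary bag of RLP-encoded trie nodes keyed by their hashes, which is all the resolver needs as its primary source.

```go
package resolver

import (
	"github.com/ethereum/go-ethereum/core/rawdb"
	"github.com/ethereum/go-ethereum/crypto"
	"github.com/ethereum/go-ethereum/ethdb"
)

// newNodeSoup is a hypothetical helper: it dumps an arbitrary collection of
// RLP-encoded trie nodes into an in-memory key/value store, keyed by hash.
// No complete, committed trie structure is required.
func newNodeSoup(nodes [][]byte) ethdb.KeyValueStore {
	db := rawdb.NewMemoryDatabase()
	for _, blob := range nodes {
		db.Put(crypto.Keccak256(blob), blob)
	}
	return db
}
```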
The PR is a bit more complicated than I'd like, but an essential detail is that all simpler variations placed the memcache below the live trie.Database, which is problematic, as we really don't want potentially non-persisted data to end up in the trie.Database clean caches.
Why isn't generation concurrent, even though we are doing a lot of async IO? From what I understand, the purpose of using generate range is to avoid building the Trie for the entire state, which would OOM. However, how does one decide which Trie nodes to delete when using the partially-built Trie as a memcache? Is there a pruning mechanism to do this easily?

Further, I am writing the following tool: an async Trie resolver. The model is that one submits keys to fetch from the asynctrieresolver; the resolver fetches the data asynchronously and stores the result. Notice that since async IO is always the bottleneck when it comes to Trie operations, if one only requires eventual consistency, this resolver is sufficient. Maintaining sequential consistency in the order in which the operations are called is also a difficulty (but this can be resolved with a sequential applyOperation queue). The guarantees are: 1) thread safety on copy (regardless of what copy calls). A programming example is as follows:
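What follows is only a hypothetical sketch of the API being described; the AsyncTrieResolver type, its Submit/Wait methods and the fetch callback are assumptions made for illustration, not the actual tool.

```go
package resolver

import "sync"

// AsyncTrieResolver sketches the model described above: callers submit keys,
// worker goroutines resolve them asynchronously (the slow, IO-bound part),
// and the results accumulate in a shared cache.
type AsyncTrieResolver struct {
	mu      sync.Mutex
	cache   map[string][]byte
	pending sync.WaitGroup
	fetch   func(key []byte) []byte // disk-backed resolution, assumed slow
}

func NewAsyncTrieResolver(fetch func(key []byte) []byte) *AsyncTrieResolver {
	return &AsyncTrieResolver{cache: make(map[string][]byte), fetch: fetch}
}

// Submit schedules a key for asynchronous resolution and returns immediately.
func (r *AsyncTrieResolver) Submit(key []byte) {
	r.pending.Add(1)
	go func() {
		defer r.pending.Done()
		val := r.fetch(key) // async IO happens off the caller's thread
		r.mu.Lock()
		r.cache[string(key)] = val
		r.mu.Unlock()
	}()
}

// Wait blocks until every submitted key has resolved and returns the cache;
// only eventual consistency is guaranteed, as noted above.
func (r *AsyncTrieResolver) Wait() map[string][]byte {
	r.pending.Wait()
	return r.cache
}
```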
In the concurrent tryGet model (rather than the Trie-building model), as is the case in snapshot generation, one can skip building the Trie altogether past a certain depth and go straight to using the resolver.

Question: does this really work under concurrent mutation? Yes, since walking for hash resolution is stable under mutation. A cached resolved node is "live" for two purposes: to update the in-memory Trie, and to cache intermediate paths for async resolver workers to walk the Trie. The sequential update thread only queries for a HashNode if it does not yet live in memory. If it does not live in memory, then it did not live along any path of mutation. Thus, whenever this node is queried by the sequential updater, it is the correct, unmodified version. One can strictly enforce this by having only the updater thread mark a node as live.

In essence, one can replace the prefetcher entirely using this async approach, provided the order in which operations are applied does not matter (unfortunately, for us it most of the time does).
core/state/snapshot: reuse memory data instead of hitting disk when generating (ethereum#22667)

* core/state/snapshot: reuse memory data instead of hitting disk when generating
* trie: minor nitpicks wrt the resolver optimization
* core/state/snapshot, trie: use key/value store for resolver
* trie: fix linter

Co-authored-by: Péter Szilágyi <peterke@gmail.com>
This PR is a successor to rjl493456442#6 (some charts can be found there).
This PR attempts to minimize the number of disk reads when doing the snapshot generation.
When we have a slice of snapshot values which do not match the trie data, we currently iterate the disk to retrieve the canonical data.
What this PR does is commit the (incorrect) snapshot data to a trie, which will be 99% correct. When iterating the trie, we then use the snapshot-trie database for resolving hashes. By doing so, we can read 99% of the leaves from the memory db instead of resolving them from disk.
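A schematic of that flow is sketched below; commitTrie, nodeStore and the disk callback are hypothetical stand-ins for illustration, not geth's real API. The snapshot values are committed into a trie whose nodes land in an in-memory store, and hash resolution prefers that store.

```go
package snapshotgen

// kv is a snapshot key/value pair; nodeStore maps node hash -> RLP blob.
// Everything here is a hypothetical stand-in for illustration only.
type kv struct{ key, val []byte }
type nodeStore map[string][]byte

// buildMemTrie commits the (mostly correct) snapshot values into a trie and
// captures the resulting nodes in an in-memory store. commitTrie is assumed
// to write every generated node into the store and return the root hash.
func buildMemTrie(snapVals []kv, commitTrie func([]kv, nodeStore) []byte) (nodeStore, []byte) {
	mem := make(nodeStore)
	root := commitTrie(snapVals, mem)
	return mem, root
}

// resolveNode serves ~99% of lookups from memory; only nodes where the
// snapshot disagreed with the canonical trie fall through to disk.
func resolveNode(mem nodeStore, disk func(hash []byte) []byte, hash []byte) []byte {
	if blob, ok := mem[string(hash)]; ok {
		return blob
	}
	return disk(hash)
}
```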
It can also be implemented differently (this was my third variation). Other variants which also "work" place the data in the cleans cache of the existing trie.Database. The upside of this particular way of implementing it is that most of the modifications are performed in the node iterator, and it does not touch the trie database.
This approach can probably be further polished/optimized if we retain the trie used by the prover, instead of rebuilding it once again -- but this PR focuses on the disk access part, not so much on getting the CPU cycles down.