Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Save all visited nodes into cache on block level #5920

Closed
Longarithm opened this issue Dec 20, 2021 · 6 comments
Closed

Save all visited nodes into cache on block level #5920

Longarithm opened this issue Dec 20, 2021 · 6 comments
Labels
A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) P-medium Priority: medium

Comments

@Longarithm
Copy link
Member

Longarithm commented Dec 20, 2021

Idea
[Investigation](https://near.zulipchat.com/#narrow/stream/308695-near-eng-private/topic/trie.20depth/near/263925418) of some Aurora call shows that 30-40% gas is spent on touching trie node, and 38% of these nodes are top-level ones. So we just charge for taking the same nodes from DB again and again, which is suboptimal.

At the same time, in TrieUpdate we store committed and prospective changes which work as cache with key-value modificaions in the current block. If we save all intermediate nodes, we could theoretically save ~13% of current Aurora costs.

Caveats

  • Need to modify Trie internally - e.g. save all intermediate nodes visited in Trie::lookup
  • Need to check how much RAM is used and which cost we need to charge for cached nodes
  • Key-value modifications are simple to cache, but we may need to be more careful with caching nodes
  • As a protocol modification, it needs to be as simple as possible.
  • We already have TrieCache storing recent touched nodes. However, it is not accounted by the protocol.
  • Somewhere in historical PRs it was mentioned that we have a strong assumption that TrieUpdate must store updates only for one block (I may be imprecise here). So it can't be easily extended

Additional context
Because of the last caveat, we could consider implementing cache on transaction level, e.g. LRU cache storing last nodes queried by last X transactions. It makes more sense, but should be implemented from scratch and it also should be as simple as possible.

@Longarithm Longarithm added A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) P-medium Priority: medium labels Dec 20, 2021
@austinabell
Copy link
Contributor

It May or not be relevant, but with the new SDK collections, accesses are cached to avoid duplicate access, so this change might only be beneficial for Aurora and anyone accessing storage manually. I suppose that if the accesses are cached per block, it may help if multiple other txs access duplicate keys, but unsure if the overhead of having the cache and potential increase of cost of this would offset.

@Longarithm
Copy link
Member Author

Hey @austinabell, this is interesting! Could you give some examples?

I'm curious how it can be cached, because if cache touches contract storage, I don't see how it can avoid touching trie nodes.

@austinabell
Copy link
Contributor

Hey @austinabell, this is interesting! Could you give some examples?

I'm curious how it can be cached, because if cache touches contract storage, I don't see how it can avoid touching trie nodes.

Yeah, so in near_sdk::store we are rebuilding the APIs to match Rust std collections equivalents, which also comes with an intermediate layer for every operation so reads are lazily loaded into memory and writes are done in memory and are only written when they are dropped. Can see some tangible impls in https://github.com/near/near-sdk-rs/tree/master/near-sdk/src/store.

@bowenwang1996
Copy link
Collaborator

Update: some further investigation showed that among all the trie nodes touched, only 10% of them are unique, which means that if we cache all trie nodes visited during the execution of a transaction/receipt, we can potentially save 90% of the touching trie node cost for transactions that do a lot of storage reads and writes. See https://near.zulipchat.com/#narrow/stream/295558-nearinc.2Fprotocol-core/topic/touching.20trie.20node.20counter/near/267074859 for more details.

@Longarithm Longarithm linked a pull request Feb 15, 2022 that will close this issue
@stale
Copy link

stale bot commented Apr 7, 2022

This issue has been automatically marked as stale because it has not had recent activity in the last 2 months.
It will be closed in 7 days if no further activity occurs.
Thank you for your contributions.

@stale stale bot added the S-stale label Apr 7, 2022
@Longarithm
Copy link
Member Author

Done in #6628.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-transaction-runtime Area: transaction runtime (transaction and receipts processing, state transition, etc) P-medium Priority: medium
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants