
[Feature] Add a CachedReqwestProvider to cache RPC requests using a ReqwestProvider #770

Open
puma314 opened this issue May 22, 2024 · 12 comments
Labels
enhancement New feature or request

Comments

@puma314

puma314 commented May 22, 2024

Component

provider, pubsub

Describe the feature you would like

For use-cases like SP1-Reth or Kona, we often want to execute a (historical) block, but we don't have the entire state in memory, so we execute the block with a ProviderDb that fetches accounts, storage, etc. over RPC. Fetching from the network is slow and often takes minutes for all the accesses required by an entire block.

We often re-run these blocks to debug things or tune performance, and each iteration is slow because it has to wait for all the network requests again. It would be nice to add a very simple caching layer on top of ReqwestProvider that caches the results of RPC calls to a file (or some other easy-to-set-up format) and checks the cache before sending a network request.

This would speed up iteration time for use-cases like Kona and SP1-Reth tremendously.

An interface like this might make sense:

let provider = ReqwestProvider::new_http(rpc_url).cache("my_file.txt")

In our case, we are usually querying old blocks (not near the tip of the chain), so re-org awareness is not important for our use-case. We just want a really simple caching layer.
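
For illustration, a minimal sketch of that cache-first flow (all types and methods here are hypothetical, not existing alloy APIs):

use std::collections::HashMap;

// Hypothetical sketch of the cache-first flow described above; none of
// these types or methods exist in alloy today.
struct CachedReqwestProvider {
    // (method, params) key -> raw JSON response
    cache: HashMap<String, String>,
    // the inner ReqwestProvider would live here
}

impl CachedReqwestProvider {
    fn cached_call(&mut self, key: &str, fetch: impl FnOnce() -> String) -> String {
        if let Some(hit) = self.cache.get(key) {
            return hit.clone(); // cache hit: no network round trip
        }
        let resp = fetch(); // cache miss: fall through to the RPC
        self.cache.insert(key.to_string(), resp.clone());
        resp
    }
}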

Additional context

No response

@puma314 puma314 added the enhancement New feature or request label May 22, 2024
@gakonst
Member

gakonst commented May 22, 2024

Could this be a tower layer?

Seeing https://docs.rs/tower/latest/tower/ready_cache/cache/struct.ReadyCache.html - cc @mattsse does this work?
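
For context, a tower layer wraps one Service in another; a minimal sketch of that shape, with the caching internals left hypothetical and elided:

use tower::Layer;

// Minimal sketch of the tower layering shape: a Layer wraps a Service,
// producing a new Service. The caching internals here are hypothetical.
struct CachingLayer;

struct CachingService<S> {
    inner: S,
    // a request -> response cache would live here
}

impl<S> Layer<S> for CachingLayer {
    type Service = CachingService<S>;

    fn layer(&self, inner: S) -> Self::Service {
        CachingService { inner }
    }
}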

@puma314
Author

puma314 commented May 23, 2024

I'm not fully sure I understand how tower works, but noting that we'd want to save things to a file so they're persisted across instantiations (and not just keep the cache in memory, for example).

@prestwich
Member

We won't add caching at the transport layer via tower, because caching (unlike rate limiting or retrying) needs to be aware of the RPC semantics, and potentially of the provider heartbeat task, so that it can invalidate caches on new blocks and reorgs. This means it needs to be an alloy_provider::Layer at the provider level, producing a CachingProvider<P, T, N>, rather than a tower::Layer producing a CachingTransport<T>.

This is blocked by #736 (which is pretty straightforward to resolve)

Is the use case here making a high volume of requests against specific deep historical states? It sounds like you actually don't want to cache to a file. You want an in-memory cache that is persisted to a file when your program stops? I'm generally not in favor of caching to/from a file directly: responses get invalidated regularly, fs access degrades performance, and the target alloy user doesn't have an archive node and doesn't make queries against deep state. Would it be enough to have the cache internals be (de)serializable, plus a way to instantiate the cache with data in it?
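
A rough sketch of the provider-level shape this implies, using simplified stand-in types rather than alloy's actual API:

use std::collections::HashMap;

// Simplified stand-ins, not alloy's actual API: a provider-level cache can
// key on RPC semantics and be invalidated by chain events, which an opaque
// transport-level wrapper cannot do.
struct CachingProvider<P> {
    inner: P,
    // (method, params, block tag) -> response
    cache: HashMap<String, String>,
}

impl<P> CachingProvider<P> {
    // Would be driven by the provider heartbeat task on new heads/reorgs.
    fn on_chain_event(&mut self, reorged: bool) {
        if reorged {
            // Cached responses at non-final heights may now be stale.
            self.cache.clear();
        }
    }
}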

@gakonst
Member

gakonst commented May 23, 2024

This means it needs to be an alloy_provider::Layer at the provider level, producing a CachingProvider<P, T, N>, rather than a tower::Layer producing a CachingTransport<T>.

Good point, supportive.

It sounds like you actually don't want to cache to a file. You want an in-memory cache that is persisted to a file when your program stops?

@puma314 basically this means:

  1. first run, you start with no cache file on disk
  2. first request goes to the RPC and gets cached
  3. second request goes to the cache
  4. when you Ctrl+C, the cache's Drop impl gets called, persisting everything to disk (sketched below)
  5. when you start the process again, either the entire file is loaded into memory or the data is loaded "just in time" from the file; either would work, I think
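
A minimal sketch of step 4 (FileCache is a hypothetical type; serde_json is assumed for the on-disk format):

use std::collections::HashMap;

// Hypothetical sketch of step 4: persist the cache in Drop. Note that a
// bare Ctrl+C (SIGINT) normally kills the process without running
// destructors, so this only fires on ordinary scope exit unless a signal
// handler is installed.
struct FileCache {
    path: String,
    entries: HashMap<String, String>,
}

impl Drop for FileCache {
    fn drop(&mut self) {
        // Best-effort only: errors here cannot be surfaced to the caller,
        // which is the fallibility concern raised below.
        if let Ok(json) = serde_json::to_string(&self.entries) {
            let _ = std::fs::write(&self.path, json);
        }
    }
}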

@puma314
Author

puma314 commented May 23, 2024

Yup, that sounds great. @prestwich our use-case is that we are querying getProof and getStorage on blocks potentially hours or more in the past (so blocks that are well past the reorg window). We are using this for generating a ZKP, and we wouldn't want to generate a ZKP for a block that could be reorged, if that makes sense.

@gakonst's suggestion looks great to me as a potential devex.

@prestwich
Member

when you Ctrl+C, the cache's Drop impl gets called, persisting everything to disk

Serialization and fs ops are fallible and can't be reliably used in a Drop, so I wouldn't recommend this approach.

More broadly though, a filesystem-backed cache of finalized responses is not broadly applicable, and requires us to make decisions about the user's fs. I am not in favor of including it in the main alloy crates. A memory cache that can be loaded from the fs at runtime and serialized to the fs on demand is applicable to a lot of users, and could live in the main provider crate. Would that fit your need?

Assuming you're running your own infra, the need may also be better served by accessing the reth db or static files directly? If running alongside reth, retrieving proofs and then storing them to the file system duplicates data that's already in the file system, no?

@gakonst
Member

gakonst commented May 23, 2024

Serialization and fs ops are fallible and can't be reliably used in a Drop, so I wouldn't recommend this approach.
More broadly though, a filesystem-backed cache of finalized responses is not broadly applicable, and requires us to make decisions about the user's fs. I am not in favor of including it in the main alloy crates.

I've used this method multiple times before for debugging (e.g. in MEV Inspect) and it's generally been fine, so I personally don't worry about the fallibility, but I'm OK with doing this as a separate crate.

A memory cache that can be loaded from the fs at runtime and serialized to the fs on demand is applicable to a lot of users, and could live in the main provider crate. Would that fit your need?

How should the cache be populated in this case? Still via ProviderLayer, where each method populates an LRU of the data on cache miss? And is it the responsibility of the user to flush the cache to disk?

Assuming you're running your own infra, the need may also be better served by accessing the reth db or static files directly? If running alongside reth, retrieving proofs and then storing them to the file system duplicates data that's already in the file system, no?

Proofs aren't part of the Reth DB; they get generated on the fly, so I don't think this would work.

@puma314
Author

puma314 commented May 23, 2024

A memory cache that can be loaded from the fs and saved to the fs would work for me. I'm not running my own infra in this case; the point is that, for basically any chain, we can get all the storage slots & proofs for running a block in a zkVM without needing a local synced node for that chain. It's a lot lower friction if we can just plug in an RPC vs. having to sync a reth instance. (Also, I'm not sure if reth has getProof implemented yet.)

let mut cache = MemoryCache::load("file.txt");
let provider = ReqwestProvider::new_http(...).with_cache(&cache);
// do stuff with provider
cache.save("file.txt");

seems totally fine to me.
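
Fleshed out, that MemoryCache could be as small as this sketch (hypothetical type; serde_json assumed for the file format):

use std::collections::HashMap;
use std::fs;

// Hypothetical MemoryCache matching the sketch above: explicit load and
// save, no reliance on Drop. serde_json is assumed for the on-disk format.
#[derive(Default)]
struct MemoryCache {
    entries: HashMap<String, String>,
}

impl MemoryCache {
    // Start from the file if it exists and parses; otherwise start empty.
    fn load(path: &str) -> Self {
        let entries = fs::read_to_string(path)
            .ok()
            .and_then(|s| serde_json::from_str(&s).ok())
            .unwrap_or_default();
        Self { entries }
    }

    // Explicit, fallible save, so the caller sees any fs/serialization error.
    fn save(&self, path: &str) -> std::io::Result<()> {
        let json = serde_json::to_string(&self.entries)
            .map_err(|e| std::io::Error::new(std::io::ErrorKind::Other, e))?;
        fs::write(path, json)
    }
}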

@gakonst
Member

gakonst commented May 24, 2024

SG re: the API above! Confirming: if you do stuff with the provider that hits the actual backend and not the cache, the new file.txt should include 1) all the requests which were not cached before, and 2) all the previous contents of the cache?

eth_getProof is implemented in Reth, but not the historical variant with arbitrary lookback, due to limitations of the Erigon DB design, which we inherit.

@prestwich
Member

I've used this method multiple times before for debugging (e.g. in MEV Inspect) and it's generally been fine, so I personally don't worry about the fallibility, but I'm OK with doing this as a separate crate.

Panics in drops cause aborts; you can do it, but it's not a decision we want to make on behalf of all users, as we don't know what conditions they're running in.

A memory cache that can be loaded from the fs and saved to the fs would work for me. I'm not running my own infra in this case; the point is that, for basically any chain, we can get all the storage slots & proofs for running a block in a zkVM without needing a local synced node for that chain. It's a lot lower friction if we can just plug in an RPC vs. having to sync a reth instance. (Also, I'm not sure if reth has getProof implemented yet.)

let mut cache = MemoryCache::load("file.txt");
let provider = ReqwestProvider::new_http(...).with_cache(&cache);
// do stuff with provider
cache.save("file.txt");

seems totally fine to me.

instantiation should run through the builder API, so the sketch here is something like:

/// Cache object
struct Cache { ... }

/// Caching configuration object
struct CachingLayer { cache: Option<Cache>, /* other fields? */ }

/// Provider with cache
struct CachingProvider<P, N, T> { inner: P, cache: Cache }

let provider = builder.layer(CachingLayer::from_file("file.txt")?).http(url);
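
Filling in the layering piece with simplified stand-ins (the real provider layering trait in alloy is also generic over the transport and network):

use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Simplified stand-in for alloy's provider layering trait; the real one is
// also generic over the transport T and network N.
trait ProviderLayer<P> {
    type Provider;
    fn layer(&self, inner: P) -> Self::Provider;
}

// Shared, thread-safe cache handle, so it can still be saved after use.
type SharedCache = Arc<Mutex<HashMap<String, String>>>;

struct CachingLayer {
    cache: SharedCache,
}

struct CachingProvider<P> {
    inner: P,
    cache: SharedCache,
}

impl<P> ProviderLayer<P> for CachingLayer {
    type Provider = CachingProvider<P>;

    fn layer(&self, inner: P) -> Self::Provider {
        CachingProvider { inner, cache: Arc::clone(&self.cache) }
    }
}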

do you have a ballpark for the number of proofs etc. you intend to cache?

@puma314
Author

puma314 commented May 28, 2024

I think we would need low 100s of proofs per block, since it's all the accounts/state touched during a block.

@prestwich
Member

so I think actionable steps for implementing this are:

  • continue feature: ProviderCall #788
  • develop cache invalidation policies
    • make a chart of each rpc endpoint with its safe caching point (see the sketch below)
    • e.g. "chainId" is safe to cache at "latest", while "getProof" is safe to cache at "final" or older
    • how is caching handled for block number/hash?
    • does the caching provider make an extra request to determine whether the block is sufficiently old to cache the response?
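
A sketch of what that per-endpoint chart could look like in code; the method names are real JSON-RPC endpoints, but the policy assignments are illustrative, not a vetted list:

// Illustrative per-endpoint caching policy; the assignments below are
// examples from this thread, not a vetted list.
enum CacheAfter {
    Latest,    // safe to cache as soon as the response is observed
    Finalized, // only cache once the queried block is finalized
    Never,     // depends on live state; don't cache
}

fn cache_policy(method: &str) -> CacheAfter {
    match method {
        "eth_chainId" => CacheAfter::Latest,
        "eth_getProof" | "eth_getStorageAt" => CacheAfter::Finalized,
        "eth_blockNumber" | "eth_gasPrice" => CacheAfter::Never,
        // Conservative default until the endpoint is charted.
        _ => CacheAfter::Never,
    }
}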
