core, internal/ethapi: add and use LRU cache for receipts #17610

Merged
merged 1 commit into from
Sep 29, 2018

Conversation

ryanschneider
Contributor

This PR adds an LRU cache for receipts to BlockChain, following the model of the other LRUs in that object. It also removes the last direct rawdb reads from EthAPIBackend, which now reads receipts through the embedded blockchain object instead.
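
For readers unfamiliar with the pattern, here is a minimal read-through sketch of a per-block receipt LRU. It is not the code from this PR: `Hash`, `Receipt`, `chain`, `receiptsByHash` and the `load` callback are hypothetical stand-ins, and the LRU is hand-rolled on `container/list` only so the example has no dependencies.

```go
package main

import (
	"container/list"
	"fmt"
)

// Hash and Receipt are simplified, hypothetical stand-ins for the real
// go-ethereum types, used only to keep this sketch self-contained.
type Hash [32]byte

type Receipt struct {
	TxIndex uint
	GasUsed uint64
}

// entry is one cached block: its hash plus every receipt in that block.
type entry struct {
	hash     Hash
	receipts []*Receipt
}

// receiptCache is a minimal LRU keyed by block hash.
type receiptCache struct {
	capacity int
	order    *list.List             // front = most recently used
	items    map[Hash]*list.Element // block hash -> element in order
}

func newReceiptCache(capacity int) *receiptCache {
	return &receiptCache{capacity: capacity, order: list.New(), items: make(map[Hash]*list.Element)}
}

func (c *receiptCache) get(h Hash) ([]*Receipt, bool) {
	el, ok := c.items[h]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).receipts, true
}

func (c *receiptCache) add(h Hash, rs []*Receipt) {
	if el, ok := c.items[h]; ok {
		el.Value.(*entry).receipts = rs
		c.order.MoveToFront(el)
		return
	}
	c.items[h] = c.order.PushFront(&entry{hash: h, receipts: rs})
	if c.order.Len() > c.capacity {
		// Evict the least recently used block.
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).hash)
	}
}

// chain is a hypothetical stand-in for BlockChain. load represents the
// expensive database read and decode; loads counts how often it runs.
type chain struct {
	cache *receiptCache
	load  func(Hash) []*Receipt
	loads int
}

// receiptsByHash serves from the cache and falls back to load on a miss.
func (bc *chain) receiptsByHash(h Hash) []*Receipt {
	if rs, ok := bc.cache.get(h); ok {
		return rs
	}
	bc.loads++
	rs := bc.load(h)
	bc.cache.add(h, rs)
	return rs
}

func main() {
	bc := &chain{
		cache: newReceiptCache(32),
		load:  func(Hash) []*Receipt { return []*Receipt{{TxIndex: 0, GasUsed: 21000}} },
	}
	h := Hash{1}
	bc.receiptsByHash(h)
	bc.receiptsByHash(h)
	fmt.Println("database loads:", bc.loads)
}
```

The cache is keyed per block rather than per receipt, so its memory footprint depends on how many receipts each cached block holds.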

I've noticed recently that eth_getLogs performance is suffering, to the point that a node under a moderate amount of RPC load actually gets to a point where it can't keep up with the new blocks coming in over devp2p and eventually reverts to syncing.

When debugging this, I used a less than optimal, but not uncommon, query to get all the ERC-20 transfers between two addresses on mainnet:

{
  "method":"eth_getLogs","id":1,"jsonrpc":"2.0", 
  "params":[{
    "fromBlock":"0x1","toBlock":"latest",
    "topics":[ 
      "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",
      "0x0000000000000000000000009035e9004a4f1e88c296be507c5cea8f99ff8a63", 
      "0x0000000000000000000000004be0cd2553356b4abb8b6a1882325dabc8d3013d"
    ],
    "address":"0x09d8b66c48424324b25754a873e290cae5dca439"
  }]
}
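
For anyone reproducing this, the same request can be built programmatically. The sketch below is only an illustration: the type and function names (`logFilter`, `rpcRequest`, `addressTopic`, `newTransferQuery`) are made up for this example. It shows that topics[0] is the ERC-20 Transfer event signature hash and that topics[1] and topics[2] are the sender and recipient addresses left-padded to 32 bytes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// transferTopic is the topic[0] value used in the query above.
const transferTopic = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"

// logFilter mirrors the filter object passed to eth_getLogs.
type logFilter struct {
	FromBlock string   `json:"fromBlock"`
	ToBlock   string   `json:"toBlock"`
	Address   string   `json:"address"`
	Topics    []string `json:"topics"`
}

// rpcRequest is a minimal JSON-RPC 2.0 envelope.
type rpcRequest struct {
	JSONRPC string      `json:"jsonrpc"`
	ID      int         `json:"id"`
	Method  string      `json:"method"`
	Params  []logFilter `json:"params"`
}

// addressTopic left-pads a 20-byte hex address to a 32-byte topic value.
func addressTopic(addr string) string {
	return "0x" + strings.Repeat("0", 24) + strings.TrimPrefix(addr, "0x")
}

// newTransferQuery builds the full-range Transfer query for one token
// contract and one (from, to) address pair.
func newTransferQuery(token, from, to string) rpcRequest {
	return rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "eth_getLogs",
		Params: []logFilter{{
			FromBlock: "0x1",
			ToBlock:   "latest",
			Address:   token,
			Topics:    []string{transferTopic, addressTopic(from), addressTopic(to)},
		}},
	}
}

func main() {
	req := newTransferQuery(
		"0x09d8b66c48424324b25754a873e290cae5dca439",
		"0x9035e9004a4f1e88c296be507c5cea8f99ff8a63",
		"0x4be0cd2553356b4abb8b6a1882325dabc8d3013d",
	)
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```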

I was surprised to see that even under a moderate level of concurrency (5-20 copies of the RPC above), the node was spending almost 50% of its CPU time decoding receipts:

      flat  flat%   sum%        cum   cum%
         0     0%     0%     50.31s 52.25%  github.com/ethereum/go-ethereum/eth/filters.(*Filter).Logs
         0     0%     0%     50.31s 52.25%  github.com/ethereum/go-ethereum/eth/filters.(*PublicFilterAPI).GetLogs
...
     0.02s 0.021% 0.052%     48.17s 50.03%  github.com/ethereum/go-ethereum/eth.(*EthAPIBackend).GetLogs
     0.01s  0.01% 0.062%     47.86s 49.70%  github.com/ethereum/go-ethereum/core/rawdb.ReadReceipts

Going through the code, I saw that while there are LRU caches for most other items in BlockChain, there isn't one for transaction receipts. I found that adding a large enough LRU cache for receipts drastically improved performance:

  • first request (cold LRU): 5.167s
  • second request (warm LRU, max 256 blocks): 2.640s
  • second request (warm LRU, max 2048 blocks): 0.735s

For this particular query, "large enough" needs to be greater than 547, as that's how many blocks the bloom filter scan lets through. That is in itself a little surprising, since only 6 of those 547 blocks were a match. I'm not well versed at all in how bloom filters work, but I suspect the recent increase in standardized events like ERC-20 and ERC-721 has led to more and more blocks containing receipts with topic[0] matching one of those standard events, and that this is increasing the false positive rate of the filter.
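
As a rough way to reason about that suspicion: the per-block logs bloom is 2048 bits wide and each address or topic sets three bits in it. The sketch below applies the textbook bloom filter estimate, assuming the bits are set independently and uniformly. The item counts are hypothetical, not measured from mainnet, and the function names are made up for this example.

```go
package main

import (
	"fmt"
	"math"
)

const (
	bloomBits   = 2048 // width of the per-block logs bloom
	bitsPerItem = 3    // bits set for each address or topic
)

// bitSetProb estimates the chance that one given bit is set after n
// distinct items (addresses and topics) were added to a block's bloom.
func bitSetProb(n int) float64 {
	return 1 - math.Pow(1-1.0/bloomBits, float64(bitsPerItem*n))
}

// falsePositive estimates the chance that an item absent from the block
// still appears present, i.e. all three of its bits happen to be set.
func falsePositive(n int) float64 {
	return math.Pow(bitSetProb(n), bitsPerItem)
}

func main() {
	// Hypothetical numbers of distinct items per block.
	for _, n := range []int{50, 200, 500, 1000} {
		fmt.Printf("items=%4d  single-item false positive ~ %.4f\n", n, falsePositive(n))
	}
}
```

A filter with several criteria only passes a block when every criterion matches, so if topic[0] is genuinely present in most blocks, the remaining criteria carry all of the filtering work.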

So, currently, this PR contains a receipt LRU of size 256 to match the other LRU caches, but I'd be more than happy to discuss raising that (and potentially all the LRU sizes). A value of 256 should work very well for a low-traffic node with lots of log queries around the chain head, but multiple "full scan" queries with wide fromBlock/toBlock ranges will just lead to eviction races.
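
The eviction problem can be illustrated with a toy model: replay the same in-order scan twice against an LRU and count the hits. This is a simplified sketch with a hypothetical helper name (`repeatScanHits`). It ignores concurrency and any caching below the receipt LRU, such as in the database, so it does not reproduce the timings above.

```go
package main

import (
	"container/list"
	"fmt"
)

// repeatScanHits replays `passes` in-order scans over block numbers
// 0..blocks-1 against an LRU of the given capacity and returns the
// total number of cache hits.
func repeatScanHits(capacity, blocks, passes int) int {
	order := list.New()                   // front = most recently used
	index := make(map[int]*list.Element) // block number -> element in order
	hits := 0
	for p := 0; p < passes; p++ {
		for b := 0; b < blocks; b++ {
			if el, ok := index[b]; ok {
				hits++
				order.MoveToFront(el)
				continue
			}
			index[b] = order.PushFront(b)
			if order.Len() > capacity {
				// Evict the least recently used block.
				oldest := order.Back()
				order.Remove(oldest)
				delete(index, oldest.Value.(int))
			}
		}
	}
	return hits
}

func main() {
	for _, capacity := range []int{256, 2048} {
		fmt.Printf("capacity %4d: %d hits across two 547-block scans\n",
			capacity, repeatScanHits(capacity, 547, 2))
	}
}
```

In this model, a cache smaller than the scan gets no hits at all on the second pass, because every block is evicted before the scan comes back to it. A cache that holds the whole scan hits on every block.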

I'll file issues for the "high" false positive rate and for the fact that running several eth_getLogs RPCs concurrently greatly affects node availability, so those items can be discussed separately.

@karalabe
Member

How large is a receipt? The 256 numbers were pretty arbitrary for the other things, I'm happy to bump receipts as high as needed as long as we know in advance the memory consumption and can live with it.

@ryanschneider
Contributor Author

ryanschneider commented Sep 10, 2018 via email

@ryanschneider
Contributor Author

As mentioned, the size of this LRU cache is variable: the LRU is per-block, each block can contain a variable number of receipts, and the receipts themselves vary in size based on the data they contain.

I actually think for the default in geth we might want to make this cache value (for receipts) smaller, say around ~32, but ideally make this configurable (which would be a separate PR). For our nodes on dedicated infrastructure I've tested w/ LRU sizes up to 10,000 and everything works as expected (but w/ very large RAM usage).

I'm having trouble getting pprof output to show the actual LRU memory usage. The memory is allocated when the receipts are read out of leveldb, as is just about everything else, so it's very hard to determine the actual size of the cache. I think I'd have to add some sort of introspection to the lru module used, or find a way to tell pprof that the LRU should be the new "owner" of the memory when outputting statistics.

But, FWIW the bodyCache and bodyRLPCache LRUs suffer from this same issue (variable cache sizes).

So, would you be ok w/ accepting this PR with a slightly smaller cache size (e.g. 32 as I proposed), and then me later submitting a separate PR to make all the cache sizes configurable via .toml?

@ryanschneider
Contributor Author

ryanschneider commented Sep 21, 2018

ping. I just rebased off master and lowered the cache size to 32 entries, so it should stay small for running on non-dedicated hardware but still help greatly with performance of getLogs queries near head.

FWIW we've been running this patch on our servers this week (with a larger LRU size) and it's greatly improved node reliability.

@ryanschneider
Contributor Author

Er hold on, looks like I fat fingered the rebase and didn't actually push it.

@ryanschneider
Contributor Author

Ok, actually rebased now. :)

@fjl changed the title from "core, eth: add an LRU for receipts, use it over rawdb reads in api backend" to "core, internal/ethapi: add and use LRU cache for receipts" on Sep 29, 2018
@fjl merged commit b69942b into ethereum:master on Sep 29, 2018
@fjl
Contributor

fjl commented Sep 29, 2018

Thank you :)
