
cacheKvStore: Keep dirty items sorted #2286

Closed
ValarDragon opened this issue Sep 10, 2018 · 3 comments · Fixed by #4265
@ValarDragon
Contributor

Summary

In the cacheKVStore we re-sort the dirty items on every call that creates an iterator. This is a performance problem, and from conversation with @jaekwon it is likely worth fixing prelaunch.
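To make the problem concrete, here is a minimal sketch (illustrative names, not the actual SDK types) of the pattern described above: dirty writes live in an unsorted map, so every iterator request must collect and re-sort the entire key set.

```go
package main

import (
	"fmt"
	"sort"
)

// cacheStore is a hypothetical stand-in for the cacheKVStore's
// dirty-item tracking: writes land in an unsorted map.
type cacheStore struct {
	dirty map[string][]byte // writes not yet flushed to the parent store
}

func (c *cacheStore) Set(key string, value []byte) {
	c.dirty[key] = value
}

// Iterator re-sorts all dirty keys on every call -- O(n log n) each
// time, even if the set barely changed since the last iteration.
// This repeated sort is the cost this issue wants to eliminate.
func (c *cacheStore) Iterator() []string {
	keys := make([]string, 0, len(c.dirty))
	for k := range c.dirty {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := &cacheStore{dirty: map[string][]byte{}}
	s.Set("b", []byte{1})
	s.Set("a", []byte{2})
	s.Set("c", []byte{3})
	fmt.Println(s.Iterator()) // prints [a b c]
}
```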

Numbers

From simulator benchmarks on SIM_COMMIT=false make test_sim_gaia_profile, sorting dirty items took 70% of the time. The calls were due to staking's IterateDelegations and UpdateBondedValidators. (Both of these are called an above-average number of times because of governance tallying and slashing at the end of each block; this will be much less of a problem on a live network if we rate-limit governance proposals.) However, even before any governance slashing / tallying happens (e.g. in the first 100 blocks), sorting is responsible for approximately 31% of the time, entirely due to delegation and the stake end blocker, both of which we can expect a lot of on a live network.

In concrete numbers, the iterator work for each delegation takes 4 times as long as verifying its signature would. From simulation, it appears that the dirty items list grows with each call, so this will get worse at greater block heights / as more delegations accrue.

Proposal

From some brief analysis, it appears that the bulk of the dirty items are repeated across calls, so sorting that portion only once, at the cost of slightly slower insertions, should be a significant speed improvement. (For example, I picked a random key from the last dirty-items sort, and it had been sorted 876 times over the last 100 blocks.)

To fix this, we could keep the dirty items sorted and, when adding a new dirty item, insert it at the correct location. This may be tricky because of the memory ramifications: the memory overhead of a BST here may be too much, while an array would have costly inserts. A heap may strike a good balance, since it is compact in memory. Iterating through it would still be O(n log n), but if we use the bottom-up method from https://en.wikipedia.org/wiki/Heapsort#Bottom-up_heapsort we should get decent speed. (I already have an implementation of this, and a PR against Go to use it in heapsort. It's fairly simple to port into the heap implementation, and we could directly copy that here.)
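As one hedged sketch of the "keep the dirty items sorted" idea, the sorted-array variant can use a binary search to find the insertion point, so iterator creation becomes free and repeat writes (the common case per the analysis above) cost only an O(log n) lookup. The type and key names are illustrative, not the SDK's actual code.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedDirty keeps dirty keys ordered at insertion time, so creating
// an iterator no longer requires a sort. The trade-off, as discussed
// above, is the O(n) copy on a genuinely new insert.
type sortedDirty struct {
	keys []string
}

func (s *sortedDirty) Insert(key string) {
	i := sort.SearchStrings(s.keys, key) // binary search for position
	if i < len(s.keys) && s.keys[i] == key {
		return // already tracked: repeat writes are an O(log n) no-op
	}
	s.keys = append(s.keys, "")
	copy(s.keys[i+1:], s.keys[i:]) // shift the tail to make room
	s.keys[i] = key
}

func main() {
	d := &sortedDirty{}
	for _, k := range []string{"delegation/3", "validator/1", "delegation/3"} {
		d.Insert(k)
	}
	fmt.Println(d.keys) // prints [delegation/3 validator/1]
}
```

A heap or BST would replace the O(n) shift with O(log n) insertion, at the cost of either an n log n extraction pass (heap) or per-node pointer overhead (BST), which is exactly the trade-off weighed in the proposal.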

Prelaunch I think we should use either a heap or a BST, and do a more rigorous analysis postlaunch. I'm in favor of the heap since there's less fear of large memory impacts.


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@alexanderbez
Contributor

alexanderbez commented Jan 4, 2019

Also, I'm not sure this is necessary prelaunch, although I'd love to work on it.

@jackzampolin
Member

@alexanderbez When @cwgoes and I triaged issues last time we decided this should be prelaunch. If you want to work on it, go ahead!

@cwgoes
Contributor

cwgoes commented Feb 15, 2019

(@mossid wanted to pick this up, and I think it will be a substantial performance improvement)
