Speed up finding jumpdests #80
Conversation
Force-pushed from a199687 to 18f421e
Codecov Report

```diff
@@            Coverage Diff             @@
##           master      #80      +/-   ##
==========================================
+ Coverage   83.27%   87.66%   +4.38%
==========================================
  Files          20       20
  Lines        1985     1986       +1
  Branches      218      216       -2
==========================================
+ Hits         1653     1741      +88
+ Misses        307      220      -87
  Partials       25       25
```
Force-pushed from 110f662 to 9ca8739
How does having two vectors instead of a vector of pairs help? Does the shorter data fit better into a cache line?
Does this beat …?
Only the first vector is traversed. In this PR we lowered the memory on which the search happens by 4×. In the worst case of … Anyway, different variants were benchmarked in https://github.com/ethereum/evmone/tree/internal_benchmarks (to be merged independently). One "easy" missing optimization is to pack both vectors into a single memory allocation and not over-allocate. I was also thinking about first using the stack space of … There is also the possibility of using SIMD to compare 16 or 32 items at a time, or even "k-ary search": https://event.cwi.nl/damon2009/DaMoN09-KarySearch.pdf
I will have to check. I hope it is because …
Using … I think we will have to revisit using a hash map here later on. I've seen some interesting recent work on the subject, including hash maps using contiguous memory.
Force-pushed from a058908 to d3cfba8
This replaces the linear search for a jumpdest with binary search. It also applies a data-oriented approach: the jumpdest "map" is no longer a vector of pairs but two separate vectors of offsets and targets. We also shrank the element size from `int` to `int16_t`. There is a small trade-off here: the analysis takes longer because it requires 2× more vector resizes, and for contracts with a small number of jumpdests (like blake2b_huff, which has only 3 of them) the extra analysis time might hide the gain in execution. Still, I believe it's worth the 33% speed increase in blake2b_shifts, which has a lot of jumpdests.