Add a benchmark for snapshot::account::to_fat_rlps() #11185
Conversation
`to_fat_rlps()` is a hot call during snapshots. I don't think it has a perf problem per se, but better to have a benchmark for it. The data used is a piece of Ropsten data sized to roughly the 95th percentile of account size on that network.
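For context, here's a minimal sketch of what such a Criterion harness can look like. The `fat_rlps_stand_in` helper and the dummy fixture are illustrative assumptions, not the PR's code; the real benchmark decodes RLP'd account data from disk and calls the actual `snapshot::account::to_fat_rlps()` with the account hash, the decoded `BasicAccount`, an `AccountDB`, the `used_code` set and a `Progress` handle (see the excerpt below).

```rust
use std::collections::HashSet;
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Hypothetical stand-in for snapshot::account::to_fat_rlps(): just chunks the
// input so the harness has something to measure.
fn fat_rlps_stand_in(data: &[u8], used_code: &mut HashSet<u64>) -> Vec<Vec<u8>> {
    used_code.insert(data.len() as u64);
    data.chunks(1024).map(|c| c.to_vec()).collect()
}

fn bench_to_fat_rlps(c: &mut Criterion) {
    // The real fixture is RLP-encoded account data collected from Ropsten/mainnet;
    // here it is a dummy buffer of a comparable size.
    let account_data = vec![0u8; 5282];

    c.bench_function("to_fat_rlps, 5282 bytes", |b| {
        let mut used_code = HashSet::new(); // placement discussed in the review below
        b.iter(|| black_box(fat_rlps_stand_in(&account_data, &mut used_code)))
    });
}

criterion_group!(benches, bench_to_fat_rlps);
criterion_main!(benches);
```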
Looks good, just one question.
let basic_account: BasicAccount = rlp::decode(&*account_data).expect("rlp from disk is ok");
let account_db = AccountDB::from_hash(hashdb, account_hash);
let progress = Progress::new();
let mut used_code = HashSet::new();
Should `used_code` and `progress` be moved inside `b.iter`? Not sure which case we're trying to measure.
How do you mean "which case"? You mean inside `to_fat_rlps()`? It's a big method and I'm not specifically targeting a particular branch. It started from the observation "wow, there's a loop in a loop in there, and I have no idea how many iterations this typically does", and since the existing tests are rather artificial, I thought I'd collect some real data and see whether the call in its entirety is actually slow or not.
Turns out it's not: ~10us, and that's with an input much bigger than the average. I ran a few tests to check typical loop counts: the outer loop runs once, sometimes twice, and the inner loop around 5 times. I also collected histograms of the number of bytes returned from here; it's mostly small, and the 16 parts of the state seem mostly homogeneous in this sense. If there are "lumps" in the data I didn't find them.
Not sure I have answered your question though! I don't think either needs to be in `b.iter()`, but maybe I'm misunderstanding you.
Yes, I mean inside `to_fat_rlps`. The perf is different depending on the contents of `used_code`, and you're passing the same reference to `used_code` to `b.iter()`, which runs the code in a loop.
I got the following difference when moving `used_code` inside `b.iter()`:
to_fat_rlps, 5282 bytes, (0x1041…cf2d)
time: [6.2453 us 6.2689 us 6.2954 us]
change: [+18.545% +20.243% +23.000%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) high mild
7 (7.00%) high severe
to_fat_rlps, 5908 bytes, (0x104f…f0ea)
time: [7.8668 us 7.8816 us 7.8975 us]
change: [+6.5079% +7.2548% +7.9737%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) high mild
1 (1.00%) high severe
to_fat_rlps, 6344 bytes, (0x2032…8275)
time: [7.8879 us 7.9030 us 7.9183 us]
change: [+10.102% +10.399% +10.715%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
to_fat_rlps, 6723 bytes, (0x2075…0dd7)
time: [9.8169 us 9.8328 us 9.8490 us]
change: [+8.6896% +9.0660% +9.4109%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
to_fat_rlps, 6936 bytes, (0x3042…f803)
time: [1.1720 us 1.1759 us 1.1808 us]
change: [+175.43% +177.79% +179.97%] (p = 0.00 < 0.05)
Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
4 (4.00%) high mild
9 (9.00%) high severe
the biggest difference is for the last chunk
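To make concrete what moving `used_code` inside `b.iter()` changes, here's a self-contained sketch with a hypothetical `encode_account` stand-in that mimics the dedup `to_fat_rlps()` does via `used_code` (names and sizes are illustrative, not the PR's code):

```rust
use std::collections::HashSet;
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Hypothetical stand-in: the first call for a given code hash is expensive,
// repeat calls are cheap.
fn encode_account(code_hash: u64, data: &[u8], used_code: &mut HashSet<u64>) -> usize {
    if used_code.insert(code_hash) {
        data.iter().map(|b| *b as usize).sum() // "new code": touch every byte
    } else {
        data.len() // "code already seen": almost free
    }
}

fn bench_placements(c: &mut Criterion) {
    let data = vec![1u8; 5282];

    // Placement A (the PR as submitted): used_code is shared across iterations,
    // so only the first iteration takes the expensive path.
    c.bench_function("used_code outside b.iter", |b| {
        let mut used_code = HashSet::new();
        b.iter(|| black_box(encode_account(0xc0de, &data, &mut used_code)))
    });

    // Placement B (after the change): a fresh used_code per iteration, so every
    // iteration pays the full cost.
    c.bench_function("used_code inside b.iter", |b| {
        b.iter(|| {
            let mut used_code = HashSet::new();
            black_box(encode_account(0xc0de, &data, &mut used_code))
        })
    });
}

criterion_group!(benches, bench_placements);
criterion_main!(benches);
```

With the shared set, only the first iteration takes the "new code" path and Criterion mostly times the cheap branch; with a fresh set per iteration every run pays the full cost, which is what the regressions above are showing.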
Interesting, good catch. Ok, let's move it inside then.
That last chunk is from mainnet btw, the others from ropsten.
Yup, seeing the same effect:
Running /Users/dvd/dev/parity/parity-ethereum/target/release/deps/to_fat_rlps-25811198c1c2ab93
to_fat_rlps, 5282 bytes, (0x1041…cf2d)
time: [8.6684 us 8.8360 us 9.0185 us]
change: [+19.126% +22.714% +26.009%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
to_fat_rlps, 5908 bytes, (0x104f…f0ea)
time: [10.474 us 10.588 us 10.710 us]
change: [+8.7095% +13.323% +17.690%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
9 (9.00%) high mild
1 (1.00%) high severe
to_fat_rlps, 6344 bytes, (0x2032…8275)
time: [10.401 us 10.536 us 10.687 us]
change: [+8.6806% +12.271% +15.996%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
4 (4.00%) high mild
5 (5.00%) high severe
to_fat_rlps, 6723 bytes, (0x2075…0dd7)
time: [12.859 us 13.089 us 13.350 us]
change: [+9.0429% +12.423% +16.180%] (p = 0.00 < 0.05)
Performance has regressed.
Found 14 outliers among 100 measurements (14.00%)
7 (7.00%) high mild
7 (7.00%) high severe
to_fat_rlps, 6936 bytes, (0x3042…f803)
time: [2.0104 us 2.0549 us 2.1035 us]
change: [+113.79% +122.66% +131.16%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Hmm, looking again I have doubts: in `chunk_state()` the `used_code` set is instantiated once per part and re-used for all the millions of calls to `to_fat_rlps()`, so maybe I got it right the first time? :/
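A simplified paraphrase of that reuse pattern, with stand-in types rather than the actual ethcore code:

```rust
use std::collections::HashSet;

// One used_code set is created per state part and threaded through every
// to_fat_rlps() call in that part, so repeated contract code is only emitted
// once per part.
fn chunk_part(accounts: &[(u64, Vec<u8>)]) -> Vec<Vec<Vec<u8>>> {
    let mut used_code = HashSet::new(); // created once per part
    accounts
        .iter()
        .map(|(code_hash, data)| fat_rlps_stand_in(*code_hash, data, &mut used_code))
        .collect()
}

// Stand-in for to_fat_rlps(): the first account using a given code hash emits
// the "code", every later account in the same part skips it.
fn fat_rlps_stand_in(code_hash: u64, data: &[u8], used_code: &mut HashSet<u64>) -> Vec<Vec<u8>> {
    if used_code.insert(code_hash) {
        data.chunks(1024).map(|c| c.to_vec()).collect()
    } else {
        Vec::new()
    }
}
```

Under that pattern most real calls do see a warm `used_code`, so a benchmark that shares the set across iterations arguably reflects the common case, while a fresh set per iteration measures the first account that references a given piece of code.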
This reverts commit cff33ab.
* master: (21 commits)
  - ropsten #6631425 foundation #8798209 (#11201)
  - Update list of bootnodes for xDai chain (#11236)
  - ethcore/res: add mordor testnet configuration (#11200)
  - [chain specs]: activate `Istanbul` on mainnet (#11228)
  - [builtin]: support `multiple prices and activations` in chain spec (#11039)
  - Insert explicit warning into the panic hook (#11225)
  - Snapshot restoration overhaul (#11219)
  - Fix docker centos build (#11226)
  - retry on gitlab system failures (#11222)
  - Update bootnodes. (#11203)
  - Use provided usd-per-eth value if an endpoint is specified (#11209)
  - Use a lock instead of atomics for snapshot Progress (#11197)
  - [informant]: `MillisecondDuration` -> `as_millis()` (#11211)
  - Step duration map configuration parameter ported from the POA Network fork (#10902)
  - Upgrade jsonrpc to latest (#11206)
  - [export hardcoded sync]: use debug for `H256` (#11204)
  - Pause pruning while snapshotting (#11178)
  - Type annotation for next_key() matching of json filter options (#11192)
  - Crypto primitives removed from ethkey (#11174)
  - Made ecrecover implementation trait public (#11188)
  - ...
* master:
  - Clarify what first_block `None` means (#11269)
  - removed redundant VMType enum with one variant (#11266)
  - Ensure jsonrpc threading settings are sane (#11267)
  - Return Ok(None) when the registrar contract returns empty slice (#11257)
  - Add a benchmark for snapshot::account::to_fat_rlps() (#11185)
  - Fix misc compile warnings (#11258)
  - simplify verification (#11249)
`to_fat_rlps()` is a hot call during snapshots. I don't think it has a perf problem per se, but better to have a benchmark for it. The RLP data used in the benchmark is collected from Mainnet/Ropsten and sized to the 99th percentile of account sizes.
Example results:
The last chunk, 6936 bytes (0x3042…f803), is from mainnet. I'm a little surprised to see such a large difference between Ropsten and mainnet, but maybe there's a reason for that?