Commit 2eebeb8

auto merge of #12081 : cgaebel/rust/robinhood-hashing, r=alexcrichton
Partially addresses #11783.

Previously, Rust's hashtable was totally unoptimized. It used an Option per key-value pair, and used very naive open addressing.

The old hashtable had very high variance in lookup time. For an example, see the 'find_notexisting' benchmark below. This is fixed by letting keys with long probe sequence lengths steal the good spots of keys sitting in 'lucky' spots with a low probe sequence length. This reduces hashtable probe length variance, while maintaining the same mean.

Also, other optimization liberties were taken. Everything is as cache aware as possible, and this hashtable should perform extremely well for both large and small keys and values.

Benchmarks:

```
comprehensive_old_hashmap         378 ns/iter (+/- 8)
comprehensive_new_hashmap         206 ns/iter (+/- 4)
1.8x faster

old_hashmap_as_queue              238 ns/iter (+/- 8)
new_hashmap_as_queue              119 ns/iter (+/- 2)
2x faster

old_hashmap_insert                172 ns/iter (+/- 8)
new_hashmap_insert                146 ns/iter (+/- 11)
1.17x faster

old_hashmap_find_existing         50 ns/iter (+/- 12)
new_hashmap_find_existing         35 ns/iter (+/- 6)
1.43x faster

old_hashmap_find_notexisting      49 ns/iter (+/- 49)
new_hashmap_find_notexisting      34 ns/iter (+/- 4)
1.44x faster

Memory usage of old hashtable (64-bit assumed):

aligned(8+sizeof(Option)+sizeof(K)+sizeof(V))/0.75 + 48ish bytes

Memory usage of new hashtable:

(aligned(sizeof(K))
+ aligned(sizeof(V))
+ 8)/0.9 + 112ish bytes

Timing of building librustc:

compile_and_link: x86_64-unknown-linux-gnu/stage0/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc
time: 0.457 s    parsing
time: 0.028 s    gated feature checking
time: 0.000 s    crate injection
time: 0.108 s    configuration 1
time: 1.049 s    expansion
time: 0.219 s    configuration 2
time: 0.222 s    maybe building test harness
time: 0.223 s    prelude injection
time: 0.268 s    assinging node ids and indexing ast
time: 0.075 s    external crate/lib resolution
time: 0.026 s    language item collection
time: 1.016 s    resolution
time: 0.038 s    lifetime resolution
time: 0.000 s    looking for entry point
time: 0.030 s    looking for macro registrar
time: 0.061 s    freevar finding
time: 0.138 s    region resolution
time: 0.110 s    type collecting
time: 0.072 s    variance inference
time: 0.126 s    coherence checking
time: 9.110 s    type checking
time: 0.186 s    const marking
time: 0.049 s    const checking
time: 0.418 s    privacy checking
time: 0.057 s    effect checking
time: 0.033 s    loop checking
time: 1.293 s    compute moves
time: 0.182 s    match checking
time: 0.242 s    liveness checking
time: 0.866 s    borrow checking
time: 0.150 s    kind checking
time: 0.013 s    reachability checking
time: 0.175 s    death checking
time: 0.461 s    lint checking
time: 13.112 s   translation
time: 4.352 s    llvm function passes
time: 96.702 s   llvm module passes
time: 50.574 s   codegen passes
time: 154.611 s  LLVM passes
time: 2.821 s    running linker
time: 15.750 s   linking

compile_and_link: x86_64-unknown-linux-gnu/stage1/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc
time: 0.422 s    parsing
time: 0.031 s    gated feature checking
time: 0.000 s    crate injection
time: 0.126 s    configuration 1
time: 1.014 s    expansion
time: 0.251 s    configuration 2
time: 0.249 s    maybe building test harness
time: 0.273 s    prelude injection
time: 0.279 s    assinging node ids and indexing ast
time: 0.076 s    external crate/lib resolution
time: 0.033 s    language item collection
time: 1.028 s    resolution
time: 0.036 s    lifetime resolution
time: 0.000 s    looking for entry point
time: 0.029 s    looking for macro registrar
time: 0.063 s    freevar finding
time: 0.133 s    region resolution
time: 0.111 s    type collecting
time: 0.077 s    variance inference
time: 0.565 s    coherence checking
time: 8.953 s    type checking
time: 0.176 s    const marking
time: 0.050 s    const checking
time: 0.401 s    privacy checking
time: 0.063 s    effect checking
time: 0.032 s    loop checking
time: 1.291 s    compute moves
time: 0.172 s    match checking
time: 0.249 s    liveness checking
time: 0.831 s    borrow checking
time: 0.121 s    kind checking
time: 0.013 s    reachability checking
time: 0.179 s    death checking
time: 0.503 s    lint checking
time: 14.385 s   translation
time: 4.495 s    llvm function passes
time: 92.234 s   llvm module passes
time: 51.172 s   codegen passes
time: 150.809 s  LLVM passes
time: 2.542 s    running linker
time: 15.109 s   linking
```

BUT accesses are much more cache friendly. In fact, if the probe sequence length is below 8, only two cache lines worth of hashes will be pulled into cache. This is unlike the old version which would have to stride over the stored keys and values, and would be more cache unfriendly the bigger the stored values got.

And did you notice the higher load factor? We can now reasonably get a load factor of 0.9 with very good performance.

Please review this very closely. This is my first major contribution to Rust. Sorry for the ugly diff!
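For readers unfamiliar with the "stealing" rule described above, here is a minimal sketch of Robin Hood insertion and lookup in present-day Rust. It is an illustration only, not the code in this commit: the names `RobinHoodMap`, `Bucket`, and `probe_distance` are hypothetical, the table has a fixed power-of-two capacity with no resizing or removal, and it keeps each hash next to its key and value in one `Vec` rather than in the separate, cache-friendly hash array the message describes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// One occupied slot: the cached hash plus the key-value pair.
struct Bucket<K, V> {
    hash: u64,
    key: K,
    value: V,
}

/// Fixed-capacity Robin Hood table (hypothetical; no resizing, no removal).
pub struct RobinHoodMap<K, V> {
    slots: Vec<Option<Bucket<K, V>>>,
    len: usize,
}

/// How far the entry at `index` sits from the slot its hash maps to.
fn probe_distance(hash: u64, index: usize, mask: usize) -> usize {
    index.wrapping_sub((hash as usize) & mask) & mask
}

impl<K: Hash + Eq, V> RobinHoodMap<K, V> {
    /// `capacity` must be a power of two so masking can replace modulo.
    pub fn with_capacity(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        RobinHoodMap { slots: (0..capacity).map(|_| None).collect(), len: 0 }
    }

    pub fn len(&self) -> usize {
        self.len
    }

    fn hash_of(key: &K) -> u64 {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        hasher.finish()
    }

    /// Inserts a pair, returning the previous value if the key was present.
    pub fn insert(&mut self, key: K, value: V) -> Option<V> {
        assert!(self.len < self.slots.len(), "sketch does not resize");
        let mask = self.slots.len() - 1;
        let hash = Self::hash_of(&key);
        let mut incoming = Bucket { hash, key, value };
        let mut index = (hash as usize) & mask;
        let mut distance = 0;
        loop {
            match self.slots[index].as_mut() {
                None => {
                    // Empty slot: whatever entry we are carrying lands here.
                    self.slots[index] = Some(incoming);
                    self.len += 1;
                    return None;
                }
                Some(resident) => {
                    if resident.hash == incoming.hash && resident.key == incoming.key {
                        return Some(std::mem::replace(&mut resident.value, incoming.value));
                    }
                    let resident_distance = probe_distance(resident.hash, index, mask);
                    if resident_distance < distance {
                        // The resident is "luckier" than the entry in hand:
                        // take its spot and keep probing with the evicted entry.
                        std::mem::swap(resident, &mut incoming);
                        distance = resident_distance;
                    }
                }
            }
            index = (index + 1) & mask;
            distance += 1;
        }
    }

    /// Looks up a key; stops early once a resident is closer to home than we are.
    pub fn get(&self, key: &K) -> Option<&V> {
        let mask = self.slots.len() - 1;
        let hash = Self::hash_of(key);
        let mut index = (hash as usize) & mask;
        for distance in 0..self.slots.len() {
            let resident = self.slots[index].as_ref()?;
            if probe_distance(resident.hash, index, mask) < distance {
                return None; // the key would have displaced this resident
            }
            if resident.hash == hash && resident.key == *key {
                return Some(&resident.value);
            }
            index = (index + 1) & mask;
        }
        None
    }
}
```

The early exit in `get` is what makes lookups of missing keys cheap and predictable: a probe stops as soon as it meets an entry that is closer to its ideal slot than the probe has travelled, instead of scanning to the next empty slot.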
2 parents 4d64441 + 5bdbd21 commit 2eebeb8

1 file changed: +1412 -594 lines changed
