-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Commit 5bdbd21

Clark Gaebel
Performance-oriented hashtable.
Previously, rust's hashtable was totally unoptimized. It used an Option
per key-value pair, and used very naive open allocation.
The old hashtable had very high variance in lookup time. For an example,
see the 'find_nonexisting' benchmark below. This is fixed by keys in
'lucky' spots with a low probe sequence length getting their good spots
stolen by keys with long probe sequence lengths. This reduces hashtable
probe length variance, while maintaining the same mean.
Also, other optimization liberties were taken. Everything is as cache
aware as possible, and this hashtable should perform extremely well for
both large and small keys and values.
Benchmarks:
comprehensive_old_hashmap 378 ns/iter (+/- 8)
comprehensive_new_hashmap 206 ns/iter (+/- 4)
1.8x faster
old_hashmap_as_queue 238 ns/iter (+/- 8)
new_hashmap_as_queue 119 ns/iter (+/- 2)
2x faster
old_hashmap_insert 172 ns/iter (+/- 8)
new_hashmap_insert 146 ns/iter (+/- 11)
1.17x faster
old_hashmap_find_existing 50 ns/iter (+/- 12)
new_hashmap_find_existing 35 ns/iter (+/- 6)
1.43x faster
old_hashmap_find_notexisting 49 ns/iter (+/- 49)
new_hashmap_find_notexisting 34 ns/iter (+/- 4)
1.44x faster
Memory usage of old hashtable (64-bit assumed):
aligned(8+sizeof(K)+sizeof(V))/0.75 + 6 words
Memory usage of new hashtable:
(aligned(sizeof(K))
+ aligned(sizeof(V))
+ 8)/0.9 + 6.5 words
BUT accesses are much more cache friendly. In fact, if the probe
sequence length is below 8, only two cache lines worth of hashes will be
pulled into cache. This is unlike the old version which would have to
stride over the stoerd keys and values, and would be more cache
unfriendly the bigger the stored values got.
And did you notice the higher load factor? We can now reasonably get a
load factor of 0.9 with very good performance.1 parent 3316a0e commit 5bdbd21Copy full SHA for 5bdbd21
1 file changed
+1412
-594
lines changed
0 commit comments