So looking at IdDict, it also "just" uses objectid, then does the typical probe + egal check and grows the table on conflict.
For HAMT that strategy wouldn't work. We would probably need to introduce a "PerfectConflict" node with a linear probe,
but as you noted with Symbols, this is a general property of using objectid as the source of the hash.
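To make the "PerfectConflict" idea concrete, here is a rough sketch in Python (not Julia, and not code from base/hamt.jl). It assumes "linear probe" means a linear scan over the entries that share one full hash, comparing keys directly. The name `CollisionNode` and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class CollisionNode:
    """Hypothetical leaf holding every entry whose full hash is identical."""
    full_hash: int
    entries: Tuple[Tuple[Any, Any], ...] = ()

    def get(self, key, default=None):
        # Linear scan: the hash cannot tell these keys apart, so compare keys directly.
        for k, v in self.entries:
            if k is key or k == key:
                return v
        return default

    def put(self, key, value) -> "CollisionNode":
        # Persistent update: build a new node and leave the old one untouched.
        kept = tuple((k, v) for k, v in self.entries if not (k is key or k == key))
        return CollisionNode(self.full_hash, kept + ((key, value),))


n0 = CollisionNode(full_hash=195)
n1 = n0.put("ab", 1)
n2 = n1.put("ba", 2)
print(n2.get("ab"), n2.get("ba"), n1.get("ba"))  # 1 2 None
```

Lookups in such a node cost time linear in the number of colliding keys, which seems acceptable if full collisions are rare. The older node `n1` is unchanged after the second insert, which is the property a persistent dictionary needs.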
I'm looking at the implementation of PersistentDict and HAMT in https://github.com/JuliaLang/julia/blob/master/base/hamt.jl. This implementation attempts to avoid hash collisions by using rehashing, but I believe it does not provide the intended effect.
A comment in the code claims: "Perfect hash collisions should not occur in practice since we perform rehashing after using 55 bits (MAX_SHIFT) of hash." Here's how rehashing is explained in Bagwell's paper: "The algorithm requires that the hash can be extended to an arbitrary number of bits. This was accomplished by rehashing the key combined with an integer representing the trie level, zero being the root. Hence if two keys do give the same initial hash then the rehash has a probability of 1 in 2^32 of a further collision." However, the Julia implementation rehashes not the original key but the previous hash value. If two keys have the same hash, every rehashed value will collide as well.
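The difference between the two rehashing strategies can be shown with a small toy model in Python. This is not the code in base/hamt.jl: `toy_hash`, `rehash_previous`, and the mixing constant are made up for the demonstration, and the level-0 hash is deliberately weak so that a collision is easy to construct.

```python
MASK64 = (1 << 64) - 1


def toy_hash(key: str, level: int = 0) -> int:
    """Toy hash salted by trie level. At level 0 it is a plain byte sum, so anagrams collide."""
    return sum(b * (i * level + 1) for i, b in enumerate(key.encode())) & MASK64


def rehash_previous(h: int) -> int:
    """Derive new bits from the previous hash value only (the strategy described above)."""
    return (h * 0x9E3779B97F4A7C15 + 1) & MASK64


def hash_chain_from_previous(key: str, depth: int) -> list:
    """Initial hash followed by `depth` rehashes of the previous hash value."""
    h = toy_hash(key)
    chain = [h]
    for _ in range(depth):
        h = rehash_previous(h)
        chain.append(h)
    return chain


def hash_chain_from_key(key: str, depth: int) -> list:
    """Rehash the key itself combined with the level, as in Bagwell's description."""
    return [toy_hash(key, level) for level in range(depth + 1)]


a, b = "ab", "ba"
assert toy_hash(a) == toy_hash(b)  # the initial hashes collide

print(hash_chain_from_previous(a, 3) == hash_chain_from_previous(b, 3))  # True
print(hash_chain_from_key(a, 3) == hash_chain_from_key(b, 3))            # False
```

Because `rehash_previous` is a function of the previous hash alone, equal inputs always produce equal outputs, so the two keys never separate no matter how many times the hash is extended. Salting with the level gives the keys a fresh chance to differ at each step.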
Here's a test case:
Note that IdDict can handle such keys: