-
Notifications
You must be signed in to change notification settings - Fork 470
MurmurHash3
MurmurHash3 information and brief performance results
(Note - this information is a bit stale, and refers to an earlier version of Murmur3. I'll rewrite it when I have time.)
MurmurHash3 is the successor to MurmurHash2. It comes in 3 variants - a 32-bit version that targets low latency for hash table use and two 128-bit versions for generating unique identifiers for large blocks of data, one each for x86 and x64 platforms.
MurmurHash3's mix functions are based on this snippet -
k *= c1;
k = rotl(k,r1);
k *= c2;
h ^= k;
h = rotl(h,r1);
h = h*m1+n1;
k
is a block of the key, h
is a block of the hash state, and rN
/mN
/cN
are constants.
For each block of the key, we pre-mix it using two constants and a rotate, xor it into the hash block, and them mix the hash block using a rotate and a multiply-add.
MurmurHash3's 32-bit finalizer is
h ^= h >> 16;
h *= 0x85ebca6b;
h ^= h >> 13;
h *= 0xc2b2ae35;
h ^= h >> 16;
and its 64-bit finalizer is
h ^= h >> 33;
h *= 0xff51afd7ed558ccd;
h ^= h >> 33;
h *= 0xc4ceb9fe1a85ec53;
h ^= h >> 33;
The constants for the finalizers were generated by a simple simulated-annealing algorithm, and both avalanche all bits of h
to within 0.25% bias.
The 128-bit variants mix multiple blocks of key data in parallel. To ensure that all the intermediate hash blocks affect each other, Murmurhash3 does a few simple operations interleaved with the block mix -
The 64-bit, 2-block inter-mix is
h1 += h2;
h2 = _rotl64(h2,41);
h2 += h1;
The 32-bit, 4-block inter-mix is
h1 += h2;
h1 += h3;
h1 += h4;
h1 = _rotl(h1,17);
h2 += h1;
h3 += h1;
h4 += h1;
where hN
is one block of the hash value.
Results are from an Intel Core 2 Quad Q9650 running at 3.0 ghz, running on a single core.
Hash | Speed |
---|---|
FNV_x86_32 | 554 mb/sec |
FNV_x64_32 | 715 mb/sec |
SuperFastHash_x86_32 | 1224 mb/sec (1) |
SuperFastHash_x64_32 | 1311 mb/sec |
Lookup3_x86_32 | 1234 mb/sec |
Lookup3_x64_32 | 1265 mb/sec |
MurmurHash2_x86_32 | 2577 mb/sec |
MurmurHash2_x86_64 | 3352 mb/sec (2) |
MurmurHash2_x64_64 | 2857 mb/sec |
MurmurHash3_x86_32 | 3105 mb/sec |
MurmurHash3_x86_128 | 2684 mb/sec |
MurmurHash3_x64_128 | 5058 mb/sec (3) |
(1) - SuperFastHash has very poor collision properties, which have been documented elsewhere.
(2) - MurmurHash2_x86_64 computes two 32-bit results in parallel and mixes them at the end, which is fast but means that collision resistance is only as good as a 32-bit hash. I suggest avoiding this variant.
(3) - That's about 1.68 bytes per cycle, or about 9.5 cycles per 16-byte chunk. The inner loop is 20 instructions long, so we're sustaining over 2 instructions per cycle. Hooray for modern platforms with fast 64-bit multipliers and superscalar architectures. :)