forked from Gankra/hash-rs
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathindex.html
executable file
·185 lines (160 loc) · 6.21 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
<!DOCTYPE HTML>
<html>
<head>
<script src="http://cdnjs.cloudflare.com/ajax/libs/dygraph/1.1.1/dygraph-combined.js"></script>
<style>
.dygraph-legend {
background: none !important;
}
p {
max-width: 60em;
}
</style>
</head>
<body id="inject">
<p>
These are benchmarks of how the different Rust hashing algorithm implementations
compare. In particular, they are using the Hasher interface, which may not be
optimal for all algorithms. Implementation quality may also vary.
</p>
<p>
Numbers are generated from a MacBook Pro (x64 OSX) with a 2.8 GHz Intel Core
i7 (Haswell) and 16GB of RAM. This is unfortunately quite important, as
even different cpu revisions by the same manufacturer (e.g. Haswell vs Sandy Bridge)
can produce significantly different results for these functions. As always,
bench your workload on your hardware for best results.
</p>
<p>
Variance bars excluded because this dang charting library was absolutely
freaking out with them. Note also that both axes are on a log-scale, so small
vertical differences may actually represent a large difference.
</p>
<p>
<a href="https://github.com/Gankro/hash-rs">Source Can be Found Here</a>
</p>
<p>
The hash functions on display are:
</p>
<p>
<strong>SipHash 2-4 (sip)</strong>: This is the state-of-the-art for hashing that is
completely resistant against even theoretical hash flooding DDOS attacks when
paired with a random seed. While most applications probably do not require
this, it's important for some usecases (for instance, many webservers can be
completely locked up by POST requests with colliding keys). Since this is a
relatively obscure attack, but quite costly when applicable, we believe that
it's important to default to such a hash function. As such, this is the default hasher
used by Rust's HashMap, but it can be overloaded. Sip's performance is overall
mediocre but consistent. As such, it's not an awful default.
</p>
<p>
Performance could be gained by moving to SipHash 2-3, which has slightly inferior
cryptographic properties, but is sufficient for preventing flooding.
</p>
<p>
<strong>Fowler-Noll-Vo (fnv)</strong>: A hash function with excellent distribution, but no particular
cryptographic properties. FNV is the best at small inputs, but does terribly on large
inputs. Since small inputs are much more common, it's the
Rust community's most popular override for SipHash where security doesn't matter.
The Rust compiler itself uses this function.
</p>
<p>
<strong>xxHash (xx)</strong>: Like fnv, a hash function with good distribution, but no crypto.
It's our best performer for large inputs, and performs strongly on small ones.
</p>
<p>
<strong>FarmHash (farm)</strong>: Another decent distribution, but no crypto. This algorithm is
intended to perform excellently on all sizes of input, but is massively pessimized
by Rust's streaming hashing architecture, as it's designed to process the input
in one shot, forcing the entire input to be buffered in a growable array. In order
for it to compete, Rust needs to adopt a solution that allows the top-level type
in a hashing to identify whether it can be hashed in a one-shot manner, for a
specialized result. However this can lead to confusing or inconsistent results,
particularly if hashing derivation is done incorrectly.
</p>
<p>
<strong>Horner (horner)</strong>: A universal hash function without the full cryptographic
strength of SipHash. It defends against all hash flooding attacks that have
been demonstrated in practice, but is theoretically vulnerable to a timing
attack if the attacker is allowed to continuously query against maps with the
same seed. It's unclear how practical this attack is under ideal conditions for
the attacker, but for most situations this hasher is likely sufficient.
Performance is roughly on-par with xxHash.
</p>
<p>
<strong>btree</strong> is not a hash function at all, but rather feeding the same
inputs into Rust's BTreeMap, which is based on comparisons instead of hashing. This
serves as a baseline, as well as a demonstration of how ordered data structures
can completely stomp unordered data structures under particular workloads, even
when not used for their ordering properties. <strong>btreenew</strong> is an
experimental rewrite of std's btree that is kicking its butt.
</p>
<script type="text/javascript">
function makeBench(name, title, xLabel) {
var benchDiv = document.createElement("div");
var timeDiv = makeDiv();
var tputDiv = makeDiv();
var heading = document.createElement("h1");
heading.textContent = title;
heading.style = "text-align: center;"
benchDiv.appendChild(heading);
benchDiv.appendChild(timeDiv);
benchDiv.appendChild(tputDiv);
document.getElementById("inject").appendChild(benchDiv);
makeGraph("Time (lower is better)",
xLabel,
"time (ns)",
name + "-time.csv",
timeDiv);
makeGraph("Throughput (higher is better)",
xLabel,
"throughput (MB/s)",
name + "-throughput.csv",
tputDiv);
}
function makeDiv() {
var div = document.createElement("div");
div.style = "display: inline-block; width:650px; height:600px;";
return div;
}
function makeGraph(title, xLabel, yLabel, csv, div) {
var graph = new Dygraph(
div,
csv,
{
logscale: true,
title: title,
xlabel: xLabel,
ylabel: yLabel,
strokeWidth: 2.0,
labelsDivWidth: 550,
legend: "always",
axes: {
x: { logscale: true, drawGrid: false },
y: { logscale: true, drawGrid: false },
},
series: {
sip: { color: "#cc0000" },
xx: { color: "#00cc00" },
fnv: { color: "#0000cc" },
farm: { color: "#cc00cc" },
murmur: { color: "#cccc00" },
city: { color: "#00cccc" },
btree: { color: "#000000" },
btreenew: { color: "#aaaaaa" },
horner: { color: "#884444" }
},
}
);
}
makeBench("bytes",
"Hashing an array of bytes",
"bytes hashed");
makeBench("mapcountdense",
"Counting number of occurrences of X byte-strings (mostly duplicates)",
"bytes per string");
makeBench("mapcountsparse",
"Counting number of occurrences of X byte-strings (mostly unique)",
"bytes per string");
</script>
</body>
</html>