-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
faster hashing #11203
faster hashing #11203
Conversation
narimiran
commented
May 8, 2019
- multibyte hashing for:
- string and string slices
- cstring
- string, ignoring case
- string, ignoring style
- openArray of byte or char
lib/pure/hashes.nim
Outdated
@@ -136,6 +139,17 @@ proc hash*[T: Ordinal](x: T): Hash {.inline.} = | |||
## Efficient hashing of other ordinal types (e.g. enums). | |||
result = ord(x) | |||
|
|||
template multibyteHashImpl(result: Hash, x: typed, start, stop: int) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here are some results on my machine (i7-970) for some basic tests I did. These are average times, but there is still variance so single-digit percent difference should be taken as zero.
test | old algo (ms) | new algo (ms) | difference |
---|---|---|---|
10_000 English words | 0.731 | 0.750 | +2.6% |
20_000 English words | 1.517 | 1.563 | +3.0% |
50_000 English words | 4.559 | 4.678 | +2.6% |
--- | --- | --- | --- |
20_000 random 4-char strings | 0.497 | 0.503 | +1.2% |
20_000 random 8-char strings | 0.775 | 1.077 | +38.9% ?? |
20_000 random 9-char strings | 0.867 | 0.614 | -29.1% |
20_000 random 12-char strings | 0.952 | 0.687 | -27.8% |
20_000 random 16-char strings | 1.162 | 0.592 | -49.0% |
20_000 random 24-char strings | 1.504 | 0.600 | -60.1% |
20_000 random 32-char strings | 1.860 | 0.656 | -64.7% |
20_000 random 64-char strings | 3.274 | 0.851 | -74.0% |
As the length of string/array/seq increases, so does the performance of the new algorithm.
The only exception is length of 8, which is unexpectedly slow (the first while loop is called once, the second one is called zero times). Increasing the length to 9 (first loop once, second once) has a speed-up which would be expected for length of 8 too.
I've tried several different ways to write the first loop, but all had the similar results.
@mratsim do you maybe know what is going on here? Any tips?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share your benchmarking script?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share your benchmarking script?
It was just a quick/simple "hash a word, add a hash to set, do something with that set (don't time it) just in case", repeat that dozens of time, and report the average:
proc hashTesting(wordlist: seq[string], f: proc (x: string): Hash, name: string): float =
let
amount = 20_000
repetitions = 48
var
collisions = newSeq[int](repetitions)
highest = newSeq[int](repetitions)
times = newSeq[float](repetitions)
for i in 0 ..< repetitions:
let words = wordList[amount*i ..< amount*i+amount]
var s = initHashSet[int](sets.rightSize(amount))
var t = cpuTime()
for w in words:
s.incl f(w)
t = cpuTime() - t
times[i] = t
collisions[i] = amount - s.len
highest[i] = s.toSeq.max
echo "--"
echo name
echo highest.min
echo collisions.sum / collisions.len
echo "max: ", times.max * 1000.0
echo "min: ", times.min * 1000.0
return times.sum / times.len.float * 1000.0
where f: proc (x: string): Hash
is either the old one or the new one (or several others that I've tested).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only exception is length of 8, which is unexpectedly slow
this is explained here: #11581 (and it's not just length 8)
* multibyte hashing for: * string and string slices * cstring * string, ignoring case * string, ignoring style * openArray of byte or char
297c95e
to
6875294
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beyond the perf that I will check, I found a couple issues in the current and new hashes:
- The test suite is very empty (https://github.com/nim-lang/Nim/blob/devel/tests/stdlib/thashes.nim), I'm pretty sure empty strings/sequences will crash.
- The new hashing scheme does not work in the VM, so we wouldn't be able to use string based hashing at compile-time.
lib/pure/hashes.nim
Outdated
template multibyteHashImpl(result: Hash, x: typed, start, stop: int) = | ||
var i = start | ||
while i <= stop+1 - IntSize: | ||
let n = cast[ptr int](unsafeAddr x[i])[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have a fallback for hashing strings in the VM and a test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also while i < stop - IntSize:
is cleaner
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also
while i < stop - IntSize:
is cleaner
You almost had me there ;)
Your version is true only when stop >= 9
or higher (length of 10
), mine is when stop >= 7
(length of 8
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argh off-by-one :/
lib/pure/hashes.nim
Outdated
lowerString = newStringOfCap(len(x)) | ||
i = 0 | ||
while i <= high(x): | ||
lowerString.addLowercaseChar(x[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this crashes if x = ""
while i < x.len:
wouldn't crash
A test case for empty strings would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, high for empty strings produces -1, not 0.
lib/pure/hashes.nim
Outdated
lowerString = newStringOfCap(remainingLength) | ||
i = sPos | ||
while i <= ePos: | ||
lowerString.addLowercaseChar(sBuf[i]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If string is empty and we pass sPos = ePos = 0, this probably crash.
Test case as well.
(Original code probably suffer from the same issue).
lib/pure/hashes.nim
Outdated
@@ -282,8 +280,12 @@ proc hash*[T: tuple](x: T): Hash = | |||
|
|||
proc hash*[A](x: openArray[A]): Hash = | |||
## Efficient hashing of arrays and sequences. | |||
for it in items(x): result = result !& hash(it) | |||
result = !$result | |||
when A is char|byte: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Do we add uint8 as well?
-
Do we add other integers? And floats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint8 is an alias for byte.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since high("") returns -1, this shouldn't crash, I'd like some empty string tests though to avoid unlucky refactors.
I'll check the perf later today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
I can't reproduce your outlier on my machine, for 20000 8-char random string the new implementation is about 25% faster, script here: https://gist.github.com/mratsim/e5a2d1d74adc2763ab7b080a7c40ef1e. So for me we only need a fallback path for the VM. |
Thanks for testing it! And now with your version of the script I also get the speed improvements for the length of 8.
What do you recommend? How would that fallback look like? Use the old version for VM? |
|
0d07f07
to
fb58f44
Compare
* use optimized version for all ints * add more tests * make it work in VM * put warnings about differences between CT and runtime * minor style tweaks
db04f15
to
d886d35
Compare
d886d35
to
28b9097
Compare
Btw, here are some more benchmark results, now that I made a version that works on all integers. This is hashing of 100_000 containers with X elements in each. int8
int16
int32
|
d6ce277
to
4ca7bcd
Compare
Have you seen Meow ❔ |