Add hashWangYi1 #13823

Merged
merged 37 commits into from
Apr 15, 2020
Changes from 17 commits
Commits
70c81c4
Unwind just the "pseudorandom probing" (whole hash-code-keyed variable
c-blake Mar 31, 2020
0433f4b
Fix `data.len` -> `dataLen` problem.
c-blake Mar 31, 2020
5dc4b0d
Merge /u/cb/pkg/nim/Nim-devel into devel
c-blake Mar 31, 2020
5a43e2c
This is an alternate resolution to https://github.com/nim-lang/Nim/is…
c-blake Mar 31, 2020
1b69485
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Mar 31, 2020
e89fd81
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Apr 1, 2020
4b0c41a
Re-organize to work around `when nimvm` limitations; Add some tests; Add
c-blake Apr 1, 2020
a7cada9
Add less than 64-bit CPU when fork.
c-blake Apr 1, 2020
4ed8e1a
Fix decl instead of call typo.
c-blake Apr 1, 2020
875e7fb
First attempt at fixing range error on 32-bit platforms; Still do the
c-blake Apr 1, 2020
dc4dba2
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Apr 1, 2020
b3510c3
A second try at making 32-bit mode CI work.
c-blake Apr 1, 2020
38fe03c
Use a more systematic identifier convention than Wang Yi's code.
c-blake Apr 2, 2020
1474ae4
Fix test that was wrong for as long as `toHashSet` used `rightSize` (a
c-blake Apr 2, 2020
b78e18d
Fix another stringified test depending upon hash order.
c-blake Apr 2, 2020
29424cc
Oops - revert the string-keyed test.
c-blake Apr 2, 2020
add55fe
Fix another stringify test depending on hash order.
c-blake Apr 2, 2020
00a708e
Add a better than always zero `defined(js)` branch.
c-blake Apr 2, 2020
2febb78
It turns out to be easy to just work all in `BigInt` inside JS and thus
c-blake Apr 2, 2020
33240f9
These string hash tests fail for me locally. Maybe this is what causes
c-blake Apr 3, 2020
f5aae61
Oops. That failure was from me manually patching string hash in hashe…
c-blake Apr 3, 2020
56e1df8
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Apr 3, 2020
131667a
Import more test improvements from https://github.com/nim-lang/Nim/pu…
c-blake Apr 4, 2020
fca9bb5
Fix bug where I swapped order when reverting the test. Ack.
c-blake Apr 4, 2020
1f11f9c
Oh, just accept either order like more and more hash tests.
c-blake Apr 4, 2020
0ca8acc
Iterate in the same order.
c-blake Apr 8, 2020
0c6162a
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Apr 8, 2020
41dd4a6
`return` inside `emit` made us skip `popFrame` causing weird troubles.
c-blake Apr 8, 2020
137e1f1
Oops - do Windows branch also.
c-blake Apr 8, 2020
b525246
`nimV1hash` -> multiply-mnemonic, type-scoped `nimIntHash1` (mnemonic
c-blake Apr 9, 2020
aa30554
Re-organize `when nimvm` logic to be a strict `when`-`else`.
c-blake Apr 14, 2020
fa879a5
Merge other changes.
c-blake Apr 14, 2020
97e685b
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Apr 14, 2020
67eab12
Lift constants to a common area.
c-blake Apr 14, 2020
f3c236e
Fall back to identity hash when `BigInt` is unavailable.
c-blake Apr 14, 2020
dfe9ee8
Merge branch 'devel' of https://github.com/Araq/Nim into add_hashWangYi1
c-blake Apr 15, 2020
e97c81a
Increase timeout slightly (probably just real-time perturbation of CI
c-blake Apr 15, 2020
3 changes: 3 additions & 0 deletions changelog.md
@@ -10,6 +10,9 @@
   "undefined reference to `__builtin_saddll_overflow`" compile your programs
   with `-d:nimEmulateOverflowChecks`.

+- The default hash for `Ordinal` has changed to something more bit-scrambling.
+  `import hashes; proc hash(x: myInt): Hash = hashIdentity(x)` recovers the old
+  one in an instantiation context while `-d:nimV1hash` recovers it globally.

 ### Breaking changes in the standard library

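The per-type opt-out described in the changelog entry can be sketched roughly as follows. `UserId` and the `names` table are hypothetical names chosen for illustration; only `hashIdentity` and `Hash` come from the diff. The conversion to `int` is a cautious choice so the sketch does not depend on how `ord` treats distinct types.

```nim
import hashes, tables

type
  UserId = distinct int        # hypothetical key type, not part of the PR

# Opt this one type back into the identity hash, as the changelog suggests.
proc hash(x: UserId): Hash = hashIdentity(int(x))
proc `==`(a, b: UserId): bool {.borrow.}

# Tables instantiated after this point pick up the overload above.
var names = initTable[UserId, string]()
names[UserId(7)] = "seven"

doAssert hash(UserId(7)) == 7
doAssert names[UserId(7)] == "seven"
```

Other key types keep whatever default `hash` the standard library provides; only `UserId` is affected by the overload.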
2 changes: 1 addition & 1 deletion lib/pure/collections/sets.nim
@@ -1008,7 +1008,7 @@ when isMainModule and not defined(release):

   block toSeqAndString:
     var a = toHashSet([2, 7, 5])
-    var b = initHashSet[int]()
+    var b = initHashSet[int](rightSize(3))
     for x in [2, 7, 5]: b.incl(x)
     assert($a == $b)
     #echo a
53 changes: 51 additions & 2 deletions lib/pure/hashes.nim
@@ -70,6 +70,46 @@ proc `!$`*(h: Hash): Hash {.inline.} =
   res = res + res shl 15
   result = cast[Hash](res)

+proc hiXorLoFallback64(a, b: uint64): uint64 {.inline.} =
+  let # Fall back in 64-bit arithmetic
+    aH = a shr 32
+    aL = a and 0xFFFFFFFF'u64
+    bH = b shr 32
+    bL = b and 0xFFFFFFFF'u64
+    rHH = aH * bH
+    rHL = aH * bL
+    rLH = aL * bH
+    rLL = aL * bL
+    t = rLL + (rHL shl 32)
+  var c = if t < rLL: 1'u64 else: 0'u64
+  let lo = t + (rLH shl 32)
+  c += (if lo < t: 1'u64 else: 0'u64)
+  let hi = rHH + (rHL shr 32) + (rLH shr 32) + c
+  return hi xor lo
+
+proc hiXorLo(a, b: uint64): uint64 {.inline.} =
+  # Xor of high & low 8B of full 16B product
+  when nimvm:
+    result = hiXorLoFallback64(a, b) # `result =` is necessary here.
+  else:
+    when Hash.sizeof < 8:
+      result = hiXorLoFallback64(a, b)
+    elif defined(gcc) or defined(llvm_gcc) or defined(clang):
+      {.emit: """__uint128_t r = a; r *= b; return (r >> 64) ^ r;""".}
+    elif defined(windows) and not defined(tcc):
+      {.emit: """a = _umul128(a, b, &b); return a ^ b;""".}
+    else:
+      result = hiXorLoFallback64(a, b)
+
+proc hashWangYi1*(x: int64|uint64|Hash): Hash {.inline.} =
+  ## Wang Yi's hash_v1 for 8B int. https://github.com/rurban/smhasher has more
+  ## details. This passed all scrambling tests in Spring 2019 and is simple.
+  ## NOTE: It's ok to define ``proc(x: int16): Hash = hashWangYi1(Hash(x))``.
+  const P0 = 0xa0761d6478bd642f'u64
+  const P1 = 0xe7037ed1a0b428db'u64
+  const P5x8 = 0xeb44accab455d165'u64 xor 8'u64
+  cast[Hash](hiXorLo(hiXorLo(P0, uint64(x) xor P1), P5x8))
+
 proc hashData*(data: pointer, size: int): Hash =
   ## Hashes an array of bytes of size `size`.
   var h: Hash = 0
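The fallback above builds a 64x64 to 128-bit product out of four 32x32 partial products and tracks the two possible carries into the high word. A standalone sketch of the same arithmetic, checked against products that are easy to work out by hand, might look like this. `mulHiLo` is a hypothetical helper that is not part of the PR; it returns the two halves separately instead of xor-ing them so each can be checked.

```nim
# Hypothetical helper (not in the PR): high and low 64 bits of a * b,
# computed with 64-bit arithmetic only.
proc mulHiLo(a, b: uint64): tuple[hi, lo: uint64] =
  let
    aH = a shr 32
    aL = a and 0xFFFFFFFF'u64
    bH = b shr 32
    bL = b and 0xFFFFFFFF'u64
    rHH = aH * bH                     # contributes to bits 64..127
    rHL = aH * bL                     # contributes to bits 32..95
    rLH = aL * bH                     # contributes to bits 32..95
    rLL = aL * bL                     # contributes to bits 0..63
    t = rLL + (rHL shl 32)            # may wrap around; detected next
  var carry = if t < rLL: 1'u64 else: 0'u64
  let lo = t + (rLH shl 32)           # may wrap around a second time
  if lo < t: carry += 1'u64
  let hi = rHH + (rHL shr 32) + (rLH shr 32) + carry
  result = (hi: hi, lo: lo)

const allOnes = 0xFFFF_FFFF_FFFF_FFFF'u64

# 2^32 * 2^32 = 2^64: high word 1, low word 0.
doAssert mulHiLo(1'u64 shl 32, 1'u64 shl 32) == (hi: 1'u64, lo: 0'u64)
# (2^64 - 1)^2 = 2^128 - 2^65 + 1: high word 2^64 - 2, low word 1.
doAssert mulHiLo(allOnes, allOnes) == (hi: allOnes - 1'u64, lo: 1'u64)
# A small product never touches the high word.
doAssert mulHiLo(3'u64, 5'u64) == (hi: 0'u64, lo: 15'u64)
```

Xor-ing `hi` and `lo` from this helper gives the value `hiXorLoFallback64` returns for the same inputs.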
@@ -112,10 +152,19 @@ proc hash*[T: proc](x: T): Hash {.inline.} =
   else:
     result = hash(pointer(x))

-proc hash*[T: Ordinal](x: T): Hash {.inline.} =
-  ## Efficient hashing of integers.
+proc hashIdentity*[T: Ordinal](x: T): Hash {.inline.} =
+  ## The identity hash. I.e. ``hashIdentity(x) = x``.
   cast[Hash](ord(x))

+when defined(nimV1hash):
+  proc hash*[T: Ordinal](x: T): Hash {.inline.} =
+    ## Efficient hashing of integers.
+    hashIdentity(x)
+else:
+  proc hash*[T: Ordinal](x: T): Hash {.inline.} =
+    ## Efficient hashing of integers.
+    hashWangYi1(uint64(ord(x)))
+
 proc hash*(x: float): Hash {.inline.} =
   ## Efficient hashing of floats.
   var y = x + 0.0 # for denormalization
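One way to see what "more bit-scrambling" buys is to feed both hashes keys that differ only in their upper bits. The sketch below assumes a Nim version whose `hashes` module exports both `hashIdentity` and `hashWangYi1`, as this diff does; the key pattern (multiples of 1024) and the 10-bit mask are arbitrary choices for illustration.

```nim
import hashes, sets

# Keys that are multiples of 1024 all have zero in their low 10 bits.
var identityLow = initHashSet[int]()
var wangYiLow = initHashSet[int]()
for i in 0 ..< 1024:
  let key = i * 1024
  identityLow.incl(hashIdentity(key) and 1023)   # low 10 bits of the hash
  wangYiLow.incl(hashWangYi1(key) and 1023)

# The identity hash keeps the low bits of the key, so they all coincide.
doAssert identityLow.len == 1
echo "distinct low-bit patterns with hashWangYi1: ", wangYiLow.len
```

Any container that picks a slot from the low bits of the hash would see every one of these keys collide under the identity hash, while the multiply-based hash is expected to spread them over many slots.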
5 changes: 4 additions & 1 deletion tests/collections/tcollections.nim
@@ -76,7 +76,10 @@ doAssert $(toOrderedSet(["1", "2", "3"])) == """{"1", "2", "3"}"""
 doAssert $(toOrderedSet(['1', '2', '3'])) == """{'1', '2', '3'}"""

 # Tests for tables
-doAssert $({1: "1", 2: "2"}.toTable) == """{1: "1", 2: "2"}"""
+when defined(nimV1hash):
+  doAssert $({1: "1", 2: "2"}.toTable) == """{1: "1", 2: "2"}"""
+else:
+  doAssert $({1: "1", 2: "2"}.toTable) == """{2: "2", 1: "1"}"""
 doAssert $({"1": 1, "2": 2}.toTable) == """{"1": 1, "2": 2}"""

 # Tests for deques
5 changes: 4 additions & 1 deletion tests/collections/tcollections_to_string.nim
@@ -27,7 +27,10 @@ doAssert $(toOrderedSet(["1", "2", "3"])) == """{"1", "2", "3"}"""
 doAssert $(toOrderedSet(['1', '2', '3'])) == """{'1', '2', '3'}"""

 # Tests for tables
-doAssert $({1: "1", 2: "2"}.toTable) == """{1: "1", 2: "2"}"""
+when defined(nimV1hash):
+  doAssert $({1: "1", 2: "2"}.toTable) == """{1: "1", 2: "2"}"""
+else:
+  doAssert $({1: "1", 2: "2"}.toTable) == """{2: "2", 1: "1"}"""
 doAssert $({"1": 1, "2": 2}.toTable) == """{"1": 1, "2": 2}"""

 # Tests for deques
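Both test files above pin the stringified table to one iteration order per hash function. Several commit messages in this PR mention accepting either order instead; a minimal sketch of that style of check, not taken from the PR itself, could be:

```nim
import tables, algorithm, sequtils

let t = {1: "1", 2: "2"}.toTable

# Compare contents rather than the order produced by iteration.
var ks = toSeq(t.keys)
ks.sort()
doAssert ks == @[1, 2]
doAssert t == {2: "2", 1: "1"}.toTable

# If the string form must be tested, accept either order.
doAssert $t in ["""{1: "1", 2: "2"}""", """{2: "2", 1: "1"}"""]
```

Checks written this way do not need a `when defined(nimV1hash)` branch, at the cost of no longer documenting which order a given hash produces.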
17 changes: 17 additions & 0 deletions tests/stdlib/thashes.nim
@@ -15,3 +15,20 @@ suite "hashes":
   test "0.0 and -0.0 should have the same hash value":
     var dummy = 0.0
     check hash(dummy) == hash(-dummy)
+
+  test "VM and runtime should make the same hash value (hashIdentity)":
+    const hi123 = hashIdentity(123)
+    check hashIdentity(123) == hi123
+
+  test "VM and runtime should make the same hash value (hashWangYi1)":
+    const wy123 = hashWangYi1(123)
+    check hashWangYi1(123) == wy123
+
+  test "hashIdentity value incorrect at 456":
+    check hashIdentity(456) == 456
+
+  test "hashWangYi1 value incorrect at 456":
+    when Hash.sizeof < 8:
+      check hashWangYi1(456) == 1293320666
+    else:
+      check hashWangYi1(456) == -6421749900419628582
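The "VM and runtime" tests rely on `const` forcing evaluation in the compile-time VM, where `hiXorLo` takes its `when nimvm` branch. A standalone version of the same idea, without `unittest`, might read as below; routing the runtime call through a `var` is an extra precaution of this sketch, not something the PR's tests do.

```nim
import hashes

# `const` forces this call to be evaluated in the Nim VM at compile time.
const atCompileTime = hashWangYi1(123)

# Going through a mutable variable keeps this call a genuine run-time one.
var input = 123
let atRunTime = hashWangYi1(input)

doAssert atCompileTime == atRunTime
```

If the VM path and the native path ever disagreed, tables built at compile time and queried at run time would look up the wrong slots, which is presumably why the PR tests this property directly.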