-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
faster BigInt hashing #33790
faster BigInt hashing #33790
Conversation
+1 |
Very clever! I'm happy to merge this whenever. |
So I reviewed again in detail, still looks good to me :) I had tested with many thousands of values that the old and new versions match. But I wonder now whether I should revert deleting the |
That is a good thought. Let's do that and then get this off the runway! |
bump |
Anything left to do here besides merge? |
Faster BigInt hashing is ready to be merged. Working on a custom type, I have run into another (though related) The removed function is shown below for reference.
|
I assumed that not restoring the function indicated a desire to leave it out. If that is incorrect, so much the better, please put it back [covering paths accessed by reasonable use is reasonable]. |
08255d4
to
3a5f2ab
Compare
Sure, I restored |
@StefanKarpinski skies are clear, it is a beautiful day for merging. |
I didn't understand the reason for keeping |
IIUC, it's because external code may be relying on it. |
I encountered this a few years ago, and the fine grained specifics are a little hazy.
Unless Writing special purpose |
To phrase it in a different way than @JeffreySarnoff put it, it's for user-defined integer types which might use the generic
She doesn't have to know. But if we remove it here, while not breaking, she might notice a significant slow-down in hashing her type.
I believe not. We could add a few tests I guess.
Yes, but then the generic slow fall-back will take over. |
Alright, it just seems non-ideal to have untested "dead" code in Base "because it can be useful". Anyway, I don't want to hold up this PR but at some point, having a test to make it covered and some comment saying when it is being hit and that it exists as a performance optimization (like in the comment above) would be good imo. |
Yeah I agree, and that's why I deleted it initially. But actually, I was stupid, it's not dead code and is used by julia> @btime hash($(big(2)//3), UInt(0))
127.409 ns (3 allocations: 48 bytes)
0x36fc23415080622e
julia> m = collect(methods(Base.hash_integer))[1]
hash_integer(n::BigInt, h::UInt64) in Base at hashing2.jl:17
julia> Base.delete_method(m)
julia> @btime hash($(big(2)//3), UInt(0))
257.855 ns (7 allocations: 128 bytes)
0x36fc23415080622e |
3a5f2ab
to
d2bcd9c
Compare
that's a wrap .. time to merge |
h ⊻= hash_uint(ifelse(s < 0, -b, b) ⊻ h) | ||
for k = 2:abs(s) | ||
h ⊻= hash_uint(unsafe_load(p, k) ⊻ h) | ||
if GMP.Limb === UInt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it would be worth adding a comment noting what happens when this isn't true. Am I reading it correctly that when GMP.Limb !== UInt
this falls back to the above hash_integer(n::Integer, h::UInt)
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this then falls back to the slower hash_integer(::Integer, ::UInt)
. Ok, will add a comment. I'm not even certain this happens in practive, but I couldn't prove that it can't, so it seemed safer to state clearly the assumptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before pushing again and running CI for only comments change, let's check first if the comment is fine:
# this condition is true most (all?) of the time, and in this case we can define
# an optimized version of the above hash_integer(::Integer, ::UInt) method for BigInt
if GMP.Limb === UInt
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, looks good. You could also just push a separate [ci skip]
commit and then we can squash on merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok will do, for some reason I stayed with the idea that [ci skip]
didn't work anymore, since like 2 years ago or so!
* faster BigInt hashing
When the code was moved out of hashing2.jl, some imports where missed, cancelling the benefits of #33790.
…liaLang#40881) When the code was moved out of hashing2.jl, some imports where missed, cancelling the benefits of JuliaLang#33790.
…liaLang#40881) When the code was moved out of hashing2.jl, some imports where missed, cancelling the benefits of JuliaLang#33790.
The improvements concerns mainly:
Limb
(by special casing these more directly)Here is a small table containing the time for 1 hash (on my laptop) for few different sizes and for even/odd numbers. In parenthesis is the approximate ratio compared to master. "16 (sparse)" means something like
2^1000
, while "16 (dense)" is something like2^1000+1
or2^1000+2
.