Use safer computation of midpoint in sorting (fixes #33977) #34106
Conversation
Hmm, I am getting

```
$ make
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   501  100   501    0     0   1318      0 --:--:-- --:--:-- --:--:--  1314
100 1423k  100 1423k    0     0  1562k      0 --:--:-- --:--:-- --:--:-- 1562k
===============================================================================
ERROR: sha512 checksum failure on SuiteSparse.v5.4.0-2.x86_64-linux-gnu.tar.gz, should be:
841050c51b5965dc3ccb5b933df09d33e5912b435c39cf67098aed4d30386cf19825433aea2b2af4668f7358f49ddc178444831eeeeb996bb35328c51bb2441d
But `sha512sum /home/tim/src/julia-master/deps/srccache/SuiteSparse.v5.4.0-2.x86_64-linux-gnu.tar.gz | awk '{ print $1; }'` results in:
7b49f0bd1a20104df6993e949ab0e8aa25b7ea64250a726e9c1b0312debdbb87c949e4449790de4eaffd072a671da32da8a5301ee62b589ce05ea350f65a3e3e
This can happen due to bad downloads or network proxies, please check your
network proxy/firewall settings and delete
/home/tim/src/julia-master/deps/srccache/SuiteSparse.v5.4.0-2.x86_64-linux-gnu.tar.gz
to force a redownload when you are ready
===============================================================================
/home/tim/src/julia-master/deps/suitesparse.mk:144: recipe for target '/home/tim/src/julia-master/usr/manifest/suitesparse' failed
make[1]: *** [/home/tim/src/julia-master/usr/manifest/suitesparse] Error 2
Makefile:60: recipe for target 'julia-deps' failed
make: *** [julia-deps] Error 2
```

Anyone else seeing this? (Yes, I tried deleting that file, but it happened again.) Even a |
(force-pushed from 1e8d5f4 to 5664788)
@nanosoldier |
(force-pushed from 5664788 to cec4b81)
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
(force-pushed from cec4b81 to 0738131)
Some of those regressions were real. Let's see if Knuth can come to the rescue. @nanosoldier |
(force-pushed from 0738131 to fd838be)
@nanosoldier |
From local measurements I expect a slight regression (e.g., |
I wonder if it would be better to do something like this:

```julia
lo + ((hi - lo) >>> 1)
```

It does pass all the tests, for what it's worth. |
In my tests it's the same speed. But I guess the Knuth method is necessary in C, where they don't have `>>>`. I guess one advantage of the Knuth method is that it gives

```julia
julia> Base.Sort.midpoint(3, 1)
2
```

whereas

```julia
julia> Base.Sort.midpoint(3, 1)
-9223372036854775806
```
|
Actually, I take that back about them being the same speed. The results are pretty variable, so I have to run it many times to be sure, but a fairly typical result of the Knuth method is

```julia
julia> run(BaseBenchmarks.SUITE[["sparse", "index", ("spmat", "row", "logical", 1000)]])
BenchmarkTools.Trial:
  memory estimate:  4.81 KiB
  allocs estimate:  11
  --------------
  minimum time:     7.694 μs (0.00% GC)
  median time:      8.395 μs (0.00% GC)
  mean time:        8.450 μs (0.00% GC)
  maximum time:     216.727 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

but if I write this as

```julia
julia> run(BaseBenchmarks.SUITE[["sparse", "index", ("spmat", "row", "logical", 1000)]])
BenchmarkTools.Trial:
  memory estimate:  4.81 KiB
  allocs estimate:  11
  --------------
  minimum time:     5.884 μs (0.00% GC)
  median time:      6.393 μs (0.00% GC)
  mean time:        7.154 μs (0.00% GC)
  maximum time:     4.049 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

If I write the shift with |
Yes, the version I posted does require that `hi ≥ lo`. Another consideration, however, is whether the definition makes sense for any integer type—it's usually good to think about bigints. I'm not sure the

```julia
function midpoint(a::Integer, b::Integer)
    lo, hi = minmax(a, b)
    lo + ((hi - lo) >> 1)
end
```
|
Here's a breakdown of algorithms and timing. All 3 are overflow- and negative-safe, but only Algs 1 and 3 are safe for either ordering of inputs.

**master**

```julia
midpoint(lo::T, hi::T) where T<:Integer = (lo + hi) >>> 1
```

```julia
julia> run(BaseBenchmarks.SUITE[["sparse", "index", ("spmat", "row", "logical", 1000)]])
BenchmarkTools.Trial:
  memory estimate:  4.81 KiB
  allocs estimate:  11
  --------------
  minimum time:     5.820 μs (0.00% GC)
  median time:      7.173 μs (0.00% GC)
  mean time:        7.355 μs (0.00% GC)
  maximum time:     579.311 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

**Alg 1**

```julia
function midpoint(a::T, b::T) where T<:Integer
    lo, hi = minmax(a, b)
    return lo + ((hi - lo) >>> 0x01)
end
```

```julia
julia> run(BaseBenchmarks.SUITE[["sparse", "index", ("spmat", "row", "logical", 1000)]])
BenchmarkTools.Trial:
  memory estimate:  4.81 KiB
  allocs estimate:  11
  --------------
  minimum time:     9.932 μs (0.00% GC)
  median time:      10.518 μs (0.00% GC)
  mean time:        10.589 μs (0.00% GC)
  maximum time:     331.517 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

**Alg 2**

```julia
midpoint(lo::T, hi::T) where T<:Integer = lo + ((hi - lo) >>> 0x01)
```

```julia
julia> run(BaseBenchmarks.SUITE[["sparse", "index", ("spmat", "row", "logical", 1000)]])
BenchmarkTools.Trial:
  memory estimate:  4.81 KiB
  allocs estimate:  11
  --------------
  minimum time:     5.736 μs (0.00% GC)
  median time:      6.173 μs (0.00% GC)
  mean time:        6.530 μs (0.00% GC)
  maximum time:     303.521 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

**Alg 3**

```julia
midpoint(lo::T, hi::T) where T<:Integer = (lo & hi) + ((lo ⊻ hi) >> 0x01)
```

```julia
julia> run(BaseBenchmarks.SUITE[["sparse", "index", ("spmat", "row", "logical", 1000)]])
BenchmarkTools.Trial:
  memory estimate:  4.81 KiB
  allocs estimate:  11
  --------------
  minimum time:     7.396 μs (0.00% GC)
  median time:      8.072 μs (0.00% GC)
  mean time:        8.642 μs (0.00% GC)
  maximum time:     2.117 ms (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1
```

As you can see, Alg 2 ≈ master are the fastest, followed by Alg 3, followed by Alg 1. |
If it's only for internal use, it seems like we can safely assume that |
I tend to agree. Perhaps we should do both: implement Alg 2 in

Edit: there's also the issue, though, that so far we've only been worrying about |
(force-pushed from fd838be to eb83468)
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
@nanosoldier |
I’m sure float midpoint is gonna be even more fun :grimace: |
I've already added a candidate implementation here. I kept it pretty simple. |
Do we already have a `middle` function for floating point and others? It seems weird that the integer version rounds down while the float version computes the middle. Just smells a bit like there are actually two different functions. |
Yes, but it lives in Statistics. |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Benchmarks look clean, so if folks are happy with the code & tests then this seems good to go. |
(force-pushed from eb83468 to ace9503)
It seems like it would be best to separate the bug-fix part here, which has no API changes and would also be backportable, from the new API. The bug fix can be merged right away and backported, while the API change can be discussed further. |
I read your mind ahead of time 😄. The first commit is the bugfix, the second commit is the API change. You could merge just the first commit. |
Ok, I’m not in a position to do that just now but if you want to separate them and merge the bug fix when it passes CI, that would be 👌🏼 |
(force-pushed from ace9503 to fe7006d)
This looks like it was merged before branching for 1.4 so removing that label. |
Fixes #33977

I've tested that `÷2` is as fast as a shift locally, but let's see what nanosoldier says:

@nanosoldier `runbenchmarks(ALL, vs=":master")`