-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
faster maximum, minimum (#28849) #30320
Conversation
Can somebody rerun appveyor? |
Looking at
Can we get properly vectorized (SIMD) code here? Does LLVM understand that min/max are associative? |
Oh nice @chethega. The timing of |
I don't know. Your PR gives a nice 5x boost, but there is another big factor on the table. I guess some llvm expert needs to chime in on associativity of min/max algebra; did we write this in a bad way, are we missing an optimization pass or is this a problem of llvm? Or is this just stuck upstream and will solve itself soon? If the latter, then I'd say merge and keep an issue open to revisit this later, once the fundamentals in llvm are ready. Nobody got grey hair waiting for the slow Second point: I don't understand why your implementation is correct with
Does your implementation handle that correctly? Regardless, could you add that test-case, just for my piece of mind? I think this is an example that any future PR can plausibly get wrong, which makes it a good test (even better test: |
Yes the implentation should be julia> fm(x,y)=ifelse(x > y, x, y)
fm (generic function with 1 method)
julia> A=[1.0, NaN, 2.0];
julia> fm(fm(A[1],A[2]), A[3])
2.0
julia> fm(A[3], fm(A[1],A[2]))
NaN |
My problem is that I don't see how you ensure that. If you fold a
|
You are right @chethega the current state of the PR is incorrect. |
Should be fixed (without changing speed of PR). |
That looks correct, but gives almost a 2x slowdown compared to the original PR :( Somewhat faster variant for floats on my machine is
I think this is because (1) the above |
That is weird, on my machine, it is the same speed as original PR. |
One question is, whether it is okay to confuse the order of maximum([0.0, -0.0]) === -0.0 happen? In numpy this can happen. AFAICT without this PR this does not happen, but is not covered by unit tests. |
As a user I would expect that (In an ideal case I would like to be able to pass an |
Nice, no more objections from me. We might want a Can we get a nanosoldier run? Just in case of surprises? |
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. cc @ararslan |
Any ideas why CI fails? |
|
Can somebody rerun appveyor? |
Any objections left? |
start = first | ||
stop = start + chunk_len - 4 | ||
while stop <= last | ||
isnan(v1) && return v1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This broke maximum
on non-numeric values. This is visible only with long arrays:
julia> maximum(rand(Char, 500))
ERROR: MethodError: no method matching isnan(::Char)
Closest candidates are:
isnan(::BigFloat) at mpfr.jl:879
isnan(::Missing) at missing.jl:83
isnan(::Float16) at float.jl:530
...
Stacktrace:
[1] mapreduce_impl(::typeof(identity), ::typeof(max), ::Array{Char,1}, ::Int64, ::Int64) at ./reduce.jl:482
[2] _mapreduce at ./reduce.jl:320 [inlined]
[3] _mapreduce_dim at ./reducedim.jl:308 [inlined]
[4] #mapreduce#549 at ./reducedim.jl:304 [inlined]
[5] mapreduce at ./reducedim.jl:304 [inlined]
[6] _maximum at ./reducedim.jl:653 [inlined]
[7] _maximum at ./reducedim.jl:652 [inlined]
[8] #maximum#555 at ./reducedim.jl:648 [inlined]
[9] maximum(::Array{Char,1}) at ./reducedim.jl:648
[10] top-level scope at none:0
Right now we are only backporting bugfixes to 1.1.0. This can go in 1.1.1. |
Reverting due to the reported breakage. We should also add the non-numeric case to the tests. |
Borrowing an idea from @antoine-levitt #26003 (comment).
There is one slight difference to the currentmaximum/minimum
behavior. Namely-0.0
and0.0
are not always ordered correctly (the same thing happens with numpy.):Old
Numpy
NewNewNew