Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vararg vs "..." slurping has huge performance impacts which defy analysis #32761

Closed
iamed2 opened this issue Aug 1, 2019 · 15 comments · Fixed by #32817
Closed

Vararg vs "..." slurping has huge performance impacts which defy analysis #32761

iamed2 opened this issue Aug 1, 2019 · 15 comments · Fixed by #32817

Comments

@iamed2
Copy link
Contributor

iamed2 commented Aug 1, 2019

Here are two almost-identical functions:

function getindex1!(dest::AbstractArray, src::AbstractArray, I::Union{Real, AbstractArray}...)
    @inbounds for (i, j) in zip(eachindex(dest), Iterators.product(I...))
        dest[i] = src[j...]
    end
    return dest
end
function getindex2!(dest::AbstractArray, src::AbstractArray, I::Vararg{Union{Real, AbstractArray}, N}) where N
    @inbounds for (i, j) in zip(eachindex(dest), Iterators.product(I...))
        dest[i] = src[j...]
    end
    return dest
end

And now a benchmark:

julia> a = zeros(300, 300); b = rand(500, 500);

julia> @benchmark getindex1!($a, $b, 201:500, 201:500)
BenchmarkTools.Trial:
  memory estimate:  20.59 MiB
  allocs estimate:  629500
  --------------
  minimum time:     17.831 ms (0.00% GC)
  median time:      20.793 ms (0.00% GC)
  mean time:        20.941 ms (5.62% GC)
  maximum time:     31.831 ms (8.14% GC)
  --------------
  samples:          239
  evals/sample:     1

julia> @benchmark getindex2!($a, $b, 201:500, 201:500)
BenchmarkTools.Trial:
  memory estimate:  352 bytes
  allocs estimate:  6
  --------------
  minimum time:     77.827 μs (0.00% GC)
  median time:      77.973 μs (0.00% GC)
  mean time:        87.117 μs (0.00% GC)
  maximum time:     514.022 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1

The code_lowered and code_typed are identical.
For a function containing a call to this function, the code_typed is different:

julia> foo1(a, b) = getindex1!(a, b, 201:500, 201:500)
foo1 (generic function with 1 method)

julia> foo2(a, b) = getindex2!(a, b, 201:500, 201:500)
foo2 (generic function with 1 method)

julia> @code_typed foo1(a, b)
CodeInfo(
1 ─ %1 = invoke Main.getindex1!(_2::Array{Float64,2}, _3::Array{Float64,2}, $(QuoteNode(201:500))::UnitRange{Int64}, $(QuoteNode(201:500))::Vararg{UnitRange{Int64},N} where N)::Array{Float64,2}
└──      return %1
) => Array{Float64,2}

julia> @code_typed foo2(a, b)
CodeInfo(
1 ─ %1 = invoke Main.getindex2!(_2::Array{Float64,2}, _3::Array{Float64,2}, $(QuoteNode(201:500))::UnitRange{Int64}, $(QuoteNode(201:500))::UnitRange{Int64})::Array{Float64,2}
└──      return %1
) => Array{Float64,2}

The reason for this difference in performance/compilation is not clear. The function signatures describe identical methods. If you define both as methods for the same function, there is still only one method in the function. I think this is a bug?

@KristofferC
Copy link
Member

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 1, 2019

This is not a dup, as this issue is primarily about the resulting performance issue and how expressing the same method in two different ways causes wildly different performance. I don't care what the @code_ tools do specifically, I was just showing that I did not have the tools to identify what was causing the performance difference.

@KristofferC
Copy link
Member

#28720 (comment) then?

@JeffBezanson
Copy link
Member

This is due to the N static parameter. It forces us to specialize the method for every value of N; otherwise we'll try to reuse the same code for different calls.

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 1, 2019

Is this documented somewhere?

Julia is always described as specializing on the runtime types of the arguments. I don't think many people know that type parameters change the specialization behaviour of otherwise identical methods.

@KristofferC
Copy link
Member

I thought this was documented but I might have mixed it up with variables captured in closures. I think adding something small about this is a good idea. It tends to happen for Function, Type and Vararg.

@pablosanjose
Copy link
Contributor

Count me as flabbergasted by this too. I would have expected both versions to be identical. Actually, I don't even understand the explanation about the specialization on N. The slow benchmark involves a single value N=2, so only one method should be generated and reused, right?

@KristofferC
Copy link
Member

The method generated will work for any N.

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 2, 2019

I thought this was documented but I might have mixed it up with variables captured in closures. I think adding something small about this is a good idea. It tends to happen for Function, Type and Vararg.

Cool, I'll write up an addition to Performance Tips today and check to see if there are any good places to link to it from other places.

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 2, 2019

I've come up with some good examples for Type and Vararg but I'm having trouble getting one for Function. Anyone have any suggestions?

@timholy
Copy link
Member

timholy commented Aug 2, 2019

Sometimes foo(f, args...) behaves differently from foo(f::F, args...) where F. Generally, I think, when things start getting more complicated (and dependent on Julia version). Do a grep "::F.*where F" * on base/ for possible sources of inspiration.

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 2, 2019

Thanks! I had checked `"::F.*Function" but yours returns more useful results

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 6, 2019

Interestingly, the two most prominent uses in Base specialize either way (ntuple(::F, ::Integer) where F and map!(::F, ::AbstractArray, ::AbstractArray) where F. I wonder if this just isn't applicable to functions anymore.

@mbauman
Copy link
Member

mbauman commented Aug 6, 2019

If I remember correctly, the f::F where F specialization was only required if you didn't call f directly inside the function body but instead just passed it to another function — probably a non-inlined function. Or at least that's the way it worked a long time ago (year+).

@iamed2
Copy link
Contributor Author

iamed2 commented Aug 6, 2019

@mbauman That seems to be the key observation, thank you! That also explains why I had to search a long time to find an example for Type.

Based on that (and checking specialization behaviour in the REPL), it seems like there are a number of functions in Base which could have their F<:Function parameter removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants