Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make broadcast eltype widening type-stable #31174

Merged
merged 1 commit into from
Feb 27, 2019

Conversation

cstjean
Copy link
Contributor

@cstjean cstjean commented Feb 25, 2019

The broadcast code that copies to the new array was type-unstable, causing allocation and slowdown in v2:

Before:

julia> using BenchmarkTools

julia> v = fill(2.0, 50000);

julia> v2 = vcat(v, missing);

julia> v3 = vcat(missing, v);

julia> @btime $v .* 3.0;
  49.728 μs (2 allocations: 390.70 KiB)

julia> @btime $v2 .* 3.0;
  1.490 ms (99498 allocations: 2.33 MiB)

julia> @btime $v3 .* 3.0;
  103.577 μs (7 allocations: 439.77 KiB)

After:

julia> @btime $v .* 3.0;
  50.147 μs (2 allocations: 390.70 KiB)

julia> @btime $v2 .* 3.0;
  147.158 μs (8 allocations: 830.45 KiB)

julia> @btime $v3 .* 3.0;
  92.611 μs (7 allocations: 439.77 KiB)

This fixes the second half of #30455. First half was fixed by #30480

@mbauman
Copy link
Member

mbauman commented Feb 25, 2019

Nice. Makes sense. I know it's a small cost, but since we're outlining here, perhaps we should just outline everything after that similar call. E.g.,

newdest = Base.similar(dest, promote_typejoin(T, typeof(val)))
return restart_copyto_nonleaf!(newdest, dest, bc, val, I, iter, state, count)
function restart_copyto_nonleaf!(newdest, dest, bc, val, I, iter, state, count)
    # Function barrier that makes the copying to newdest type stable
    for II in Iterators.take(iter, count)
        newdest[II] = dest[II]
    end
    newdest[I] = val
    return copyto_nonleaf!(newdest, bc, iter, state, count+1)
end

That should just have one dynamic dispatch instead of three. In my quick tests of the cases you listed, it looks like it saves another ~25%.

@mbauman mbauman added performance Must go faster broadcast Applying a function over a collection labels Feb 25, 2019
@cstjean cstjean force-pushed the faster-broadcast-widening branch from 412b356 to a86ca21 Compare February 27, 2019 15:30
@cstjean
Copy link
Contributor Author

cstjean commented Feb 27, 2019

Sure, I just took your code and updated the PR. I updated the benchmarks.

@mbauman
Copy link
Member

mbauman commented Feb 27, 2019

The Travis failures look unrelated — 64 bit linux is an assertion failure in the Profile tests and Mac is a Pkg timeout.

@mbauman mbauman merged commit 6b0c5c2 into JuliaLang:master Feb 27, 2019
@cstjean
Copy link
Contributor Author

cstjean commented Feb 27, 2019

Awesome, could it be backported to 1.1?

@KristofferC KristofferC added triage This should be discussed on a triage call backport 1.1 labels Feb 27, 2019
@mbauman
Copy link
Member

mbauman commented Feb 27, 2019

We're trying to be a bit more conservative in backporting performance fixes, but this is about as straightforward as a patch could be — it's literally just copy-pasting code into an outlined function. I'd support a backport.

@Keno Keno removed the triage This should be discussed on a triage call label Feb 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants