Regression in performance of sum on Broadcasted with small Union eltype #39425

nalimilan · 2021-01-27T20:38:42Z

The performance of sum(Base.Broadcast.Broadcasted(*, (x, z))) when z has eltype Union{Float64, Missing} regressed dramatically between Julia 1.3 and 1.4. It improved again in 1.5, but it is still much slower on master than on 1.3.

On Julia 1.3.1:

julia> using BenchmarkTools

julia> using LinearAlgebra

julia> x = rand(10_000);

julia> z = Vector{Union{Float64, Missing}}(x);

julia> f(A, w) = sum(Base.Broadcast.Broadcasted(*, (A, w)))
f (generic function with 1 method)

julia> @btime f(x, z);
  31.150 μs (11 allocations: 240 bytes)

On 1.4.2:

julia> @btime f(x, z);
  2.070 ms (40004 allocations: 1.07 MiB)

On master:

julia> @btime f(x, z);
  941.751 μs (40002 allocations: 1.07 MiB)

Cc: @tkf. Found while investigating JuliaStats/StatsBase.jl#518 (comment).


nalimilan · 2021-01-27T20:53:09Z

However, after calling instantiate on the Broadcasted object, everything is very fast on master:

julia> g(A, w) = sum(Base.Broadcast.instantiate(Base.Broadcast.Broadcasted(*, (A, w))));

julia> @btime g(x, z);
  23.826 μs (1 allocation: 16 bytes)
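To make the difference between the two calls concrete, here is a minimal sketch of what instantiate changes on the Broadcasted object. It assumes the internal field name axes of Base.Broadcast.Broadcasted, which is not public API and may change between Julia versions; the variable names lazy and inst are only illustrative. Presumably, having the axes precomputed is what lets the reduction take a faster path, but the sketch only shows the structural difference, not the cause of the regression.

```julia
using Base.Broadcast: Broadcasted, instantiate

x = rand(10_000)
z = Vector{Union{Float64, Missing}}(x)

# A Broadcasted built directly from the constructor has no axes yet.
lazy = Broadcasted(*, (x, z))

# instantiate checks that the argument shapes are compatible and stores the axes.
inst = instantiate(lazy)

println(lazy.axes === nothing)   # axes are not computed before instantiate
println(inst.axes)               # axes of the broadcast result

# Both objects describe the same lazy computation, so the sums agree
# up to floating-point rounding (the summation order may differ).
println(sum(lazy) ≈ sum(inst))
```

Since z contains no actual missing values here, both sums are plain Float64 values.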

vtjnash · 2021-03-31T02:16:18Z

From that observation, I suspect this is also due to excessive use of @inline in broadcast.jl. We've observed that these annotations may hurt performance more often than they help, so as a rough cut, we may just want to remove all of them to resolve these performance regressions, then slowly bring them back more carefully, where necessary.

inkydragon · 2022-08-14T11:24:46Z

Test code:

x = rand(10_000);
z = Vector{Union{Float64, Missing}}(x);
f(A, w) = sum(Base.Broadcast.Broadcasted(*, (A, w)))
@time f(x, z);
@time f(x, z);
@time f(x, z);

✔️ 1.3.1 (2019-12-30): (15 allocations: 400 bytes)
✔️ 1.4.0-DEV.559 (2019-12-04) 7642e2b: (11 allocations: 240 bytes)
❌ 1.4.0-DEV.560 (2019-12-04) 3c182bc: (40.00 k allocations: 1.068 MiB)
    Related PR: Transducer as an optimization: map, filter and flatten #33526 (cc: @tkf)
❌ 1.4.0-rc1 (2020-01-23): (40.00 k allocations: 1.068 MiB)
❌ 1.4.2 (2020-05-23): (40.00 k allocations: 1.068 MiB)
❌ 1.6.4 (2021-11-19)
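For checking a given Julia build without reading @time output by eye, the same test can be reduced to a single number. This is only a sketch: the helper name allocated_bytes is hypothetical, and the exact byte count will vary by version. Per the results above, affected builds report roughly 1 MiB for this input, while unaffected ones report a few hundred bytes or less.

```julia
x = rand(10_000)
z = Vector{Union{Float64, Missing}}(x)
f(A, w) = sum(Base.Broadcast.Broadcasted(*, (A, w)))

# Hypothetical helper: run once to compile, then measure the second call only.
function allocated_bytes(A, w)
    f(A, w)                      # warm-up so compilation is not counted
    return @allocated f(A, w)    # bytes allocated by the measured call
end

bytes = allocated_bytes(x, z)
println("bytes allocated by f(x, z): ", bytes)
```

Wrapping the measurement in a function keeps non-constant globals out of the measured call, which would otherwise add unrelated allocations.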

vtjnash · 2023-08-24T19:49:21Z

This still reproduces on master

Tortar · 2024-07-10T17:45:24Z

This doesn't reproduce anymore on 1.11 and nightly

nalimilan added the performance, regression, broadcast, and missing data labels on Jan 27, 2021

oscardssmith closed this as completed Jul 10, 2024
