Broadcasting with missings is slow #30455
Comments
Probably a duplicate of #28382. |
After encountering |
There will always be some performance impact, but yes, Matt provided a possible fix for this at #28382. |
Isn't that patch just to improve the output type, after the computations have been done? I don't think it will improve speed of the broadcasting itself. If the compiler already knows the output's eltype, then
julia> out = Vector{Union{Missing, Float64}}(undef, length(v3));
julia> @btime broadcast!(*, $out, $v3, 3.0);
  68.585 μs (0 allocations: 0 bytes)
In |
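The distinction drawn here, between broadcasting into a pre-allocated `out` and letting broadcast allocate its own result, can be sketched without BenchmarkTools. The vector `x` below is a hypothetical stand-in for `v3`, whose definition is not shown in this thread, and `dest` plays the role of `out`. This is a minimal sketch under those assumptions, not the original benchmark.

```julia
# Hypothetical stand-in for the thread's `v3`: Float64 data with a few missings.
x = Vector{Union{Missing, Float64}}(rand(100_000))
x[1:1000:end] .= missing

# Allocating form: broadcast has to settle on the result eltype itself.
y = x .* 3.0

# In-place form: the destination eltype is fixed before any element is computed.
dest = Vector{Union{Missing, Float64}}(undef, length(x))
broadcast!(*, dest, x, 3.0)

# Both forms produce the same values; `isequal` treats `missing` as equal to `missing`.
@assert isequal(y, dest)
@assert eltype(y) == Union{Missing, Float64}
```

Wrapping each form in `@time` (after a warm-up call) should show whether the in-place version avoids the extra allocations on a given Julia version.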
Right, that's a different issue. Actually, the reason why What's more problematic is that |
Fantastic! Thank you so much, this will make a huge difference for us.
We occasionally have large vectors with very few
julia> @eval Base.Broadcast function copyupto!(newdest, dest, iter, count)
           for II in Iterators.take(iter, count)
               newdest[II] = dest[II]
           end
       end
copyupto! (generic function with 1 method)
julia> @eval Base.Broadcast @inline function copyto_nonleaf!(dest, bc::Broadcasted, iter, state, count)
           T = eltype(dest)
           while true
               y = iterate(iter, state)
               y === nothing && break
               I, state = y
               @inbounds val = bc[I]
               S = typeof(val)
               if S <: T
                   @inbounds dest[I] = val
               else
                   # This element type doesn't fit in dest. Allocate a new dest with wider eltype,
                   # copy over old values, and continue
                   newdest = Base.similar(dest, promote_typejoin(T, S))
                   copyupto!(newdest, dest, iter, count)
                   newdest[I] = val
                   return copyto_nonleaf!(newdest, bc, iter, state, count+1)
               end
               count += 1
           end
           return dest
       end
copyto_nonleaf! (generic function with 1 method)
julia> @btime $v2 .* 3.0;
  484.567 μs (9 allocations: 830.47 KiB) |
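The widening strategy in the patch above can be seen in isolation with a toy version that works on plain vectors instead of `Broadcasted` objects. The names `widening_map` and `fill_from!` are hypothetical and not part of Base; `Base.promote_typejoin` is the same unexported helper the patch calls. This is a sketch of the idea only, assuming 1-based `Vector` input.

```julia
# Toy model of "fill dest until a value doesn't fit, then widen and continue".
# `widening_map` and `fill_from!` are illustrative names, not Base functions.
function widening_map(f, xs::Vector)
    isempty(xs) && return similar(xs, 0)
    first_val = f(xs[1])
    # Start with the narrowest possible eltype: the type of the first result.
    dest = Vector{typeof(first_val)}(undef, length(xs))
    dest[1] = first_val
    return fill_from!(dest, f, xs, 2)
end

function fill_from!(dest, f, xs, start)
    T = eltype(dest)
    for i in start:length(xs)
        val = f(xs[i])
        if val isa T
            dest[i] = val
        else
            # `val` does not fit: allocate a wider array, copy the i-1 finished
            # entries, store `val`, and continue in a call specialized on the new eltype.
            newdest = similar(dest, Base.promote_typejoin(T, typeof(val)))
            copyto!(newdest, 1, dest, 1, i - 1)
            newdest[i] = val
            return fill_from!(newdest, f, xs, i + 1)
        end
    end
    return dest
end

r = widening_map(x -> x * 3.0, Union{Missing, Float64}[1.0, 2.0, missing, 4.0])
@assert eltype(r) == Union{Missing, Float64}
@assert isequal(r, [3.0, 6.0, missing, 12.0])
```

Each widening step copies everything computed so far, so an input whose first `missing` appears late pays for one nearly full copy of the array.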
It's type-unstable because |
Good point! Can you make another PR once #30480 is merged? #30076 might be relevant: maybe we could merge all similar copying functions into a single one (maybe not)? You could even try to make this a bit faster by passing |
Ah, good point, carry on. Then the only improvement we can make would be to avoid copying completely (#26681). |
@nalimilan Unrelated, but if I wanted to understand the source of |
This might just be #28126, but it looks like a separate problem. On 1.0.2:
Apart from the 40X performance difference, two things stand out:
v2. Why?
v3 slower than v2, in spite of allocating much less?