Fix performance bug for `*` with AbstractQ (#44615)
Performance results following the issue to be closed:

```julia
using LinearAlgebra, BenchmarkTools
n = 300
M = rand(n,n);
dl = ones(n-1);
d = ones(n);
D = Diagonal(d);
Bi = Bidiagonal(d, dl, :L);
Tri = Tridiagonal(dl, d, dl);
Sym = SymTridiagonal(d, dl);
F = qr(ones(n, 1));
A = F.Q';
for A in (F.Q, F.Q'), B in (M, D, Bi, Tri, Sym)
    @btime $B*$A
    @btime $A*$B
end
```

yields

```
132.073 μs (3 allocations: 705.67 KiB)
141.561 μs (3 allocations: 705.67 KiB)
112.906 μs (3 allocations: 705.67 KiB)
122.037 μs (3 allocations: 705.67 KiB)
114.454 μs (3 allocations: 705.67 KiB)
123.033 μs (3 allocations: 705.67 KiB)
113.941 μs (3 allocations: 705.67 KiB)
124.189 μs (3 allocations: 705.67 KiB)
110.328 μs (3 allocations: 705.67 KiB)
122.690 μs (3 allocations: 705.67 KiB)
130.392 μs (3 allocations: 705.67 KiB)
143.679 μs (3 allocations: 705.67 KiB)
113.639 μs (3 allocations: 705.67 KiB)
122.109 μs (3 allocations: 705.67 KiB)
113.455 μs (3 allocations: 705.67 KiB)
122.141 μs (3 allocations: 705.67 KiB)
113.554 μs (3 allocations: 705.67 KiB)
123.823 μs (3 allocations: 705.67 KiB)
114.012 μs (3 allocations: 705.67 KiB)
122.848 μs (3 allocations: 705.67 KiB)
```
CI failures: all unrelated to this PR, but I'm always wary of backporting things that fail CI.
Thanks for checking. I understand your concern, but the regression that is fixed here is horrible. I don't think we want to release v1.8 with that performance trap.
Maybe we can wait a couple of days for CI to stabilize here, make sure it is more green, and then merge. At that point, it can also be backported with peace of mind.
The macOS failure has been fixed on master, so that at least would likely be fixed here with a rebase. |
I think this is good to merge - the 32-bit CI seems to be having trouble. |
This broke JuMP's tests because it's not the case that […]. Here's what used to happen:

```julia
julia> using JuMP, LinearAlgebra

julia> model = Model();

julia> @variable(model, x[1:2])
2-element Vector{VariableRef}:
 x[1]
 x[2]

julia> D = Diagonal(x)
2×2 Diagonal{VariableRef, Vector{VariableRef}}:
 x[1]    ⋅
   ⋅   x[2]

julia> Matrix(D)
2×2 Matrix{AffExpr}:
 x[1]  0
 0     x[2]
```

Note the promotion in element type from `VariableRef` to `AffExpr`. cc @blegat
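The promotion hinges on the fact that collecting a `Diagonal` has to fill the off-diagonal entries with a zero element. For ordinary number types nothing widens; JuMP's `VariableRef` is unusual in that its zero is an `AffExpr`, so the collected matrix takes the wider type, as the REPL session shows. A minimal sketch of the ordinary (non-JuMP) case, for contrast:

```julia
using LinearAlgebra

# For standard number types, collecting a Diagonal keeps the element type:
D = Diagonal([1, 2])
M = Matrix(D)
@assert eltype(M) == Int          # no widening for Int
@assert M == [1 0; 0 2]           # off-diagonals filled with zero(Int)

# For JuMP's VariableRef (not loaded here), zero-filling cannot produce a
# VariableRef, so the result widens to Matrix{AffExpr} instead.
```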
Thanks for the quick report!!! I'm very sorry about that. I overlooked the special type promotion in […].
Thanks for the fix!
This fixes a major regression discovered in JuliaLang/LinearAlgebra.jl#800 in multiplication of `Q` times `Diagonal` (and other structurally sparse matrices). The reason was that these "sparse" matrices took precedence, while assuming that the other factor was easily indexable. In older versions, `Q*D = rmul!(copyto!(Matrix{}(undef, size(Q)), Q), D)`, so we sped up `copyto!` for `Q`s. Recently (in the v1.8 cycle) we defined diagonal multiplication by 3-arg `mul!`, thus avoiding the copy. But multiplication then fell back to scalar indexing, in this case landing in performance hell. So, in this PR we teach Julia that in the case of these two special matrices, `Q` should take precedence, and the other factor should be densified, since the result will be dense anyway.

Closes JuliaLang/LinearAlgebra.jl#800.
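The idea behind the fix can be illustrated with a small sketch (not the PR's actual dispatch code): instead of letting the structured factor drive the product via scalar indexing into `Q`, densify `Q` first through its specialized `Matrix` conversion, then use the fast dense-times-`Diagonal` path; the result is dense either way.

```julia
using LinearAlgebra

n = 300
F = qr(rand(n, n))
D = Diagonal(ones(n))

# Densify Q via the specialized Matrix/copyto! conversion, then multiply;
# dense * Diagonal dispatches to a fast column-scaling routine.
Qd = Matrix(F.Q)
@assert Qd * D ≈ F.Q * D   # same result as the AbstractQ product
```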