Fix performance bug for `*` with `AbstractQ` #44615

dkarrasch · 2022-03-14T19:16:31Z

This fixes a major regression, discovered in JuliaLang/LinearAlgebra.jl#800, in the multiplication of Q by Diagonal (and other structurally sparse matrices). The cause was that the methods for these "sparse" matrices took precedence while assuming that the other factor was cheap to index. In older versions, Q*D = rmul!(copyto!(Matrix{}(undef, size(Q)), Q), D), so we sped up copyto! for Qs. Recently (in the v1.8 cycle) we defined diagonal multiplication via 3-arg mul!, thus avoiding the copy. But with that change, multiplication fell back to scalar indexing, which in this case means performance hell. So, in this PR we teach Julia that when both factors are special matrices, Q should take precedence and the other factor should be densified, since the result will be dense anyway.
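
As a rough illustration of the "densify the other factor" idea (a minimal sketch, not the method definitions added in this PR), the structured matrix can be converted to a dense one first and Q then applied in place with `lmul!` / `rmul!`, so that Q is never indexed entry by entry. The variable names and the small size `n = 4` are chosen only for the example.

```julia
using LinearAlgebra

n = 4
Q = qr(rand(n, n)).Q          # compact representation of the orthogonal factor
D = Diagonal(collect(1.0:n))  # structurally sparse factor

# Densify the structured factor first, then apply Q in place.
# Q itself is never read element by element.
QD = lmul!(Q, Matrix(D))      # computes Q * D
DQ = rmul!(Matrix(D), Q)      # computes D * Q

# Cross-check against an explicitly materialized Q.
Qdense = lmul!(Q, Matrix{Float64}(I, n, n))
QD ≈ Qdense * D && DQ ≈ D * Qdense
```

The result is dense in either case, so allocating one dense copy of the structured factor costs no more than allocating the output.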

Closes JuliaLang/LinearAlgebra.jl#800.

dkarrasch · 2022-03-15T14:54:41Z

Performance results following the issue to be closed:

using LinearAlgebra, BenchmarkTools
n = 300
M = rand(n,n);
dl = ones(n-1);
d = ones(n);
D = Diagonal(d);
Bi = Bidiagonal(d, dl, :L);
Tri = Tridiagonal(dl, d, dl);
Sym = SymTridiagonal(d, dl);
F = qr(ones(n, 1));
A = F.Q';
for A in (F.Q, F.Q'), B in (M, D, Bi, Tri, Sym)
  @btime $B*$A
  @btime $A*$B
end

yields

  132.073 μs (3 allocations: 705.67 KiB)
  141.561 μs (3 allocations: 705.67 KiB)
  112.906 μs (3 allocations: 705.67 KiB)
  122.037 μs (3 allocations: 705.67 KiB)
  114.454 μs (3 allocations: 705.67 KiB)
  123.033 μs (3 allocations: 705.67 KiB)
  113.941 μs (3 allocations: 705.67 KiB)
  124.189 μs (3 allocations: 705.67 KiB)
  110.328 μs (3 allocations: 705.67 KiB)
  122.690 μs (3 allocations: 705.67 KiB)
  130.392 μs (3 allocations: 705.67 KiB)
  143.679 μs (3 allocations: 705.67 KiB)
  113.639 μs (3 allocations: 705.67 KiB)
  122.109 μs (3 allocations: 705.67 KiB)
  113.455 μs (3 allocations: 705.67 KiB)
  122.141 μs (3 allocations: 705.67 KiB)
  113.554 μs (3 allocations: 705.67 KiB)
  123.823 μs (3 allocations: 705.67 KiB)
  114.012 μs (3 allocations: 705.67 KiB)
  122.848 μs (3 allocations: 705.67 KiB)

including the latest Matrix constructor improvements, which avoid reading out the many structural zeros.
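
The idea of not reading out structural zeros can be sketched as follows. This is only an illustration under the assumption that the dense matrix is zero-initialized and the stored bands are then copied over; it is not the constructor code in LinearAlgebra, and the function name `dense_from_tridiagonal` is hypothetical.

```julia
using LinearAlgebra

# Hypothetical helper: build a dense copy of a Tridiagonal by writing only
# the three stored bands, instead of calling getindex on all n^2 entries.
function dense_from_tridiagonal(T::Tridiagonal)
    n = size(T, 1)
    M = zeros(eltype(T), n, n)   # structural zeros are written once, never read
    for i in 1:n
        M[i, i] = T.d[i]         # main diagonal
    end
    for i in 1:n-1
        M[i+1, i] = T.dl[i]      # subdiagonal
        M[i, i+1] = T.du[i]      # superdiagonal
    end
    return M
end

T = Tridiagonal([1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0])
M = dense_from_tridiagonal(T)
```

Only 3n - 2 entries of the structured matrix are touched, which is what makes the densification step cheap relative to the multiplication itself.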

ararslan · 2022-03-16T17:05:36Z

CI failures:

Linux x86: Out of memory, which I think has been common for this builder
macOS x64: LibGit2 failures related to master vs. main branch naming
Windows x86: Something weird with profiling

All unrelated to this PR but I'm always wary of backporting things that fail CI.

dkarrasch · 2022-03-17T10:13:57Z

Thanks for checking. I understand your concern, but the regression that is fixed here is horrible. I don't think we want to release v1.8 with that performance trap.

ViralBShah · 2022-03-17T20:53:20Z

Maybe we can wait a couple of days for CI to get greener here, and then merge. At that point, it can also be backported with peace of mind.

ararslan · 2022-03-17T23:18:47Z

The macOS failure has been fixed on master, so that at least would likely be fixed here with a rebase.

ViralBShah · 2022-03-19T04:31:35Z

I think this is good to merge - the 32-bit CI seems to be having trouble.

odow · 2022-03-23T01:22:37Z

This broke JuMP's tests because it's not the case that Diagonal{T,Vector{T}} can be converted to a Matrix{T}.

Here's what used to happen:

julia> using JuMP, LinearAlgebra

julia> model = Model();

julia> @variable(model, x[1:2])
2-element Vector{VariableRef}:
 x[1]
 x[2]

julia> D = Diagonal(x)
2×2 Diagonal{VariableRef, Vector{VariableRef}}:
 x[1]  ⋅
 ⋅     x[2]

julia> Matrix(D)
2×2 Matrix{AffExpr}:
 x[1]  0
 0     x[2]

Note the promotion in element type from VariableRef to AffExpr. This happens when zero(T) isa T is false.
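
The promotion can be reproduced without JuMP. The sketch below uses hypothetical stand-in types `Var` and `Aff` (playing the roles of VariableRef and AffExpr) and a hypothetical helper `densify`; it assumes the dense element type is chosen as `promote_type(T, typeof(zero(T)))`, and it is not the LinearAlgebra implementation.

```julia
using LinearAlgebra

# Hypothetical stand-ins for JuMP's VariableRef and AffExpr.
struct Var
    id::Int
end
struct Aff
    constant::Float64
    terms::Vector{Var}
end

# The additive zero of a variable is an expression, so `zero(Var) isa Var` is false.
Base.zero(::Type{Var}) = Aff(0.0, Var[])
Base.convert(::Type{Aff}, v::Var) = Aff(0.0, [v])
Base.promote_rule(::Type{Var}, ::Type{Aff}) = Aff

# Hypothetical helper: pick an element type that can hold both the stored
# entries and the structural zeros, then fill the dense matrix.
function densify(D::Diagonal{T}) where {T}
    S = promote_type(T, typeof(zero(T)))
    n = length(D.diag)
    return S[i == j ? D.diag[i] : zero(T) for i in 1:n, j in 1:n]
end

M = densify(Diagonal([Var(1), Var(2)]))
eltype(M)   # Aff, not Var
```

A conversion that insists on `Matrix{T}` has nowhere to store the zeros, because `zero(T)` is not a `T` for such types.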

Now we get

cc @blegat

dkarrasch · 2022-03-23T09:15:29Z

Thanks for the quick report!!! I'm very sorry about that. I overlooked the special type promotion in diagm. I think I have a fix and maybe even a test to guard against this issue.... alright, JuMP tests pass locally with my fix!

(cherry picked from commit fc9c280)

odow · 2022-03-23T20:05:24Z

Thanks for the fix!

dkarrasch added the performance (Must go faster), linear algebra (Linear algebra), and backport 1.8 (Change should be backported to release-1.8) labels Mar 14, 2022

KristofferC mentioned this pull request Mar 15, 2022

Backports for julia 1.8-beta3/rc1 #44623

Closed

14 tasks

dkarrasch force-pushed the dk/fixq branch 2 times, most recently from 8ec2be0 to 8702ee7 March 15, 2022 17:16

Fix performance bug for * with AbstractQ

4785167

dkarrasch force-pushed the dk/fixq branch from 55a7e6b to 4785167 March 15, 2022 19:09

KristofferC added this to the 1.8 milestone Mar 16, 2022

Merge branch 'master' into dk/fixq

7a87a27

KristofferC mentioned this pull request Mar 18, 2022

More backports for julia 1.8-beta2 #44675

Merged

17 tasks

dkarrasch merged commit fc9c280 into master Mar 22, 2022

dkarrasch deleted the dk/fixq branch March 22, 2022 20:44

odow mentioned this pull request Mar 23, 2022

Tests are broken on nightly jump-dev/JuMP.jl#2930

Closed

dkarrasch mentioned this pull request Mar 23, 2022

Make Matrix cntr work for structured matrices for zero(T) !isa T #44707

Merged

KristofferC pushed a commit that referenced this pull request Mar 23, 2022

Fix performance bug for * with AbstractQ (#44615)

c004dcc

(cherry picked from commit fc9c280)

KristofferC mentioned this pull request Mar 23, 2022

Backports for 1.8-rc1/beta3 #44710

Merged

22 tasks

KristofferC removed the backport 1.8 (Change should be backported to release-1.8) label Mar 29, 2022

dkarrasch mentioned this pull request Aug 1, 2022

Fix multiplication of AbstractQs #46237

Merged

torfjelde mentioned this pull request Aug 21, 2022

Fix test failures on nightly -- was vcat, now kron FluxML/Tracker.jl#125

Open

dkarrasch mentioned this pull request Nov 10, 2022

Don't subtype AbstractQ <: AbstractMatrix #46196

Merged
