Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

5-arg mul! bug fixes #32901

Merged
merged 16 commits into from
Aug 16, 2019
Merged

5-arg mul! bug fixes #32901

merged 16 commits into from
Aug 16, 2019

Conversation

tkf
Copy link
Member

@tkf tkf commented Aug 14, 2019

Another followup of #29634:

  • Enforce strong-zero behavior for alpha
  • Fix 5-arg mul! for Bi/Tri/Sym * Diag

@tkf tkf changed the title Enforce strong-zero behavior for alpha 5-arg mul! bug fixes Aug 15, 2019
@tkf tkf marked this pull request as ready for review August 15, 2019 07:35
@KristofferC KristofferC added this to the 1.3 milestone Aug 15, 2019
@ararslan ararslan requested a review from andreasnoack August 15, 2019 22:59
@tkf
Copy link
Member Author

tkf commented Aug 16, 2019

I ran the "full test" on 9da77a4 by commenting out this line

# n_samples = Inf # to try all combinations

All tests passed (after more than 6 hours...). As a side note, I find --compile=min helps a lot as the bottleneck of this test is compilation.

Copy link
Member

@andreasnoack andreasnoack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except for the rending issue

@@ -277,81 +277,83 @@ function rmul!(A::AbstractMatrix, transB::Transpose{<:Any,<:Diagonal})
end

# Elements of `out` may not be defined (e.g., for `BigFloat`). To make
# `mul!(out, A, B)` work for such cases, `_scaledout` short-circuits
# `mul!(out, A, B)` work for such cases, `out .*ₛ beta` short-circuits
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't render for me. Which character are you using here?

Copy link
Member Author

@tkf tkf Aug 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's *_s (for short-circuiting):

image

Maybe (superscript s) or *′ (prime) is better?

Or we can pick a non-standard character (maybe ?) from:

* / ÷ % & ⋅ ∘ × |\\| ∩ ∧ ⊗ ⊘ ⊙ ⊚ ⊛ ⊠ ⊡ ⊓ ∗ ∙ ∤ ⅋ ≀ ⊼ ⋄ ⋆ ⋇ ⋉ ⋊ ⋋ ⋌ ⋏ ⋒ ⟑ ⦸ ⦼ ⦾ ⦿ ⧶ ⧷ ⨇ ⨰ ⨱ ⨲ ⨳ ⨴ ⨵ ⨶ ⨷ ⨸ ⨻ ⨼ ⨽ ⩀ ⩃ ⩄ ⩋ ⩍ ⩎ ⩑ ⩓ ⩕ ⩘ ⩚ ⩜ ⩞ ⩟ ⩠ ⫛ ⊍ ▷ ⨝ ⟕ ⟖ ⟗

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can also use normal function like mulsc. The code would look like:

mulsc.(A.diag .* in, alpha) .+ mulsc.(out, beta)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to wikipedia ⊙ is sometimes used for element-wise product.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is indeed the notation that my profs used for element-wise product.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a non-standard binary operator and directly define the broadcasted behavior of it. (The usecase here is to hijack the broadcasted behavior to implement short-circuiting at the level of vector-vector multiplication.)

So using would mean to write .⊙ which looks a bit strange as it sounds like doing "doubly" element-wise operation on nested objects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JeffBezanson
Copy link
Member

We can worry about the choice of operator later, to get these fixes in.

@JeffBezanson JeffBezanson merged commit b5f4e87 into JuliaLang:master Aug 16, 2019
@timholy
Copy link
Member

timholy commented Mar 9, 2022

On one of my machines, the LinearAlgebra/addmul tests take 3 times as long as the third-slowest test and twice as long as the second-slowest test. That means if you're running Julia's tests locally you spend most of the time waiting for this one test to finish. Is this something we should consider optimizing, or given buildkite is it not worth it?

Of course, it might also be a good compiler benchmark, maybe worth investigating.

@timholy
Copy link
Member

timholy commented Mar 9, 2022

Part of the answer:

julia> m = only(methods(LinearAlgebra._generic_matmatmul!))
_generic_matmatmul!(C::AbstractVecOrMat{R}, tA, tB, A::AbstractVecOrMat{T}, B::AbstractVecOrMat{S}, _add::LinearAlgebra.MulAddMul) where {T, S, R} in LinearAlgebra at /home/tim/src/julia-branch2/usr/share/julia/stdlib/v1.9/LinearAlgebra/src/matmul.jl:850

julia> length(filter(mi -> mi !== nothing, collect(m.specializations)))
5272

😆

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants