
Swap adjtrans and triangular wrappers #38168

Merged
merged 5 commits into master Nov 10, 2020
Conversation

dkarrasch
Member

@dkarrasch dkarrasch commented Oct 25, 2020

This PR swaps the order of the Adjoint/Transpose and *Triangular wrappers. It restores, in a way, the pre-v0.7 behavior (see #25364), without losing laziness. This just feels so right: many mul/solve methods can now simply be deleted. This perhaps requires a benchmark run and/or a package evaluation run to see whether it breaks anything. Since the tests are untouched, there shouldn't be anything user-facing, so I consider it an internal change. By the way, this is consistent with StaticArrays.

Let me ping a couple of people to collect as much feedback as possible: @mateuszbaran, @Sacha0, @andreasnoack, @ViralBShah

Closes #38093.
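
To illustrate the swap (a minimal sketch; the wrapper types are exactly those discussed in this thread, the rest is illustrative):

    using LinearAlgebra

    L = LowerTriangular(rand(3, 3))

    # Before this PR, adjoint(L) was wrapped Adjoint-outside:
    #   Adjoint{Float64, LowerTriangular{Float64, Matrix{Float64}}}
    # After this PR, the triangular wrapper stays outside:
    #   UpperTriangular{Float64, Adjoint{Float64, Matrix{Float64}}}
    # so L' dispatches straight to the existing UpperTriangular mul/solve
    # methods, and dedicated Adjoint{<:Any,<:LowerTriangular} methods can go.
    L' isa UpperTriangular  # true with this change
    L' \ rand(3)            # hits the plain UpperTriangular solve path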

@dkarrasch dkarrasch added the linear algebra and sparse labels Oct 25, 2020
@dkarrasch dkarrasch closed this Oct 25, 2020
@dkarrasch dkarrasch reopened this Oct 25, 2020
@dkarrasch
Copy link
Member Author

Do we need to run some kind of benchmark or anything?

@ViralBShah
Member

I suspect we don't have benchmarks that will test this, but perhaps good to run a few cases manually to ensure no regressions.

@dkarrasch
Member Author

The only possible source of regression is a missing specific method and hence a fallback to a slower generic method. This is very hard to test "manually". The few manual tests did not show any regression. I made sure there are no Adjoint/Transpose{<:Any,<:*Triangular} methods left, which would not get called after we swap the wrappers. So, how shall we proceed here?
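
For reference, one rough way to do such a scan (a heuristic sketch of my own, matching printed method signatures; not necessarily how it was actually checked here):

    using LinearAlgebra

    # Print any mul!/ldiv!/rdiv! method whose signature still puts
    # Adjoint/Transpose outside a *Triangular type (the old wrapper
    # order, which would no longer be dispatched to after the swap).
    pat = r"(Adjoint|Transpose)\{.*<:(Unit)?(Lower|Upper)Triangular"
    for f in (mul!, ldiv!, rdiv!), m in methods(f)
        occursin(pat, string(m.sig)) && println(m)
    end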

@ViralBShah
Member

ViralBShah commented Nov 9, 2020

I think we can go ahead and merge in that case, and count on folks to report uncaught regressions, if any.

@dkarrasch
Member Author

There's a timeout in the macOS test, seemingly in the matmul area. I'll restart to see if this is a true issue.

@dkarrasch dkarrasch closed this Nov 9, 2020
@dkarrasch dkarrasch reopened this Nov 9, 2020
@dkarrasch dkarrasch merged commit 4704efd into master Nov 10, 2020
@dkarrasch dkarrasch deleted the dk/adjtranstriangular branch November 10, 2020 15:01
@ararslan
Member

This seems likely to have broken the MixedModels.jl tests on Julia master.

@palday
Contributor

palday commented Nov 10, 2020

Building on @ararslan's comment: The tests passed today before this branch was merged (see e.g. here), but are failing now (see e.g. here) due to failed method resolution for rdiv!.

cc: @dmbates

@palday
Contributor

palday commented Nov 10, 2020

And I can now confirm on local builds of Julia: without this change, the tests pass; with this change, they fail:

  MethodError: no method matching rdiv!(::BlockedSparse{Float64, 1, 2}, ::UpperTriangular{Float64, Adjoint{Float64, UniformBlockDiagonal{Float64}}})
  Closest candidates are:
    rdiv!(::StridedMatrix{T} where T, ::UpperTriangular{var"#s814", var"#s813"} where var"#s813"<:Adjoint where var"#s814") at ~/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/triangular.jl:1430
    rdiv!(::StridedMatrix{T} where T, ::UpperTriangular) at ~/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/triangular.jl:1327
    rdiv!(::AbstractMatrix{T} where T, ::Transpose{var"#s826", var"#s825"} where var"#s825"<:LU where var"#s826") at ~/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lu.jl:689

@dkarrasch
Member Author

Sorry about that. Could anyone quickly help me? Which rdiv! method used to be called before this commit? Is it really

rdiv!(::AbstractMatrix{T} where T, ::Adjoint{var"#s826", var"#s825"} where var"#s825"<:LU where var"#s826") at ~/julia/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/lu.jl:689

That seems strange: why would you go via the LU path if you already have a triangular matrix?

@andreasnoack
Member

@palday Could you please share the complete stacktrace? With Julia 1.5.2, I'm getting

julia> methods(rdiv!, Tuple{MixedModels.BlockedSparse,UpperTriangular})
# 0 methods for generic function "rdiv!":

so it would be good to understand why that method ends up getting called.

@palday
Contributor

palday commented Nov 11, 2020

From the linked CI failure:

  MethodError: no method matching rdiv!(::BlockedSparse{Float64, 1, 2}, ::UpperTriangular{Float64, Adjoint{Float64, UniformBlockDiagonal{Float64}}})
  Closest candidates are:
    rdiv!(::StridedMatrix{T} where T, ::UpperTriangular{var"#s814", var"#s813"} where var"#s813"<:Adjoint where var"#s814") at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/triangular.jl:1430
    rdiv!(::StridedMatrix{T} where T, ::UpperTriangular) at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/LinearAlgebra/src/triangular.jl:1327
    rdiv!(::BlockedSparse{T, S, P}, ::Adjoint{T, var"#s35"} where var"#s35"<:LowerTriangular{T, UniformBlockDiagonal{T}}) where {T, S, P} at /home/runner/work/MixedModels.jl/MixedModels.jl/src/linalg.jl:75
    ...
  Stacktrace:
    [1] updateL!(m::LinearMixedModel{Float64})
      @ MixedModels ~/work/MixedModels.jl/MixedModels.jl/src/linearmixedmodel.jl:934
    [2] (::MixedModels.var"#obj#55"{Bool, LinearMixedModel{Float64}})(x::Vector{Float64}, g::Vector{Float64})
      @ MixedModels ~/work/MixedModels.jl/MixedModels.jl/src/linearmixedmodel.jl:332
    [3] fit!(m::LinearMixedModel{Float64}; verbose::Bool, REML::Bool)
      @ MixedModels ~/work/MixedModels.jl/MixedModels.jl/src/linearmixedmodel.jl:337
    [4] fit!(m::LinearMixedModel{Float64})
      @ MixedModels ~/work/MixedModels.jl/MixedModels.jl/src/linearmixedmodel.jl:323
    [5] macro expansion
      @ ~/work/MixedModels.jl/MixedModels.jl/test/linalg.jl:96 [inlined]
    [6] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1146 [inlined]
    [7] macro expansion
      @ ~/work/MixedModels.jl/MixedModels.jl/test/linalg.jl:94 [inlined]
    [8] macro expansion
      @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.6/Test/src/Test.jl:1146 [inlined]
    [9] top-level scope
      @ ~/work/MixedModels.jl/MixedModels.jl/test/linalg.jl:78
   [10] include(fname::String)
      @ Base.MainInclude ./client.jl:444
   [11] top-level scope
      @ ~/work/MixedModels.jl/MixedModels.jl/test/runtests.jl:15
   [12] include(fname::String)
      @ Base.MainInclude ./client.jl:444
   [13] top-level scope
      @ none:6
   [14] eval(m::Module, e::Any)
      @ Core ./boot.jl:360
   [15] exec_options(opts::Base.JLOptions)
      @ Base ./client.jl:261
   [16] _start()
      @ Base ./client.jl:485

I'm not surprised that there are no generic methods -- a large part of the magic of MixedModels comes from some very specialized methods.

We've occasionally had other issues related to method resolution changing between Julia versions and causing problems, so it's possible the interaction of a change in method resolution and this change caused the problem.

@dkarrasch has proposed a MixedModels PR that changes the signature

function LinearAlgebra.rdiv!(
    A::BlockedSparse{T,S,P},
    B::Adjoint{T,<:LowerTriangular{T,UniformBlockDiagonal{T}}},
) where {T,S,P}

to

function LinearAlgebra.rdiv!(
    A::BlockedSparse{T,S,P},
    B::UpperTriangular{T,<:Adjoint{T,UniformBlockDiagonal{T}}},
) where {T,S,P}

(and places the maintenance burden on us).
Swapping the order of the wrappers means that we went from LowerTriangular to UpperTriangular.

@dkarrasch
Member Author

There is no additional maintenance burden on you. In your package, you bothered to write a specialized method for your own types in the first place. The change from lower to upper triangular is natural, because the adjoint of a lower-triangular matrix is upper triangular, so "what you see is what you have". 😄
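
In REPL terms (a quick check; plain matrices stand in for the MixedModels types):

    julia> using LinearAlgebra

    julia> typeof(LowerTriangular(rand(2, 2))')
    UpperTriangular{Float64, Adjoint{Float64, Matrix{Float64}}}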

@palday
Contributor

palday commented Nov 11, 2020

There is no additional maintenance burden on you. In your package, you bothered to write a specialized method for your own types in the first place.

I disagree. There are now 50 new lines of code in two new methods and a Julia-version-dependent codepath. That is additional code to maintain for a change that we did not introduce. Moreover, this changed the return type of a public-facing API in a (future) minor release, and the change did not fix a correctness problem. In other words, a breaking change was introduced in the standard lib that did not impact correctness, and now package maintainers have to react. In other languages I might not consider changing the type a big problem, but I do in Julia.

(This is separate from the question of whether the additional methods in MixedModels are the best solution. That may very well be the case, and we're thankful for your PR.)

The change from lower to upper triangular is natural, because the adjoint of a lower-triangular matrix is upper triangular, so "what you see is what you have".

Yes, I understand that; I was mostly stating the obvious to point out why @andreasnoack's use of methods didn't find anything. But this does highlight that even obvious consequences can have non-obvious impacts when searching for the effects of a change. The change in ordering means that you can't simply swap the order of the wrapper types in a signature; you also have to exchange the triangular types. Again, the swap in types follows logically from the math behind it, but such consequences are sometimes hard to think through, much like this non-breaking change actually breaking things. In other words, this puts paid to the comment:

The only possible source of regression is a missing specific method and hence a fallback to a slower generic method.

The missing specific methods in MixedModels arose because the return type of methods from the standard lib changed. Admittedly, we don't have generic methods to fall back on. But again, the return type of something in the standard lib changed.

All this highlights the old adage that tests can show the presence of errors but not their absence.

@ViralBShah
Member

ViralBShah commented Nov 11, 2020

Your points are all valid, and they greatly helped me understand the issue here. I do consider the earlier behaviour incorrect, although I understand that arguments can be made both ways about whether this is a breaking change or a bugfix. This now does the right thing, and since 1.6 is likely to be the new LTS, it is also timely and sets us up better going forward.

@palday
Contributor

palday commented Nov 11, 2020

I can see the reasons for getting this change in before LTS. That said,

  1. even as a bugfix, it's still a breaking change in a minor version (and I am of two minds whether changing the return type is a bugfix or not, but am willing to go with the opinion of experts here). As a bugfix, it's 'allowed', but....
  2. ... there is no documentation update to reflect this
  3. ... the same tests (for Julia-lang) pass on both the old and the new behavior. No new or changed tests were part of the PR. (Or did I miss something obvious here? That happens more than I care to admit.)

The methods issue in MixedModels is more a symptom than a deep issue for me. (And despite my protests, I am willing to add/support new methods.) My problem is that there is a change in the type contract here that made it through the review process without comment. @ararslan and I were able to track it down quickly because MixedModels just happened to have two PRs a few hours before and a few hours after the change here landed.

@dkarrasch
Member Author

I agree we should have an extensive NEWS item on that. The fact that no new tests were needed and all old tests passed was an indication for me that this change might be "sufficiently internal". Well, it wasn't. In my defense, I didn't know where to shout to get the attention of interested folks. Next time I'll call you for a review. 😉

@palday
Contributor

palday commented Nov 11, 2020

Next time I'll call you for a review. 😉

Ha, works for me. I generally have no opinion or unreasonably strong ones and nothing in between.

That said: can we (=you) add a test that checks the implicit type contract here?
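
One possible shape for such a test (a sketch; the element type and the wrapped matrix are placeholders):

    using LinearAlgebra, Test

    A = rand(3, 3)
    # Pin down the wrapper order so that swapping it back fails loudly:
    @test adjoint(LowerTriangular(A)) isa UpperTriangular{Float64,<:Adjoint}
    @test transpose(UpperTriangular(A)) isa LowerTriangular{Float64,<:Transpose}
    @test adjoint(UpperTriangular(A)) isa LowerTriangular{Float64,<:Adjoint}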

@maleadt
Member

maleadt commented Nov 20, 2020

FWIW, this turned out massively breaking for CUDA.jl, where we didn't just "bother" to write specialized implementations for methods, but actually require them (generic fallbacks still work to some extent, but we can't be calling CPU BLAS).

maleadt added commits to JuliaGPU/CUDA.jl that referenced this pull request Nov 20, 2020
@dkarrasch
Member Author

Interesting. It didn't pop up in the PkgEval run. My apologies once again! Can I help with anything?

@maleadt
Member

maleadt commented Nov 20, 2020

No problem, I think I have it figured out (see the linked commits, though they are not easily reviewable because I did some clean-up at the same time). PkgEval currently doesn't run GPU packages because the system doesn't have a GPU.

@dkarrasch
Member Author

I found it easier to keep the function body and, in the signature, swap the order of the wrapper types and exchange Upper and Lower, rather than switching only the wrapper types and adjusting whatever is contained in the function body. But you have already made great progress. Are you going to release new versions that are restricted to Julia v1.6+?
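
For packages supporting both sides of the change, the version-gated dual definition might look roughly like this (sketched against the MixedModels signatures quoted above; BlockedSparse and UniformBlockDiagonal are MixedModels' own types, and the version cutoff and the shared helper _rdiv_core! are placeholders):

    using LinearAlgebra

    @static if VERSION < v"1.6.0-DEV"  # placeholder cutoff for the wrapper swap
        LinearAlgebra.rdiv!(
            A::BlockedSparse{T,S,P},
            B::Adjoint{T,<:LowerTriangular{T,UniformBlockDiagonal{T}}},
        ) where {T,S,P} = _rdiv_core!(A, parent(parent(B)))
    else
        LinearAlgebra.rdiv!(
            A::BlockedSparse{T,S,P},
            B::UpperTriangular{T,<:Adjoint{T,UniformBlockDiagonal{T}}},
        ) where {T,S,P} = _rdiv_core!(A, parent(parent(B)))
    end

    # In both branches parent(parent(B)) unwraps to the raw
    # UniformBlockDiagonal, so the helper sees the same data.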

@maleadt
Member

maleadt commented Nov 20, 2020

Are you going to release new versions that are restricted to Julia v1.6+?

Yes, unrelated changes to the Julia compiler necessitate that anyway.

maleadt added commits to JuliaGPU/CUDA.jl that referenced this pull request Nov 24, 2020