
Infinite recursion of \ on sparse matrices #283

Closed
yuyichao opened this issue Nov 15, 2015 · 39 comments
Labels
sparse Sparse arrays

Comments

@yuyichao

Update: there are multiple problems behind the original report. Use this issue to track the most directly related linalg issue; see #283 for the repro. This causes a regression on 0.5, but the same issue is also present on 0.4 in different forms.

Original post,

This is causing WoodburyMatrices.jl tests to freeze.

Minimal repro: code_warntype(\, Tuple{Tridiagonal{Float64},SparseMatrixCSC{Float64,Int64}}).

I don't know how long it is ultimately going to take, but it's more than five minutes on my laptop. A GDB backtrace shows that it is struggling in type inference on some functions (I haven't checked which function/types it is, since the call chain is super deep and it's a little hard to locate the function it got stuck in...).

This seems to be a recent regression. I remember the WoodburyMatrices test was fine on the version of julia I used when I made mlubin/ReverseDiffSparse.jl#22 (15 days ago), but it's not fine on a 10-day-old julia 685f1b0, and there's been no update to the WoodburyMatrices package within this period. I might do a bisect later.

@timholy This breaks your package's test so you might want to know this.

@yuyichao yuyichao added the regression Regression in behavior compared to a previous version label Nov 15, 2015
@yuyichao
Author

Bisected to 87e26c516853dd395de3c50406076f2d4d7aa413. @andreasnoack

yuyichao% git bisect good 
87e26c516853dd395de3c50406076f2d4d7aa413 is the first bad commit
commit 87e26c516853dd395de3c50406076f2d4d7aa413
Author: Andreas Noack <andreasnoackjensen@gmail.com>
Date:   Sun Nov 1 12:17:31 2015 -0500

    Consolidate the \ methods for sparse and dense and make the necessary adjustment
    to other methods.

    Solve real lhs problems with complex rhs by reinterpreting the rhs.

    Let factorize(SparseMatrixCSC) check for tringular matrices

:040000 040000 9082f51ee65f12eddae7001fc54a93f945110533 4eb784944d856fc2a301943f0421d2cb165c1214 M      base
:040000 040000 4cbaea91f2bf4fe556206914ff723d8b44dd246e eee94036e2f68946d4c8810cefa401ed138b23de M      test
office:~/projects/julia/master
yuyichao% git bisect log 
git bisect start
# good: [505be935c82e31d3ebe1d275a9f88fb8782cdb1d] Merge pull request JuliaLang/julia#13829 from JuliaLang/yyc/precompile-stderr
git bisect good 505be935c82e31d3ebe1d275a9f88fb8782cdb1d
# bad: [685f1b0b5094ed9efbaa109fbeb36d20cdc35285] Merge pull request JuliaLang/julia#13872 from JuliaLang/amitm/rrfix
git bisect bad 685f1b0b5094ed9efbaa109fbeb36d20cdc35285
# bad: [40d46e7146dd5343c91d7df9df080cbd59e253c5] on llvm37, re-disable the frame-pointer-elim optimization so that unwinding and backtraces usually work
git bisect bad 40d46e7146dd5343c91d7df9df080cbd59e253c5
# good: [9fe603724a738fe6289f152c0abea5bfea4f58e6] Fix whitespace error introduced in last commit
git bisect good 9fe603724a738fe6289f152c0abea5bfea4f58e6
# good: [26839d9f95ef01e4e3708a04369bdbea63b0afb2] Merge pull request JuliaLang/julia#13835 from hayd/markdown_spaces
git bisect good 26839d9f95ef01e4e3708a04369bdbea63b0afb2
# bad: [af3e15d37b70a9057c16cef162767ac97d30ba56] Merge pull request JuliaLang/julia#13793 from JuliaLang/anj/sparsesolve
git bisect bad af3e15d37b70a9057c16cef162767ac97d30ba56
# bad: [87e26c516853dd395de3c50406076f2d4d7aa413] Consolidate the \ methods for sparse and dense and make the necessary adjustment to other methods.
git bisect bad 87e26c516853dd395de3c50406076f2d4d7aa413
# good: [81753ccb9d9fc5d8792fa1847bea645d007ebd1d] Use sparse triangular solvers for sparse triangular solves. Fixes JuliaLang/julia#13792.
git bisect good 81753ccb9d9fc5d8792fa1847bea645d007ebd1d
# first bad commit: [87e26c516853dd395de3c50406076f2d4d7aa413] Consolidate the \ methods for sparse and dense and make the necessary adjustment to other methods.

@yuyichao
Author

Even though this is bisected to a commit that changed the definition of \, I guess type inference should also be more robust and shouldn't fall into an infinite loop in this case?

@andreasnoack
Member

Yes. I don't think there is anything wrong with the method.

@yuyichao
Author

I think there are multiple problems here.

  1. LowerTriangular(::LowerTriangular{Float64,Tridiagonal{Float64}}) returns LowerTriangular{Float64,LowerTriangular{Float64,Tridiagonal{Float64}}} instead of LowerTriangular{Float64,Tridiagonal{Float64}}. (i.e. it makes a copy that shares the data instead of returning the input directly; a minimal sketch follows at the end of this comment.) I think this is a missing ambiguity warning. Ref Avoid unnecessary conversions of nullables. julia#12424 (comment)
  2. Type inference doesn't have a recursion limit and recurses infinitely when trying to specialize \ with the first parameter being every combination of recursively nested UpperTriangular and LowerTriangular.
  3. There doesn't seem to be an appropriate specialization for UpperTriangular/LowerTriangular and SparseMatrixCSC, so even though with this particular parameter the dispatch should go to qrfact, there's a genuine infinite recursion for \(::UpperTriangular, ::SparseMatrixCSC) (and also lower triangular). I'm not super familiar with these, but I guess this is a valid argument combination, and even if I'm wrong I think this function should raise an appropriate error instead of a stack overflow.

I think all three problems need to be fixed. For this particular case (the WoodburyMatrices test), fixing 2 should be enough (since it's a tridiagonal matrix), but a slightly tweaked value (one that is upper or lower triangular) would easily trigger the infinite recursion again, so 3 should also be fixed even once the more general issue in typeinf is fixed.
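
For illustration, a minimal sketch of the nesting described in point 1 (0.4-era syntax; the typeof comment just restates the reported behavior):

A  = Tridiagonal(rand(4), rand(5), rand(4))
L1 = LowerTriangular(A)   # LowerTriangular{Float64,Tridiagonal{Float64}}
L2 = LowerTriangular(L1)  # wraps the wrapper instead of returning L1
typeof(L2)                # LowerTriangular{Float64,LowerTriangular{Float64,Tridiagonal{Float64}}}

Type inference keeps constructing ever deeper wrapper types like this while specializing \, which is the recursion in point 2.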

@yuyichao
Author

And to demonstrate point 3

julia> U = sprandn(5, 2, 0.2)
5x2 sparse matrix with 1 Float64 entries:
        [3, 1]  =  1.14495

julia> A2 = UpperTriangular([1. 2. 3. 4. 5.
                             0  1. 0. 0. 0.
                             0  0  1. 0. 0.
                             0  0  0  1. 0.
                             0  0  0  0  1.])
5x5 UpperTriangular{Float64,Array{Float64,2}}:
 1.0  2.0  3.0  4.0  5.0
 0.0  1.0  0.0  0.0  0.0
 0.0  0.0  1.0  0.0  0.0
 0.0  0.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0

julia> A2 \ U
ERROR: StackOverflowError:
 in \ at ./linalg/generic.jl:329

edit: this also happens on 0.4

@yuyichao yuyichao changed the title type inference freeze for \(::Tridiagonal{Float64}, ::SparseMatrixCSC{Float64,Int64}) Infinite recursion of \ on sparse matrices Nov 15, 2015
@yuyichao
Author

I think this particular issue will be fixed when the necessary specialization is provided (since type inference will dispatch it to other methods instead of infinitely recursing on the same method with different types).

I've updated the issue and will open another issue for the general type inference problem. (edit JuliaLang/julia#14009)
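
For illustration, one form such a specialization could take (hypothetical sketch only, not necessarily the fix that will land):

import Base.(\)
# Give the sparse right hand side its own method so dispatch no longer reaches the
# generic (\)(AbstractMatrix, AbstractVecOrMat) fallback and recurses during inference.
(\)(A::Tridiagonal, B::SparseMatrixCSC) = lufact(A) \ full(B)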

@yuyichao yuyichao added linear algebra and removed regression Regression in behavior compared to a previous version labels Nov 15, 2015
@yuyichao
Author

Also note that when type inference is fixed to deal with these, the inferred type for the current definition would likely be ::Any.

@Mirage10

In 0.5 dev: A2\U -> infinite loop
In 0.4.1 : A2\U -> StackOverflowError

@andreasnoack
Member

Thanks for analyzing the issues here. We should probably define a fallback for sparse right hand sides using full(B), because it's very unlikely that the solution will be sparse even when both the problem matrix and the right hand side are sparse, e.g.

julia> Bidiagonal(ones(5), ones(4), false)\eye(5,1)
5x1 Array{Float64,2}:                              
  1.0                                              
 -1.0                                              
  1.0                                              
 -1.0                                              
  1.0                                              

However, there could easily be other situations where some combination of A and B isn't defined, so I think it is necessary to have a way to avoid the infinite loop in inference. Is your fix of 1. enough to avoid it?

@yuyichao
Author

Is your fix of 1. enough to avoid it?

Unfortunately no, since type inference ends up constructing LowerTriangular(UpperTriangular(.....)).

@yuyichao
Author

Can there be something like \(::LowerTriangular, ::AbstractMatrix)? Or is that going to cause too many ambiguity warnings....

@andreasnoack
Member

So do we have to break the chain here as well? I.e. LowerTriangular(UpperTriangular(A))=Diagonal? That is probably a good idea regardless of the type inference issue.

Can there be something like \(::LowerTriangular, ::AbstractMatrix)?

We could fall back to \(A::AbstractTriangular, B::AbstractMatrix) = A\convert(Array, B).
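
For the first idea, a hypothetical sketch of what such chain-breaking constructors could look like (these definitions are not in Base; the "surprise" is that the constructor doesn't return an instance of its own type):

# Wrapping a triangular matrix of the opposite kind can only keep the diagonal,
# so return a Diagonal directly and break the nesting chain.
LowerTriangular(A::UpperTriangular) = Diagonal(diag(A))
UpperTriangular(A::LowerTriangular) = Diagonal(diag(A))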

@yuyichao
Author

LowerTriangular(UpperTriangular(A))=Diagonal

LOL, I like this. A less surprising way (for the type system) could be UpperTriangular(LowerTriangular(UpperTriangular(A))) == UpperTriangular(LowerTriangular(A)) but it will probably not have the benefit of treating the result as a Diagonal matrix. I'm not sure which one is better. (Do we have other cases where !isa((T::DataType)(a), T))

@andreasnoack
Member

(Do we have other cases where !isa((T::DataType)(a), T))

Not that I'm aware of. Most people probably expect a constructor to construct an instance of the type that shares name with the constructor.

I've never really thought about this issue with nested special matrices before, but it's an instance of constructors not being generic, which can explain the need for functions like lufact, qrfact, and cholfact that are "constructors" of LU, QR, and Cholesky factorizations, yet isa(cholfact(::Matrix), Cholesky) and isa(cholfact(::SparseMatrixCSC), CHOLMOD.Factor). A solution could be to let tril and triu return Lower/UpperTriangular, but I think the need for these "pseudo constructors" is a general consideration for the language beyond just linear algebra. Maybe it's a solved problem in language design, but then I'd like to know the solution.

@Sacha0
Member

Sacha0 commented Nov 18, 2015

Test data and analysis for \ operating on sparse RHSs (both SparseMatrixCSCs and SparseVectors, with a broad array of LHSs) follows below.

TL;DR The fix Andreas proposed for some aspects of this issue, at the REPL

import Base.(\)
import Base.LinAlg.AbstractTriangular
(\)(A::AbstractTriangular, B::AbstractVecOrMat) = A \ convert(Array, B)

patches up the mode of failure described above in all cases tested below. In some cases eliminating this first mode of failure reveals an unrelated second mode of failure, namely that lufact(A) \ B lacks appropriate methods for various combinations of A and B.

Shall I construct a PR implementing the fix, removing the methods it obviates, and introducing tests covering the fixed cases? Additionally, should I open new issues for the various modes of lufact(A) \ B failure?

Setup code:

order = 5
varscale = 2 * order
fillprob = 0.2
perturbationvec(order) = randn(order) / varscale;
perturbationmat(order) = randn(order, order) / varscale;
spperturbationmat(order) = sprandn(order, order, fillprob) / varscale;

diagvec = perturbationvec(order) + ones(order);
subdiagvec = perturbationvec(order - 1);
superdiagvec  = perturbationvec(order - 1);

diagmat = Diagonal(diagvec);
lbidiagmat = Bidiagonal(diagvec, subdiagvec, false);
ubidiagmat = Bidiagonal(diagvec, superdiagvec, true);
tridiagmat = Tridiagonal(subdiagvec, diagvec, superdiagvec);
symtridiagmat = SymTridiagonal(diagvec, superdiagvec);

densemat = eye(order) + perturbationmat(order);
ltdensemat = LowerTriangular(densemat);
utdensemat = UpperTriangular(densemat);
symdensemat = Symmetric(densemat);
hrmdensemat = Hermitian(densemat);

sparsemat = speye(order) + spperturbationmat(order);
ltsparsemat = LowerTriangular(sparsemat);
utsparsemat = UpperTriangular(sparsemat);
symsparsemat = Symmetric(sparsemat);
hrmsparsemat = Hermitian(sparsemat);

cscvector = sparse([1], [1], [0.1], order, 1);
sparsevector = sparsevec([1], [1.0], order);

Tests and analysis:

# Common failure mode 1: Dispatches to method (\)(AbstractMatrix, AbstractVecOrMat),
# which begins one form or another of infinite loop as described earlier in this issue.
#
# Common failure mode 2: Dispatches to method (\)(A::AbstractMatrix, B::AbstractVecOrMat),
# which correctly calls `lufact(A) \ B`, which subsequently fails for lack of appropriate
# methods.

# Without fix, success: Dispatches to method (\)(Diag, AbstractMatrix), which works.
diagmat \ cscvector;
diagmat \ sparsevector;
# With fix, still succeeds.

# Without fix, success: Dispatches to method (\)(Bidiagonal, AbstractMatrix), which works.
lbidiagmat \ cscvector;
ubidiagmat \ cscvector;
lbidiagmat \ sparsevector;
ubidiagmat \ sparsevector;
# With fix, still succeeds.

# Without fix, failure: Common failure mode 1.
tridiagmat \ cscvector;
tridiagmat \ sparsevector;
# With fix, succeeds.

# Without fix, failure: Common failure mode 2. Specifically, lufact(SymTridiagonal)
# calls lufact!(SymTridiagonal), but the latter has no method for SymTridiagonal.
# Hence this actually occurs independent of RHS. So we have another issue:
# Fix (\)(SymTridiagonal, X) where X<:AbstractVecOrMat.
symtridiagmat \ cscvector;
symtridiagmat \ sparsevector;
# With fix, still fails as above.

# Without fix, failure: Common failure mode 1.
densemat \ cscvector;
densemat \ sparsevector;
# With fix, changes to failure mode 2. Specifically, ultimately
# A_ldiv_B!(LU, SparseMatrixCSC) and A_ldiv_B!(LU, SparseVector)
# get called, but no such methods exist.

# Without fix, failure: Common failure mode 1.
ltdensemat \ cscvector;
ltdensemat \ sparsevector;
utdensemat \ cscvector;
utdensemat \ sparsevector;
# With fix, success.

# Without fix, failure: Common failure mode 1.
symdensemat \ cscvector;
symdensemat \ sparsevector;
hrmdensemat \ cscvector;
hrmdensemat \ sparsevector;
# With fix, changes to failure mode 2. Specifically, lufact(A) calls
# lufact!(A), but the latter has no methods for Symmetric{Array}
# and Hermitian{Array}.

# Without fix, failure: Common failure mode 2. Specifically,
# ultimately A_ldiv_B!(UmfpackLU, SparseMatrixCSC) gets called,
# but no such method exists.
sparsemat \ cscvector;
# With fix, still fails as above.

# Without fix, failure: Common failure mode 1.
sparsemat \ sparsevector;
# With fix, changes to failure mode 2. Specifically, ultimately
# A_ldiv_B!(UmfpackLU, SparseVector) gets called,
# but no such method exists.

# Without fix, success: Dispatches to methods
# (\)(LowerTriangular{SparseMatrixCSC}, SparseMatrixCSC) and
# (\)(UpperTriangular{SparseMatrixCSC}, SparseMatrixCSC), which work.
ltsparsemat \ cscvector;
utsparsemat \ cscvector;
# With fix, still succeeds, but the fix obviates the specialized methods
# mentioned just above, given they simply call, e.g.,
# (\)(LowerTriangular{SparseMatrixCSC}, full(SparseMatrixCSC)),
# which is what the fix does more generally.

# Without fix, failure: Common failure mode 1.
ltsparsemat \ sparsevector;
utsparsemat \ sparsevector;
# With fix, success.

# Without fix, failure: Common failure mode 1.
symsparsemat \ cscvector;
symsparsemat \ sparsevector;
hrmsparsemat \ cscvector;
hrmsparsemat \ sparsevector;
# With fix, changes to failure mode 2. Specifically,
# lufact(A) calls lufact!(A), but the latter has no methods
# for Symmetric{SparseMatrixCSC} and Hermitian{SparseMatrixCSC}.

Edit: Corrected tests.

@andreasnoack
Member

@Sacha0 Thanks for going through this systematically

Shall I construct a PR implementing the fix

Yes, please do. Since the other issues are hidden by the first issue, it might be easier to discuss them in the new PR instead of in a separate issue. We can open a separate issue later if necessary.

@kshyatt kshyatt added the sparse Sparse arrays label Nov 18, 2015
@Sacha0
Member

Sacha0 commented Nov 28, 2015

Working on the pull request discussed above sent me down a surprisingly deep rabbit hole (and in exploring that warren I have been amazed by all the brilliant work the Julia team has done --- hats off!) With new perspective, the approach proposed above seems suboptimal with respect to #136. The alternative approach I am considering involves more substantial modifications, so being a novice it struck me as wise to seek input prior to forging ahead. My thoughts follow below; please advise.

In short: linalg/triangular.jl contains great generic infrastructure. Rather than punting presently-unhandled cases to dense methods, we should make better use of that generic infrastructure and extend it where necessary.

In full: linalg/triangular.jl and linalg/bidiag.jl together appear to contain all generic methods necessary to handle several binary operations (various forms of multiplication and left and right division) where one argument is some form of triangular type and the other is a one- or two-dimensional array type that implements size, getindex, and setindex (with elements belonging to a field, or perhaps just a commutative ring for some operations.) The trouble is these methods are almost uniformly restricted in the non-triangular argument to StridedVector/StridedMatrix/StridedVecOrMat; I imagine this restriction reflects StridedVector/StridedMatrix/StridedVecOrMat being the most general types in the existing type hierarchy that imply definition of size, getindex, and setindex (as AbstractArray seemingly does not guarantee setindex.)

There appears to be no simple/graceful way to precisely capture the interface described above within the existing type system (short of using traits). If so (and prior to a proper traits/interfaces system or equivalent enhancement in expressive power of the type system landing), it seems better to err on the side of making the generic method definitions in linalg/triangular.jl and linalg/bidiag.jl slightly loose (AbstractVector/AbstractMatrix/AbstractVecOrMat, missing the strict setindex requirement) rather than too tight (StridedVector/StridedMatrix/StridedVecOrMat, which exclude legitimate cases and preclude type extension). Concretely, loosening the relevant method definitions should make several calls that presently fail for lack of methods work (including but not limited to many involved in this issue), and enable broader usage of the generic code in linalg/triangular.jl and linalg/bidiag.jl with new types.

Basically another face of JuliaLang/julia#1276.
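
To make the proposed signature loosening concrete, a schematic before/after (simplified, and not the actual definitions; naivesub! is assumed here to stand for the internal substitution kernel in linalg/triangular.jl):

import Base: A_ldiv_B!
# Before: restricted to strided vectors, so SparseVector and other non-strided
# AbstractVector types miss this method and fall into the generic \ fallback.
A_ldiv_B!(A::LowerTriangular, b::StridedVector) = Base.LinAlg.naivesub!(A, b)
# After: only assume the AbstractArray interface (size/getindex/setindex!).
A_ldiv_B!(A::LowerTriangular, b::AbstractVector) = Base.LinAlg.naivesub!(A, b)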

This side of attempting implementation I see two primary potential downsides. First, chances are some ambiguity warnings will have a good laugh at my expense --- not a significant concern. Second, barring introduction of methods presently being discussed in #271 and touched on in JuliaLang/julia#12441 (or using traits), there exists a class of cases this approach will not handle, namely those where the non-triangular argument's type does not support setindex and so requires container-type promotion, for example certain special matrix types. To be clear, those cases fail presently, so this is less a downside than resolution would be a nice addition. With #271 going the way it appears to be going, this approach could eventually handle those cases generically; if #271 goes another way, those cases could be handled elsewhere via tiny specialized methods, or eventually generically with traits. In depth:

The generic promotion methods defined in 4d0b1ac9c8bc50fe15bdfee29e8fa86ee8809f1e/base/linalg/triangular.jl#L986-L1040: take a pair of array-like arguments; determine the element type of the returned array from the operation and argument types; accordingly promote the arguments' element types to ensure type stability in the subsequent calculation; and finally pass the element-promoted arguments to an appropriate in-place method. Specifically, the argument passed in mutating position to the in-place method is promoted/copied via copy_oftype(). copy_oftype() returns a copy with the same container type but promoted element type. Where the relevant argument's container type supports setindex, this works; where it does not, for example as with special matrix types, we need an equivalent of copy_oftype() that also promotes that argument's container type to the tightest equivalent that supports setindex. But no such function exists insofar as I can tell, and the ongoing discussion in #271 regarding similar() and copy() seems to confirm that. The end of JuliaLang/julia#12441 is also relevant. The potential resolutions seem to be felicitous resolution of #271 (perhaps ideal), covering these cases via a number of tiny specialized methods, or covering these cases generically using traits. Again, less a downside than resolution would be a nice addition.
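
For reference, a sketch mirroring the copy_oftype() behavior described above (hypothetical name, 0.4-era syntax, and not necessarily the exact Base definition):

# Copy with the same container type but a (possibly) promoted element type.
copy_oftype_sketch{T}(A::AbstractArray{T}, ::Type{T}) = copy(A)
copy_oftype_sketch{T,S}(A::AbstractArray{T}, ::Type{S}) = convert(AbstractArray{S}, A)
# e.g. copy_oftype_sketch([1 2; 3 4], Float64) yields a 2x2 Array{Float64,2}. For a
# container that does not support setindex! there is no analogous function that also
# promotes the container type, which is the missing piece described above.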

At the same time it might be nice to reduce the minor coupling between linalg/triangular.jl and linalg/bidiag.jl. Said coupling somewhat obfuscates how the generic methods tie together in linalg/triangular.jl.

Thoughts? Should I move forward with this approach? Best!

@tkelman

tkelman commented Nov 29, 2015

I like the sound of where it seems you're going here.

@Sacha0
Member

Sacha0 commented Nov 30, 2015

Cheers --- absent further input I'll start playing with this midweek. Best!

@andreasnoack
Member

@Sacha0 I think you are right about most of this. AbstractArray is really a very broad thing and I've come to the conclusion that whenever you get/setindex through all elements, it's not a good idea to define the method for AbstractArray because for sparse and distributed arrays, it will simply be too inefficient.

It is possible to fix most of this issue by special-casing sparse right hand sides and promoting to dense. That is reasonable to do, but it is not as generic as you (and I) might want. Your proposed solution with similar + copy! has crossed my mind before and I think it is the right way to go, but, as you mention, it will probably require that we decide that similar returns strictly mutable arrays.

@Sacha0
Member

Sacha0 commented Dec 3, 2015

@andreasnoack I share your concern over the performance of sparse and distributed arrays and appreciate the related thoughts you expressed in #136; writing specialized methods where performance is a significant issue is critical, and including warnings along the lines of This operation may be painfully slow on these operands. Use this operation judiciously or not at all here, and consider X instead. certainly seems wise and helpful in some cases.

The three potential changes at hand (generalization of existing generic methods via signature abstraction, generalization of existing generic methods via enhancement of similar/copy, and addition of specialized methods where performance is an issue or to cover presently uncovered cases) strike me as strongly complementary rather than mutually exclusive though (happily their combination addresses your concerns while simultaneously advancing #136), and with relative importance reflected in the preceding ordering --- first write generic methods ensuring things work, then add specialized methods where necessary for performance or other advantage. Beyond the immediately preceding tenet seeming to be the Julian way --- correct me if not --- doing otherwise has the ring of premature optimization. Expansion:

JuliaLang/julia#1276 addresses precisely the question of AbstractArray vs StridedArray for generic methods; though it opens with an essentially identical argument to yours above concerning the performance of sparse and distributed arrays, tshort's counterarguments in favor of removing unnecessary constraint (JuliaLang/julia#1276 (comment) and JuliaLang/julia#1276 (comment)) eventually win out. stevengj, timholy, and others similarly argue against unnecessary constraint in JuliaLang/julia#2345 with like outcome. At an earlier stage, carlobaldassi also argues similarly in https://groups.google.com/forum/?fromgroups#!topic/julia-dev/0pcb-S7uOUw / JuliaLang/julia#750; though that early thread concludes against abstraction, that outcome is explicitly reversed in JuliaLang/julia#6258 and the position in favor of abstraction is affirmed. (Specifically JuliaLang/julia#6258 (comment), JuliaLang/julia#6258 (comment), and the ensuing comments. These comments seem to capture the position that fell out after the dust settled in JuliaLang/julia#987, JuliaLang/julia#5810, and JuliaLang/julia#6212 etc, followed by jiahao's codification in JuliaLang/julia#10064.) Insofar as I have seen, that's the most recent directly related discussion / position.

Furthermore, slow operation within reasonable bounds (a la JuliaLang/julia#6258 (comment)) seems preferable to failure, as perhaps evidenced by @stevengj's #136 and, for example, python's ubiquity. (Incidentally, encountering a sequence of failing operations involving sparse objects brought my attention to this issue.) Additionally, artificially restricting what operations one can perform seems at odds with Julia's 'consenting adults' principle --- particularly where that restriction's purpose is to paternalistically herd users away from relatively obvious performance pitfalls at the potential cost of significant, legitimate generic functionality. Where writing specialized methods cannot mitigate the performance pitfall, warnings like This operation may be painfully slow on these operands. Use this operation judiciously or not at all here, and consider X instead. seem a good fallback instead of restriction and potential associated costs. Addressing performance concerns about generic methods via improvement of the generic methods themselves (where possible) seems like a preferable alternative as well, perhaps via ideas like @tkelman /simonbyrne's intriguing musings regarding eachindex-/eachstored-like indexing (#136, #136, #136, JuliaLang/julia#10902 (comment) .) Finally, the ongoing push towards protocols/traits/interfaces seems to mesh well with the pro-abstraction position. (On that note, given the recent major activity on that front JuliaLang/julia#6975 (comment), perhaps this is a good time to discuss --- elsewhere --- how the protocols proposal would interact with linear algebra?)

More concretely, the alternative approach --- (\)(A::AbstractTriangular, B::AbstractMatrix) = A \ convert(Array, B) --- suffers equivalent, and in some cases perhaps more catastrophic, performance pitfalls: what if B is large and sparse or distributed? In some sense both approaches are generic, with all accompanying ups and downs, but rely on different mechanisms --- one relies on the AbstractArray interface, and the other on [convert() + dense methods]; relying on the AbstractArray interface ultimately seems like the better thing to do. So, given both approaches present the same kinds of potential hazards, why not choose the approach that otherwise presents significant advantages, seems like 'the right thing', and may offer / be commensurate with paths for improvement (rather than being a dead end) going forward?

So, concretely, in an effort to address all concerns above to at least lowest order :), I propose (1) generalizing the existing generic methods via signature abstraction; (2) writing corresponding specialized methods for sparse operands where important for performance; (3) writing corresponding 'warning' methods for distributed operands which caution against use; and (4) once #271 resolves, leveraging the resulting functionality and (1) to cover additional presently-failing cases.

All that said, if this appeal does not sway you I will happily defer to your experience :) --- good conscience demanded that I make this appeal, but I will gladly confine my enthusiasm to the specialized methods for now and get coding :).

@andreasnoack
Member

Good summary. It's maybe worth remembering that these discussions have taken place over a long period when the language was very young, and that they have involved different people over time.

I can support (1), (2), and (4), but I consider (2) the most time critical because I suspect that most users will be affected. I'm more sceptical about (3). We generally move away from warnings. One of the reasons is that, right now, they cannot be caught so you cannot get rid of them even when you know what you are doing. It might be better to wait and see. Julia users are enthusiastic profilers so if a common method shows up as very slow, we'll see an issue and then we can add a specialized method.

@Sacha0
Member

Sacha0 commented Dec 3, 2015

More concretely, the alternative approach --- (\)(A::AbstractTriangular, B::AbstractMatrix) = A \ convert(Array, B) --- suffers equivalent, and in some cases perhaps more catastrophic, performance pitfalls: what if B is large and sparse or distributed? In some sense both approaches are generic, with all accompanying ups and downs, but rely on different mechanisms --- one relies on the AbstractArray interface, and the other on [convert() + dense methods]; relying on the AbstractArray interface ultimately seems like the better thing to do. So, given both approaches present the same kinds of potential hazards, why not choose the approach that otherwise presents significant advantages, seems like 'the right thing', and may offer / be commensurate with paths for improvement (rather than being a dead end) going forward?

I failed to make clear a subtle but compelling point here, so explicitly:

One might argue that in calling, for example, (::AbstractTriangular) \ (::SparseVector), the dense-fallback approach is preferable to the signature-abstraction approach for the following reason: Both the dense-fallback and signature-abstraction approaches will densify the sparse object, but the dense-fallback approach densifies the object in one shot and then provides fast indexing operations, whereas the signature-abstraction approach may involve many successive reallocations and provides slower indexing operations.

But this would be giving you (those who wrote the methods for generic numerical linear algebra, sparse matrices, and sparse vectors) too little credit. You did a really great job with those methods --- perhaps better than you realize: Whereas the dense-fallback approach necessarily destroys all structure, the signature-abstraction approach is already automatically structure-preserving in important cases, a tremendous advantage:

Consider calling A_ldiv_B!(A::Union{Bidiagonal, AbstractTriangular}, B::AbstractMatrix) on (A::LowerTriangular, B::SparseMatrixCSC) or (A::LowerTriangular, B::SparseVector). Suppose A is fully dense, but the nonzero elements of the SparseMatrixCSC or SparseVector are confined to rows >= some k which is close to size(B, 1) = M. In such cases, in principle the A_ldiv_B! only introduces fill-in in rows >= k. Because the existing generic methods rely only on the AbstractArray interface, and assignment of zero to an unallocated element of a SparseMatrixCSC or SparseVector is implemented as a no-op, what should happen in principle is what actually happens, structurally speaking:

julia> m = 100; ltdensemat = LowerTriangular(eye(m) + randn(m, m)/(2*m));

julia> spvec = sparsevec([m], [1.0])
Sparse vector of length 100, with 1 Float64 nonzero entries:
  [100]  =  1.0


julia> A_ldiv_B!(ltdensemat, spvec)
Sparse vector of length 100, with 1 Float64 nonzero entries:
  [100]  =  0.990708


julia> cscvec = sparse([m], [1], [1.0])
100x1 sparse matrix with 1 Float64 entries:
    [100,   1]  =  1.0

julia> A_ldiv_B!(ltdensemat, cscvec)
100x1 sparse matrix with 1 Float64 entries:
    [100,   1]  =  0.990708

julia> cscmat = sparse(m*ones(Int, 3), [1:3], 1.0)
100x3 sparse matrix with 3 Float64 entries:
    [100,   1]  =  1.0
    [100,   2]  =  1.0
    [100,   3]  =  1.0

julia> A_ldiv_B!(ltdensemat, cscmat)
100x3 sparse matrix with 3 Float64 entries:
    [100,   1]  =  0.994819
    [100,   2]  =  0.994819
    [100,   3]  =  0.994819

That's fantastic. Nicely done :).

@Sacha0
Member

Sacha0 commented Dec 3, 2015

I can support (1), (2), and (4), but I consider (2) the most time critical because I suspect that most users will be affected. I'm more sceptical about (3). We generally move away from warnings. One of the reasons is that, right now, they cannot be caught so you cannot get rid of them even when you know what you are doing. It might be better to wait and see. Julia users are enthusiastic profilers so if a common method shows up as very slow, we'll see an issue and then we can add a specialized method.

Sounds good! I will prioritize (2) followed by (1), toss out (3), and keep an eye on #271 for (4). Let me know if I have missed any significant reading as well (I'm sure there's something!) Best!

@andreasnoack
Member

I like that, in general, you can rely on the ! versions not promoting and allocating. Then you can call them when you know some structure and know what you are doing. However, my view is that sparse results from solves are rare and the flip side of the above is

julia> n = 500;

julia> A = randn(n, n)|> t -> t't;

julia> F = cholfact(A);

julia> b = sparsevec([n], [1.0], n);

julia> @time A_ldiv_B!(F[:U], copy(b));
  0.011692 seconds (29 allocations: 17.031 KB)

julia> @time A_ldiv_B!(F[:U], full(b));
  0.000364 seconds (10 allocations: 4.344 KB)

julia> 0.011692/0.000364
32.12087912087912

so for \ I think the best default is to convert to dense, but to leave A_ldiv_B! as it is, such that you can get a sparse result if you want.

@Sacha0
Member

Sacha0 commented Dec 3, 2015

Cheers! With \ we can (just about) realize the best of both worlds:

function foo_A_ldiv_B(A::LowerTriangular, B::SparseVector)
    A_ldiv_B!(A, full(B))
end

function bar_A_ldiv_B(A::LowerTriangular, B::SparseVector)
    nzrange = B.nzind[1]:B.n
    nzrangesubA = LowerTriangular(sub(A.data, nzrange, nzrange))
    SparseVector(B.n, collect(nzrange), A_ldiv_B!(nzrangesubA, full(B[nzrange])))
end

# warmup
wum = 10
wumat = LowerTriangular(rand(wum, wum))
wuvec = sparsevec([1], [1.0], wum)
foo_A_ldiv_B(wumat, wuvec);
bar_A_ldiv_B(wumat, wuvec);

m = 10000;
ltdensemat = LowerTriangular(eye(m) + randn(m, m)/(2*m));

diabolical = sparsevec([1], [1.0], m);
print("diabolical/foo: "); @time foo_A_ldiv_B(ltdensemat, diabolical);
print("diabolical/bar: "); @time bar_A_ldiv_B(ltdensemat, diabolical);

intermediate = sparsevec([Int(m/2)], [1.0], m);
print("intermediate/foo: "); @time foo_A_ldiv_B(ltdensemat, intermediate);
print("intermediate/bar: "); @time bar_A_ldiv_B(ltdensemat, intermediate);

wonderful = sparsevec([m], [1.0], m);
print("wonderful/foo: "); @time foo_A_ldiv_B(ltdensemat, wonderful);
print("wonderful/bar: "); @time bar_A_ldiv_B(ltdensemat, wonderful);

This yields:

diabolical/foo:   0.076005 seconds (7 allocations: 78.422 KB)
diabolical/bar:   0.076405 seconds (16 allocations: 156.953 KB)
intermediate/foo:   0.077598 seconds (7 allocations: 78.422 KB)
intermediate/bar:   0.018971 seconds (17 allocations: 78.875 KB)
wonderful/foo:   0.076400 seconds (7 allocations: 78.422 KB)
wonderful/bar:   0.000010 seconds (15 allocations: 768 bytes)

Writing this:

function baz_A_ldiv_B(A::LowerTriangular, B::SparseVector)
    nzrange = B.nzind[1]:B.n
    nzrangesubB = sub(B, nzrange)
    nzrangesubA = sub(A, nzrange, nzrange)
    SparseVector(B.n, collect(nzrange), A_ldiv_B!(nzrangesubA, full(nzrangesubB)))
end

seems cleaner and more consistent than bar_A_ldiv_B (and would eliminate the unnecessary copy I imagine is involved in full(B[nzrange])?), but this fails for two reasons: (1) sub(Triangular) presently returns SubArray{Triangular{Data}} rather than Triangular{SubArray{Data}} (new issue?), for which no A_ldiv_B! method exists; and (2) full(sub(SparseVector)) falls back to full(AbstractArray) (new issue?) which, being a no-op, returns SubArray{SparseVector}, on which A_ldiv_B! again fails. Best!

Edit: I should add that I see sparse results from solves not irregularly in my (admittedly niche) work, though specifically with sparse LHSs.

@andreasnoack
Member

Good observation. I'm still sceptical about returning a sparse vector from the solve as the general solution. In many cases that would be a waste. However, since A_ldiv_B!(XTriangular, SparseVec) will be allocating anyway through the allocations in setindex!, it might be worth implementing your version for that method. The general \ should instead do the conversion upfront, or something like \(A::XTriangular, b::SparseVec)=A_ldiv_B!(A,full(b)) such that b is not copied after the conversion.

It wouldn't be good to have sub(XTriangular) -> XTriangular even when the result is a triangular matrix, because then the return type would depend on the values of the ranges. The output type should only depend on the input types, not values.
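
To illustrate the concern (0.4-era sub; both calls below have identical argument types):

A = LowerTriangular(rand(4, 4))
sub(A, 1:2, 1:2)  # on-diagonal block: happens to be lower triangular in value
sub(A, 1:2, 3:4)  # off-diagonal block: not triangular at all
# A rule sub(::LowerTriangular, ...) -> LowerTriangular would therefore have to
# depend on the index values, which is exactly the instability described above.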

@tkelman

tkelman commented Dec 3, 2015

A more specialized alternative to sub (with a different name, subsquare or something) where all index inputs are restricted to be the same would potentially be able to preserve some kinds of structural information that the general sub can't?
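
For instance, a minimal sketch (the name subsquare and these definitions are hypothetical):

# Restricting both index arguments to the same range lets the structural
# annotation be preserved on the resulting view.
subsquare(A::LowerTriangular, r::UnitRange) = LowerTriangular(sub(A.data, r, r))
subsquare(A::UpperTriangular, r::UnitRange) = UpperTriangular(sub(A.data, r, r))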

@andreasnoack
Member

That's true and in theory it could work with all the special matrix types. It would already work for XTriangular, Hermitian/Symmetric, but XDiagonal would have to be parametric on the buffer.

@Sacha0
Member

Sacha0 commented Dec 3, 2015

It wouldn't be good to have sub(XTriangular) -> XTriangular even when the result is a triangular matrix because then return type would depend on the values of the ranges. The output type should only depend on the input types, not values.

Good catch! I missed the potential type instability. So the appropriate way to express my intent was indeed LowerTriangular(sub(A.data, nzrange, nzrange)), or is there a better way?

(2) full(sub(SparseVector)) falls back to full(AbstractArray) (new issue?) which, being a no-op, returns SubArray{SparseVector}

Did I miss something here as well, or is this one an actual issue? Best!

@Sacha0
Member

Sacha0 commented Dec 3, 2015

A more specialized alternative to sub (with a different name, subsquare or something) where all index inputs are restricted to be the same would potentially be able to preserve some kinds of structural information that the general sub can't?

That's true and in theory it could work with all the special matrix types. It would already work for XTriangular, Hermitian/Symmetric, but XDiagonal would have to be parametric on the buffer.

I like this. Do I understand XDiagonal correctly as Diagonal, Bidiagonal, Tridiagonal, and SymTridiagonal --- the special matrices that are separate storage classes rather than simply annotations of other storage classes' contents?

@tkelman

tkelman commented Dec 3, 2015

sub on a SparseVector does sound like a not-implemented-yet issue. Worth tracking.

@Sacha0
Member

Sacha0 commented Dec 3, 2015

However, since A_ldiv_B!(XTriangular, SparseVec) will be allocating anyway through the allocations in setindex!, it might be worth implementing your version for that method.

Good call --- something similar to bar_A_ldiv_B above for A_ldiv_B!(Triangular, SparseVec) seems strictly advantageous.

The general \ should instead do the conversion upfront, or something like \(A::XTriangular, b::SparseVec)=A_ldiv_B!(A,full(b)) such that b is not copied after the conversion.

I must be missing something here. In foo_A_ldiv_B allocation occurs only in full(b) as you describe, but the hypothetical situation in baz_A_ldiv_B seems substantially identical: equivalent allocation to full(b) occurs with full(nzrangesubB) (potentially less) and collect(nzrange), and insofar as I can tell no other allocation occurs (excluding the negligible allocation involved in the SubArray and SparseVector construction calls)? What am I missing?

@andreasnoack
Member

The problem is that the returned vector is a SparseVector even though it is probably dense and therefore twice as large as necessary. Maybe we can get the best of the two solutions with something like

function \(A::LowerTriangular, B::SparseVector)
    nzrange = B.nzind[1]:B.n
    nzrangesubA = LowerTriangular(sub(A.data, nzrange, nzrange))
    Bf = full(B)
    A_ldiv_B!(nzrangesubA, sub(Bf, nzrange))
    return Bf
end

@Sacha0
Member

Sacha0 commented Dec 4, 2015

Cheers, that looks great --- I will run with that for \ and A_ldiv_B equivalents. Best!

@Sacha0
Member

Sacha0 commented Dec 4, 2015

To check my inferred understanding of the design philosophy (so hopefully I can begin answering my own questions :) ):

If a user calls (\)(A::XTriangular{Tmat}, B::SparseVecOrMat) where Tmat does not subtype AbstractSparseMatrix, most likely the user is either not cognizant of sparsity or not concerned with preserving it in that operation, and in any case the user lacks a reasonable expectation of sparsity preservation --- otherwise the user would have had Tmat subtype AbstractSparseMatrix somehow. Hence in this case it is reasonable to [perform any operation and return any object] commensurate with [A being fully populated in the appropriate triangular half and the result potentially being completely dense]. Thus your version of \ two comments above (@andreasnoack), which trades all structure for speed, is almost always the best thing to do in this first case.

If a user calls A_ldiv_B!(A::XTriangular{Tmat}, B::SparseVecOrMat) where Tmat does not subtype AbstractSparseMatrix, the situation is as above --- except that the user reasonably expects preservation of B's container/type. Hence your suggestion (@andreasnoack),

However, since A_ldiv_B!(XTriangular, SparseVec) will be allocating anyway through the allocations in setindex!, it might be worth implementing your version for that method.

which potentially trades some structure for speed, is almost always the best thing to do in this second case.

Finally, if the user calls (\)(A::XTriangular{Tmat}, B::SparseVecOrMat) or A_ldiv_B!(A::XTriangular{Tmat}, B::SparseVecOrMat) where Tmat does subtype AbstractSparseMatrix, most likely the user is both cognizant of and concerned with preserving sparsity in that operation. Hence preserving structure and avoiding spurious allocation are paramount. So full-blown sparse-sparse triangular solves would be ideal, but in their absence falling back to the existing generic method, which trades all speed to preserve structure and avoid spurious allocation, is most likely the best thing to do in this third case.

Does this capture your mental model? Thanks, and best!

@andreasnoack
Member

I'm a bit in doubt about the last case. I fear that many users will unexpectedly experience slow solves. Furthermore, users who know that they want a sparse result are probably more experienced and can maybe figure out that they could use A_ldiv_B!. It's actually "obvious" that A_ldiv_B! will work because the type of the B matrix cannot change during the computation, since it's an in-place operation.

@andreasnoack
Member

@yuyichao The example case you posted seems to be fixed by @Sacha0's PRs. Is the real downstream issue also fixed?

@yuyichao
Author

yuyichao commented Feb 4, 2016

Seems to be fixed according to Travis.
