Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How best to fix similar for triangular matrices? #271

Closed
artkuo opened this issue Oct 22, 2015 · 39 comments
Closed

How best to fix similar for triangular matrices? #271

artkuo opened this issue Oct 22, 2015 · 39 comments
Assignees
Labels
needs decision A decision on this change is needed

Comments

@artkuo
Copy link

artkuo commented Oct 22, 2015

In response to #258, the bug was traced to mean which relies on similar (through reducedim) to prepare an array for its output, e.g. mean of a 4x3 matrix along dim 1 should yield a 1x3. It ends up calling something like similar(triangularmatrix, LowerTriangular, (1,3)) (the template matrix, the desired type, and the dimensions), which fails because a 1x3 triangular matrix doesn't make sense. In this context, similar means "give me a matrix with the same eltype" not "give me another triangular just like this one."

  1. An easy fix (@tkelman) is to adjust similar so that when optional dimensions are used, it automatically drops the triangular, which results in AbstractMatrix. When no dimension is given, it could yield a lower triangular. This actually works, and even preserves sparsity automatically. I find this a bit yucky, because the return type depends implicitly on how the call is made.
  2. Another fix is to change similar to always drop the triangular, again preserving sparsity where applicable. I've tried this fix and it doesn't break any existing tests (other than needing a new test for similar). Downside is it is a potentially breaking change for user code where similar is intended to mean "give me another of same structure." Yucky as well.
  3. Yet another possibility is to delve into the details of how similar is called and root out the problem at the level of reducedim. Where appropriate, it could be replaced by sameeltype, meaning a more explicit version of what is meant already. In other cases, the replacement might be samecontainer or samestructure or some such. Idea is to make explicit what you really want. This seems like a longer term solution, because similar occurs 600+ times in julia project (between method calls and definitions) and it may not be straightforward to know what is intended when.

Previous discussions have complained about similar JuliaLang/julia#11574 (comment) and JuliaLang/julia#10064. The problem is that similar is vague. Present docs say that behavior for special types should be decided case by case. Right now similar for triangulars is undocumented, so the user must guess or look at the source code. Might be better to clarify and perhaps split off a sameeltype or samestructure in the long run.

There are safe, conservative ways to do option 2, such as when triangular is input, to return a matrix that's not triangular but whose off-triangle has been zeroed. This makes it safe for the user expecting a triangular back and for the most part won't blow up on them. (Triangulars store a full matrix in their .data field, so there is potential danger if things are not zeroed.)

What is the right solution?

@kshyatt
Copy link
Contributor

kshyatt commented Nov 25, 2015

No one responded to this issue - @andreasnoack, @jiahao, @tkelman I'm pinging the LinAlg crew. Is this still a problem? What do we want to do?

@tkelman
Copy link

tkelman commented Nov 25, 2015

I prefer options 1 or 3 here.

@kshyatt
Copy link
Contributor

kshyatt commented Nov 25, 2015

BTW I would also like to say that this is a very nice issue with a well stated definition of the problem & various solutions. Nice work, @artkuo.

@andreasnoack
Copy link
Member

I would like to postpone the introduction of a new method until we have better evidence for the need so I'll vote for 1 or 2. To make progress, I'll propose that we try out 1 and see how it goes. @artkuo Do you want to try prepare a pr?

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

Thanks for the replies, but the more I delve into it, the less I like option 1. It is a temporary bandaid, and doesn't address a fairly deep problem, which is that sometimes similar is used to allocate memory, say to hold the results of a BLAS call, and for some structured matrices the similar is not fully mutable. Meanwhile, few people seem be reporting bugs now, perhaps because the structured matrices are only used in particularly safe situations or infrequently. So I'd rather work on my solution, which is closer to option 3. I am slowly working towards a PR, which will probably be quite unpopular but at least give an idea of how wide-ranging the issue might be.

So for now I'd like to put this on hold. I'd like to keep the issue open just as a benchmark for when the real problem is fixed, hopefully in a more meaningful way than option 1.

@andreasnoack
Copy link
Member

I think the problem is quite limited and therefore that it's best to refrain from adding complexity by introducing a new generic function. Could you provide more examples for when similar causes problems right now? I don't think anything in base relies on similar(XTriangular) being an XTriangular. Do you know of such examples?

@tkelman
Copy link

tkelman commented Nov 25, 2015

I agree with the opinion that similar is underspecified and has conflicting uses, and choosing the type it should return is all too often a subjective judgement call or dependent on how many arguments it's called with. For example the case of what to return for similar(::SharedArray) has gone back and forth, ref JuliaLang/julia#5893 and JuliaLang/julia#12964, and I think either choice has left multiple people unsettled and disagreeing. That's a pretty good sign that something is off with the abstraction.

@nalimilan
Copy link
Member

I agree with @tkelman. Even if in practice it more or less works, we'd better clarify what similar means. (In most cases, I suspect it's used to create an Array of the same eltype and (sometimes) size. Everything would be clearer if a separate function was used for that.)

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

@andreasnoack As an example where a mutable similar would be great, take A*B for both A and B AbstractMatrix. Right now the structured matrices try to code for every single combination, and there are quite a few missing ones. Ideally there would be a fall-back function (*){T,S}(A::AbstractMatrix{T}, B::AbstractMatrix{S}). I do have one waiting in the wings. It's 4 lines of code that will safely produce the desired result if no specific method exists, but it only if there is a proper similar (not option 1).

There is no case I could find in base where you need similar(XTriangular) to be XTriangular. The only fear is that user code expects it, which @tkelman pointed out in the bug report that led to this issue. However, I should point out that the current documentation says that similar should produce a mutable array. It also says that custom types can choose what they want to do, with default to be mutable. The current XTriangular behavior is not mutable, and is not documented as so.

If you believe that nobody should be relying on undocumented behavior, then the safer thing to do is to make it mutable across the board (option 2). I also think that the most basic and necessary closed algebra for triangulars is already implemented in triangular.jl, and does so without itself using similar. But elsewhere in Base, there is code that expects similar to return mutable, and will not work correctly with some structured matrices.

My draconian self prefers option 2, which instantly facilitates the matrix multiplication fix, is consistent with documentation, and is reliable behavior that most people should expect. I have a couple dozen examples of these and other operations between AbstractMatrix and/or AbstractVector that have no fall-back, or a fall-back that breaks because of similar (or copy) not being mutable. If you like, I can file a PR for my multiplication fix, which is extremely simple other than including option 1. I wanted to start this issue first, because I felt like this is an executive decision above my pay grade. (I also don't want to file a PR unless there is >10% chance of being accepted.)

@tkelman
Copy link

tkelman commented Nov 25, 2015

I think the documentation for similar used a poor choice of words, since mutable in the sense of Julia types is not what matters here. Triangular is an immutable-type wrapper around a mutable array, but it is writable and modifiable according to the abstraction of mutable array storage - but with additional structural information. The type system exists to provide additional information about objects, and throwing away that structural information in similar strikes me as absolutely the wrong thing to do. If you always want a dense Array without taking into account any structural properties, use the Array constructor. similar should preserve type-level structural information wherever it can.

@nalimilan
Copy link
Member

If you always want a dense Array without taking into account any structural properties, use the Array constructor. similar should preserve type-level structural information wherever it can.

We could add a method like Array(a::AbstractArray, T=eltype(a), dims=size(a)) = Array{T}(dims), deprecate similar in favor of it, and add a new stricter version of similar always returning an AbstractArray of the same type and structure.

@andreasnoack
Copy link
Member

@artkuo I think it makes sense to always return a (completely) mutable array determined by the wrapped array, i.e. option 2, but I also wanted to close #258 and suggested option 1 as a compromise with little harm done. #258 is really a tiny issue and we'll probably never be able to completely avoid issues like that. I don't really want to defend option 1, but wouldn't your matrix multiply fix use a dimension argument such that option 1 would actually work?

@tkelman @nalimilan Array wouldn't be generic enough for e.g. XTriangular{DArray}. I'm not sure that throwing away the wrapper type is such a disaster. The place where I expected similar(XTriangular)=XTriangular to be useful was in linalg/triangular.jl and I realized that I didn't even use the method there. When do you think it would be useful? As things are now, you can't use a generic setindex! strategy because the structurally zeroed part of the matrix will give you errors.

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

Okay I have to admit the multiplication fix does indeed still work with option 1. But option 1 is really horrible, because there's a discrete switch in what you get back depending on how many arguments there are. I hope this would be a temporary thing, deprecated with a warning?

Either way it seems like the long-term solution would be a separation of two needs: (1) A reliable, easy way to allocate a fully writable array; (2) A way to allocate another thing exactly like a structured matrix but not necessarily writable. Either one of these could have a different name, and similar could be shifted to be more consistent either direction. In Base, I think writable allocation is more typical, and if that were done with Array it would take some careful combing through to switch that over.

I mostly like the Array solution, except for this: similar does work well with SparseMatrixCSC, because those are fully writable. Array would return a dense instead of a sparse, when what you want is indeed a similar in the classic sense. The odd man out really is triangular.

@tkelman
Copy link

tkelman commented Nov 25, 2015

Triangular is writable, subject to structural constraints. Diagonal, Bidiagonal, Tridiagonal etc are going to have the same issue. If we had a type for structurally-fixed sparse matrices that would be the same, allowing modification to the nonzero values but not introducing new ones.

It depends what the result of similar is going to be used for. Is the contract for similar "the entire output is writable," or "the structurally relevant part of the output is writable" ? We don't have a generic way of expressing what "the structurally relevant part" means for any type yet, but we do need such an API and I plan on creating it. Generic code could operate over only the meaningful portions of a structured array, which would not produce the same result if similar was throwing away relevant structural type information. Some of this use case might be better served by copy, but I still see no advantage to going down the option 2 "remove useful information" approach.

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

I believe that most cases in Base assume that the entire output is writable. The specific algebra is implemented within each structure, generally allocating a new structure without hoping that similar will do that. I understand why there is objection to option 2, but I also don't see anything that would be broken as a consequence. (I've tried out option 2 and it passes all tests, and fixes several bugs, not that this constitutes a proof.) Perhaps @tkelman is cooking up something that is really going to leverage a structural similar, which would lead me to ask whether that same thing could use a similarstructure that gives a better guarantee of what it returns.

By the way, there is almost the same issue with copy, where for example it is used to do a BLAS call to provide both an input and a writable output, but then failing as an output. (I have a couple dozen examples of matrix bugs, more of which are due to copy than similar.) Also I don't think it is good to redirect to copy (if it worked properly), since similar should presumably be faster by only having to allocate memory.

@tkelman
Copy link

tkelman commented Nov 25, 2015

gives a better guarantee of what it returns

The underlying issue is that similar itself gives no guarantee of what it returns, it's a case-by-case judgement call which is a poor foundation to build indexing and other infrastructure upon.

I would ask the same question regarding what should similar(::Diagonal) return? Returning a Diagonal is not "entirely writable" but unlike for Triangular (or Symmetric which has not been mentioned here yet) the storage type that it wraps is lower-order in terms of memory use than what a "fully writable" array of the same dimensions would require. If I ask for similar of a 10^5 x 10^5 Diagonal matrix should that be expected to cause an OOM error?

Base doesn't really exercise the structured array types very well, aside from wrapping a few factorizations where they mostly don't interact with other code in that many ways. I'm less concerned with whether this would break code than if it would be the right abstraction design.

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

similar(::Diagonal) should return an array; make it sparse if you wish. I don't think memory is a good argument because you can always run out of it when you're on your last byte.

Indeed, the issue is right abstraction design. I would argue that both are "correct": similar allocates writable memory, and similar preserves structure. There may well be need for both, but shouldn't these have different names, and if so, which way should similar go rather than straddling?

@tkelman
Copy link

tkelman commented Nov 25, 2015

Memory, and more generally algorithmic efficiency, is much of the point of using structured array types. Triangular and Symmetric are the unusual cases for that since we're not using packed storage.

We could pick separate names, neither of which reuse the existing similar which we could deprecate. I think storage has been suggested before as a potentially better name.

@andreasnoack
Copy link
Member

If there are no (or very few) use cases for preserving the structure, wouldn't it be simpler to adjust the few similar(StructuredMatrix) methods to return a completely writable array? If, in the future, we are able to exploit the structure better, then we can just introduce a new method. Right now it seems as an overly complicated fix for a small issue.

@tkelman
Copy link

tkelman commented Nov 25, 2015

I don't like that plan to throw out information and harm performance because we aren't very good yet at utilizing structural information. Structure is a complicated problem that isn't going to have a simple solution, and it deserves thinking carefully about rather than punting and doing the naive thing because it's too hard. Designing base to only be good at dealing with dense arrays is a really bad long-term plan.

@nalimilan
Copy link
Member

I just thought about another use case which introduces another interesting way of looking at this: NamedArrays and AxisArrays. Here, the question is not only whether the array is writable or not, but also whether to preserve information about dimensions. Clearly, similar should preserve this information when creating an array with the same size, but not when creating an array of a different size. This is actually akin to what should happen with structured matrices: you can request a different element type, but not a different size -- at least if you want an object of the same type.

I find this example useful because NamedArrays and AxisArrays are in-place alternatives to Array in many situations, contrary to structured matrices which cannot be written to everywhere. Therefore, they are a typical case where a well-designed similar would make generic programming shine.

So I can see at least three four different needs:

  1. get an Array of the same eltype or size: this is purely a convenience function which could be merged into Array()
  2. get an array of the same type if possible, but with a different size: for most types, an Array will be returned, losing the structure and dimension names; for sparse arrays, a sparse array is returned with no stored entries.
  3. get an array of exactly the same size, possibly with a different element type, but writable everywhere: here we can preserve dimension names, but the underlying structure is an Array (wrapped in a NamedArray or AxisArray)
  4. get an array of exactly the same size, possibly with a different element type, and writable only for stored elements of the original array: here we can preserve the structure and dimension names

The question is, does this cover all cases? Are there situations where you want an array of a different size, but which isn't a standard Array? For example, does this apply to ShareArrays? I would say it doesn't, and anyway this function could be included later if it's really needed in some rare situations.

@tkelman
Copy link

tkelman commented Nov 25, 2015

Are there situations where you want an array of a different size, but which isn't a standard Array?

You can certainly ask for a SparseMatrixCSC of any size.

@andreasnoack
Copy link
Member

Designing base to only be good at dealing with dense arrays is a really bad long-term plan.

I don't see how that can be a fair conclusion to draw from my comment.

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

| wouldn't it be simpler to adjust the few similar(StructuredMatrix) methods to return a completely writable array?

@andreasnoack Indeed that's option 2, which would fix quite a few existing bugs and is not known to break any pre-existing code.

@tkelman Do you have any specific plans that will really benefit from structured similar, and does it have to be called similar? A longer term, smarter structure in Base need not conflict with making similar consistent now.

Also, my version of option 2 is not dense, merely unstructured; it still preserves sparsity. A similar(::UpperTriangular(sparsematrix)) would just yield a sparse with as many filled entries as its source.

And along those lines, @nalimilan I agree with part of what you say, but I wouldn't stick to Array because a sparse should have a sparse similar. I would just drop the structure.

@nalimilan
Copy link
Member

You can certainly ask for a SparseMatrixCSC of any size.

@tkelman @artkuo Right, I forgot about this intermediate situation where the structure of stored elements can be modified. I've added it as an additional case (2).

@tkelman
Copy link

tkelman commented Nov 25, 2015

I don't see how that can be a fair conclusion to draw from my comment.

It was in reaction to "If there are no (or very few) use cases for preserving the structure" - my point is I don't think we should be designing abstractions today that are only good at dealing with use cases that have been written right now in Base or registered packages.

Do you have any specific plans that will really benefit from structured similar, and does it have to be called similar?

Yes, my plans are roughly that for every current case of structured matrix types, I'd like to introduce an abstraction for iterating and operating over the stored elements that generalizes the current per-type custom loop implementations. The "uninitialized structural copy" operation for this does not have to be called similar, but I still think similar is a poor name and an underspecified operation even if it gets changed to more consistently mean "fully writable storage." The current uses of similar should carefully be looked at to determine exactly what is meant as far as what abstraction they're really relying on.

@nalimilan
Copy link
Member

That said, I'm not sure there are many use cases for writing generic code that creates a new sparse array with a different size than the original. And as regards the particular situation where you create a new array with some rows or columns added (or removed) compared to the original, another strategy will have to be found if we want such code to preserve dimension names for NamedArrays and AxisArrays. For that case, a function taking the original array and the indices of the dimensions where to add or remove rows/columns would be needed; it would also allow preserving the structure of sparse matrices for the parts that are common with the original array.

@artkuo
Copy link
Author

artkuo commented Nov 25, 2015

Thanks, this really helps. It sounds like the four cases of @naliman can be addressed with one extension to Array, and two or three new methods that in the long run should not be called similar. In the short term, I am tempted to follow a course that will be mostly the original option 2, perhaps with some of the others sprinkled in. The understanding is that it's still important to root out what was meant in the various similar uses in Base and perhaps clarify what was intended.

Also I would amend @naliman's case 3 to include different sizes as well, as is needed by mean, although that is conceivably covered by case 1. Perhaps one should be able to take a mean(::SparseMatrixCSC, 1) and get a 1-dimensional sparse matrix (not Array) of the means down the columns?

Probably hard to resolve new method names now, but I kind of like allocatewritable and allocatestructured, which is more direct about why you wish for a similar. Also a verb like allocate could be more parallel to verb copy which could also have writable and structured variants. The present similar is an adjective while copy is a verb or noun.

@nalimilan
Copy link
Member

Also I would amend @naliman's case 3 to include different sizes as well, as is needed by mean, although that is conceivably covered by case 1. Perhaps one should be able to take a mean(::SparseMatrixCSC, 1) and get a 1-dimensional sparse matrix (not Array) of the means down the columns?

@artkuo It doesn't sound likely to me that one would want to get a sparse matrix when computing reductions like mean over dimensions of a sparse array. At least, it would seem at least as common to want a standard Array, since most of the time rows/columns contain at least one stored element. And a function cannot choose which one is the most appropriate just from the input type.

@tkelman
Copy link

tkelman commented Nov 26, 2015

Right, that's called "hypersparse" when the dimensions are larger than the number of nonzeros, and csc isn't really the best format for it. The algorithms there are a bit different.

@Sacha0
Copy link
Member

Sacha0 commented Dec 22, 2016

Unlikely to receive attention prior to 0.6. Best!

@StefanKarpinski
Copy link
Member

Is this also unlikely to receive attention prior to 1.0?

@mbauman
Copy link
Member

mbauman commented Oct 12, 2017

IMO: the long-term solution is a new allocate function that uses storage-style traits to be much more specific about the kind of array you want to receive. This would be so radically different from similar that I think it deserves a new name, and thus would be a new feature. The only place this could be breaking is that the base library may shift away from calling similar to calling this new, more specific function… but even there I think we could structure things so that it fell back to using similar.

@StefanKarpinski
Copy link
Member

That sounds like we could take it off the 1.0 milestone?

@Sacha0
Copy link
Member

Sacha0 commented Oct 15, 2017

JuliaLang/julia#15198 implemented the OP's first proposal for banded (Diagonal, [Bi|[Sym]Tri]diagonal) and wrapped ([Unit](Upper|Lower)Triangular, Symmetric/Hermitian) matrices. That is, similar(A, T) now preserves structure/wrapping, while similar(A, T, shape) does not. Specifically, JuliaLang/julia#15198 (inadvertently?) made shape-receiving similar return Arrays independent of, for example, a wrapped type's storage type .

So I see three potential actions remaining for this issue: (1) Restore preservation of storage type of wrapped types when similar receives a shape argument (such that, for example, similar(UpperTriangular(sprand(4, 4, 0.5)), Float32, (3, 3)) yields a SparseMatrixCSC rather than a Matrix). (2) Consider yielding a sparse rather than dense array when similar receives a structured matrix and a shape (such that, for example, similar(Diagonal(rand(4)), Float32, (3, 3)) yields a SparseMatrixCSC rather than a Matrix), which would be consistent with e.g. broadcast. (3) The similar disambiguation / replacement that @mbauman mentions and JuliaLang/julia#18161 currently tracks.

Proposal: Complete (1) and (2), then close this issue in favor of JuliaLang/julia#18161. Thoughts? Best!

@Sacha0
Copy link
Member

Sacha0 commented Oct 16, 2017

(Related: Should similar(A::Union{Symmetric,Hermitian}[, neweltype]) preserve A's uplo value? Perhaps so.)

@Sacha0
Copy link
Member

Sacha0 commented Oct 16, 2017

(#24162 and JuliaLang/julia#24163 together cover (1) and uplo preservation.)

@Sacha0
Copy link
Member

Sacha0 commented Oct 16, 2017

(#24170 covers (2).)

@Sacha0
Copy link
Member

Sacha0 commented Oct 20, 2017

With items (1) and (2) from #271 complete, closing in favor of JuliaLang/julia#18161 for a grander overhaul of similar. Thanks all!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs decision A decision on this change is needed
Projects
None yet
Development

No branches or pull requests

8 participants