Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Triangular matrix must be square" error for square triangular matrices #258

Closed
diegozea opened this issue Sep 16, 2015 · 12 comments
Closed

Comments

@diegozea
Copy link

julia> list_tril
3x3 Array{Int64,2}:
  0  0  0
  1  0  0
 -2  3  0

julia> cor(LowerTriangular(list_tril))
ERROR: ArgumentError: Triangular matrix must be square
 in similar at linalg/triangular.jl:27
 in cor at statistics.jl:430
 in eval_user_input at REPL.jl:64
 [inlined code] from REPL.jl:93
 in anonymous at task.jl:68

julia> cor(list_tril)
3x3 Array{Float64,2}:
   1.0        -0.944911  NaN  
  -0.944911    1.0       NaN  
 NaN         NaN           1.0

@andreasnoack
Copy link
Member

I'm not sure it makes much sense to calculate the correlation matrix from a triangular matrix. Do you have a real usecase for this?

@artkuo
Copy link

artkuo commented Oct 20, 2015

Regardless of the use case, this is a bug. What happens is that cor needs the mean of the matrix, which is supposed to be a one-row matrix. But mean asks for a similar, and Triangular matrices seem to interpret similar to mean give me the same kind of matrix. This doesn't seem consistent with the spirit of the documentation:

similar(array, [element_type=eltype(array)], [dims=size(array)])
Create an uninitialized mutable array with the given element type and size, based upon the given source array. The second and third arguments are both optional, defaulting to the given array's eltype and size. The dimensions may be specified either as a single tuple argument or as a series of integer arguments.

But going on further, there is wiggle room:

Custom AbstractArray subtypes may choose which specific array type is best-suited to return for the given element type and dimensionality. If they do not specialize this method, the default is an Array(element_type, dims...).

It seems that similar is really meant to give you an array of the same eltype, but with arbitrary dimensions dims. Those dimensions almost certainly imply you don't want another triangular matrix, you just want a matrix, say of Float64s.

One solution is to go with the default documentation, and drop back to Array. But it would help to know whether similar is ever used to mean "give me another triangular matrix."

BTW, same issue should affect Diagonal and perhaps others.

@tkelman
Copy link

tkelman commented Oct 20, 2015

But it would help to know whether similar is ever used to mean "give me another triangular matrix."

Yes, in quite a few indexing operations now due to the fallbacks in JuliaLang/julia#10525. What similar should do is currently determined on a case by case basis. When called with the same dimensions as the original array, there's a lot to be said for maintaining the original structural properties and avoiding falling back to dense behaviors when that would be a performance disaster for sparse, or diagonal, or narrow-banded matrices.

I think one way out here would be finding and tweaking the uses of similar that result in different dimensionalities than the input array. mean along a dimension can't preserve structural properties for many specialized array types, so it should probably not be using similar for those cases.

@artkuo
Copy link

artkuo commented Oct 21, 2015

This problem seems fairly compartmentalized to triangular and diagonal matrices, and not much else. The similar issue has indeed been discussed in JuliaLang/julia#10525, as well as JuliaLang/julia#11574. similar is used in a variety of base functions like mean and cor through things like Base.reduced_dims, and usually means "give me a matrix with this eltype but these dimensions," i.e. it's usually not asking for another triangular. In fact, the closed operations for triangulars such as + are defined quite explicitly and safely without similar.

In the earlier discussions, it's clear that similar is a bit of a misnomer. Probably it will continue to refer to the eltype, and if someone really wanted the same kind of structured array, the solution might be something like samekind() or justlikethisbutnotacopy(), without option for dimensions.

Back to the bug. The fix appears to be pretty simple and have few effects on performance. It also seems that triangulars don't completely leverage the advantages they could. I'll see if I can put together a pull request.

@tkelman
Copy link

tkelman commented Oct 21, 2015

The counterexample to that, as usual, is sparse matrices. Triangular is an immutable wrapper around any type of AbstractMatrix. Promoting to dense is not a safe thing to do on any arbitrary AbstractMatrix, when it could easily be a million by a million sparse matrix with only a few nonzeros per column. You could potentially have similar return the wrapped data matrix's type, but wouldn't this result in indexing operations on Triangular's returning potentially full matrices that include garbage data from the ignored triangle? edit: nevermind, it wouldn't

The situation is a bit like JuliaLang/julia#11574 (comment) where you really want a different return type for different numbers of dimensions, and for triangular potentially also whether the dimensions match or not. The implications that has on type stability, and where exactly similar needs to be used with vs without dimension inputs, are worth figuring out.

@tkelman
Copy link

tkelman commented Oct 21, 2015

I'm coming around on this after sleeping on it. Without a dimensions argument, similar should preserve special structure and return Triangular for triangular input, Diagonal for diagonal input, etc. For the matrix subtypes that are currently required to be square, similar with a dimension argument should return the wrapped AbstractMatrix type, or dense if the wrapped data was stored as vectors. It's a shame that indexing a rectangular region of a Diagonal has to allocate a potentially very large dense array for the output, but until we are returning views the only way to do better would be by returning sparse - which would be a surprising thing to do, a user who wants that can convert to sparse before indexing.

@nalimilan
Copy link
Member

+1 (though I'm not sure we will be able to continue tweaking similar without ever bumping on a case where we need to differentiate use cases)

@artkuo
Copy link

artkuo commented Oct 21, 2015

The proposed solution doesn't actually promote to dense, it only avoids enforcing triangular. The result is that similar of a sparse lower triangular yields a sparse, and no harm is done. There is also little effect on triangulars, because they do not save on storage compared to dense, nor do they save on computation, save for a few BLAS calls coded explicitly in triangular.jl. The closed algebra of triangulars is spelled out there, and otherwise triangulars should behave like regular matrices.

Also, I don't quite agree with you about the divergent behaviors for similar, where it yields a triangular if no dimensions are given. I think the behavior should be conceptually the same no matter what, and divergent behavior should be given a divergent method name. I propose to provide something like samestructure to do just that. I'll also provide some documentation, because part of the problem is that similar for triangulars is undocumented. Meanwhile, I will look around more to try to figure out if there are indeed some unwanted ramifications of this change.

@tkelman
Copy link

tkelman commented Oct 21, 2015

similar with no additional arguments is often used in user code as roughly an uninitialized copy. Generic code would break if you change that to return the wrapped array type. Triangular is a structure indicator and should be preserved wherever possible, not thrown out. The concept of similar is underspecified, and it really means very different things depending on whether or not you are changing the eltype or dimensions.

@artkuo
Copy link

artkuo commented Oct 21, 2015

Okay, I'm tempted to follow your suggestion of the bifurcating similar, even though I don't like it. My reservation is that the same function would return different types depending on the optional dimension parameters, which sounds problematic. Perhaps this could be a short-term bug fix that's immediately deprecated? Perhaps a triangular similar should give a warning and point towards similareltype and similarstructure? Internally, it's also possible to re-jigger reducedim to favor similareltype, but it may be difficult to do that across the board (a quick grep shows 600+ appearances of similar in the language). There are lots of ramifications, so some sort of change to similar or its callers could be opened as a separate issue (#11574 being closed), what do you suggest?

@tkelman
Copy link

tkelman commented Oct 21, 2015

Yes I think this would be worth a separate discussion given the particular question in JuliaLang/julia#11574 was mostly answered, and this is a more specific detail than JuliaLang/julia#10064.

@andreasnoack
Copy link
Member

Fixed by JuliaLang/julia#15198

@KristofferC KristofferC transferred this issue from JuliaLang/julia Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants