Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid checking the diagonal for non-real elements when constructing Hermitians #17367

Merged
merged 1 commit into from
Nov 7, 2017

Conversation

andreasnoack
Copy link
Member

The point in Hermitian is to create a Hermitian view of a matrix. Therefore, the error below, which is thrown on 0.4 and master, is causing frustration and has been reported several times.

julia> complex(randn(3,3), randn(3,3)) |> t -> Hermitian(t'*(eye(3)/3)*t)
ERROR: ArgumentError: Cannot construct Hermitian from matrix with nonreal diagonals
 in call at linalg/symmetric.jl:16
 in call at linalg/symmetric.jl:14
 in anonymous at none:1
 in |> at operators.jl:198

This PR partly reverts #10672 by removing the check of the diagonal on construction.

@andreasnoack andreasnoack added the linear algebra Linear algebra label Jul 11, 2016
@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2016

I really dislike this and think it's a bad holdover from Fortran / Lapack conventions that we don't need to follow just because they did things this way. We treat complex numbers as scalars, and a "view" into just the real part of that scalar isn't something we do anywhere else. It interacts badly with sending the memory layout to libraries (e.g. complex semidefinite programming solvers) that look more literally at the contents of the memory without treating diagonals in a special way. This is an invariant of the structure tag that we should respect and verify.

@andreasnoack
Copy link
Member Author

In practice, this check makes it difficult, if not impossible, to benefit from the Hermitian type. See e.g. #17199 (comment), which is among many other examples including my own work so in a world with floating point rounding I think that BLAS/LAPACK made the right choice. We should just be careful when calling out to libraries but the user experience will be so much better with this PR and I think that is more important.

I don't understand the "We treat complex numbers as scalars, and a "view" into just the real part of that scalar" argument. What do scalars have to do with the Hermitian view of a Matrix?

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2016

I'm talking about the diagonal entries. They are (most often) scalars, and this ignores part of their contents. Why not add a mutating convenience function specifically for the purpose of zeroing the imaginary diagonal components?

@@ -82,9 +92,16 @@ end

# Conversion
convert(::Type{Matrix}, A::Symmetric) = copytri!(convert(Matrix, copy(A.data)), A.uplo)
convert(::Type{Matrix}, A::Hermitian) = copytri!(convert(Matrix, copy(A.data)), A.uplo, true)
function convert(::Type{Matrix}, A::Hermitian)
B = copytri!(copy(A.data), A.uplo, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't convert to dense any more

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Rebase error. I'll fix that.

@stevengj
Copy link
Member

Hermitian(A) ignores the lower-triangular part of A even if it is non-symmetric, so it seems consistent that it should ignore the imaginary parts of the diagonal too.

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2016

The difference is that if an element of A.data is in the "active" triangle of the view, then it exactly matches the corresponding element of A right now. With this change, you add special cases and branching and impose a burden on all generic code that ever gets written to handle diagonals of Hermitian correctly. Making this a checked invariant of the allowed data that can be constructed adds a bit of inconvenience at creation, but is much better for generic code because correct code can be written with fewer special cases.

@stevengj
Copy link
Member

An alternative would be a Hermitian! constructor that projects A onto its Hermitian part (A + A')/2.

@andreasnoack
Copy link
Member Author

An alternative would be a Hermitian! constructor that projects A onto its Hermitian part (A + A')/2.

I'd be okay with that. Then we could add this suggestion to the error message in the check. @tkelman thoughts?

@tkelman
Copy link
Contributor

tkelman commented Jul 11, 2016

Yes I like the idea of a mutating wrapper constructor for this. It's just a more concise implementation of my proposed convenience function.

@andreasnoack
Copy link
Member Author

I think the original solution is the best here. On top of that, we could also have a symmetrizing (A+A')/2 function but that would be promoting for e.g. integers which is a problem.

@tkelman
Copy link
Contributor

tkelman commented Feb 17, 2017

selectively ignoring complex components only on the diagonal would make it extremely cumbersome to write generic code here.

@andreasnoack
Copy link
Member Author

selectively ignoring complex components only on the diagonal would make it extremely cumbersome to write generic code here.

Please be more specific.

@StefanKarpinski
Copy link
Member

I don't see why ignoring the complex part of the diagonal is any worse than ignoring the upper/lower part of the data. Arguably, indexing could just unconditionally compute 0.5*(A.data[i,j] + A.data[j,i]'). That's likely as fast or faster than the nest of if-else checks currently in there. The only other approach that seems consistent is to mutate the input array on construction and change all the values using the same formula and then indexing can just pass indexing straight through.

@tkelman
Copy link
Contributor

tkelman commented Feb 17, 2017

All the Fortran code that follows this convention has to selectively call real on all the diagonal elements, but that Fortran code copy-pastes most of the rest of the implementation and uses different function names for different array types. The precedent here is Lapack conventions which don't do generic programming at all. It would be inconsistent with all other handling of complex array elements everywhere else in Julia.

If you don't have to remember this special case and take the real part of the diagonal in every method that looks at .data of Hermitian inputs, then you can share a lot more code between complex Hermitian and complex symmetric. Can also send array data to C libraries that won't all follow this convention.

@stevengj
Copy link
Member

stevengj commented Feb 17, 2017

Regarding the promotion for integer arrays if we define Hermitian!(A) to replace the contents of A with (A + A')/2, I would just throw an InexactError if this produces non-integer results. (This will happen automatically, in fact, since it will use the same storage as A.)

@stevengj
Copy link
Member

stevengj commented Feb 17, 2017

Something like

function Hermitian!(A::AbstractMatrix)
    m, n = size(A)
    m == n || throw(DimensionMismatch("nonsquare $m×$n matrix cannot be symmetrized"))
    @inbounds for i = 1:m, j = i:m
        a = (A[i,j]+A[j,i]')/2
        A[i,j] = a
        A[j,i] = a'
    end
    return Hermitian(A)
end

is type-stable (throwing an InexactError if necessary) and seems like it should do the trick. It even works for matrices of matrices. (The treatment of the diagonal entries is slightly suboptimal, but A[i,i] = real(A[i,i]) wouldn't work for matrices of matrices.)

@StefanKarpinski
Copy link
Member

If that's named Hermitian! then I would expect Hermitian(A) = Hermitian!(copy(A)), which would not be the case; instead Hermitian(A) would simply not ensure that the data is actually Hermitian.

@tkelman
Copy link
Contributor

tkelman commented Feb 17, 2017

Or work like the existing current Symmetric and Triangular constructors that ignore the inactive triangle of the input data. There's no check needed in those cases to ensure that the active triangle satisfies the invariant of the type, since it always does (though it's worth thinking whether Triangular should be recursive on the diagonal elements for arrays of arrays). Hermitian is the odd one out where data in the active triangle of the input can possibly violate the invariant of the type, unless you move the problem to all user code that has to work on Hermitian inputs.

The Lapack convention for Hermitian is to also ignore the imaginary component of elements on the diagonal and consider those inactive so they can contain arbitrary values and be ignored, that's the part of the original proposal here that I'm against because it's not something we usually do in Julia with complex array elements. The fact that Cholmod doesn't do this, and that it doesn't work for matrices of matrices, are interesting points to think about here.

@StefanKarpinski
Copy link
Member

I would make the following argument for Hermitian(A) just mutating A to be Hermitian:

  • It is common for constructors to "take ownership" of their mutable arguments.
  • If you're passing A to Hermitian, you've almost certainly constructed it so as to be very nearly Hermitian. In that case, even if the caller is hanging onto A making it actually Hermitian is unlikely to screw them over badly.

This way all the LAPACK functions the data gets passed to will see an actual Hermitian matrix.

@StefanKarpinski
Copy link
Member

So basically @stevengj's code but named Hermitian instead of Hermitian!.

@andreasnoack
Copy link
Member Author

Thinking a little more about this: often when using these types, only one of the triangles is actually initialized so (A+A')/2 would give the wrong result. If we are okay with mutation when I suggest that we just zero the imaginary part. I still also think it we shouldn't exclude integer matrices for being wrapped.

@tkelman
Copy link
Contributor

tkelman commented Feb 17, 2017

Do other wrapper types mutate upon construction? It's normal for them to share data, but mutation on construction by default seems less common. Could make it a keyword argument?

@StefanKarpinski
Copy link
Member

Do other wrapper types mutate upon construction? It's normal for them to share data, but mutation on construction by default seems less common. Could make it a keyword argument?

It's not very common, but in this case the user is passing in what they claim is a Hermitian matrix anyway – making it actually be one seems unlikely to be a particularly nasty surprise.

@nalimilan
Copy link
Member

It wouldn't be nasty, but it would certainly be surprising, as it would go against the naming conventions.

@antoine-levitt
Copy link
Contributor

I don't get the point against the original proposal.

If you don't have to remember this special case and take the real part of the diagonal in every method that looks at .data of Hermitian inputs, then you can share a lot more code between complex Hermitian and complex symmetric. Can also send array data to C libraries that won't all follow this convention.

How is that different than having to remember to only look at a specific triangle? Complex symmetric matrices don't seem to be very used (and are for instance not supported in LAPACK). The point wrt interaction with C libraries is a good one, but that seems to me like a smaller price to pay compared to having a Hermitian! constructor or modifying the input (and therefore having a different API than in the Symmetric case)

@tkelman
Copy link
Contributor

tkelman commented Jun 11, 2017

Looking at the active triangle, you can still index a valid component of .data and get the right value. Having to then also call real but only on the diagonal elements and only on Hermitian types is a particular special case that is easy to miss and do incorrectly, and limits the ability for code reuse between real symmetric, complex symmetric, and complex hermitian.

@antoine-levitt
Copy link
Contributor

But in the end the tradeoff basically amounts to an additional if in specialized code whose authors already have to figure out the data storage of symmetric types (and therefore read the doc where this could be highlighted prominently), vs awkwardness in user-facing code. I agree that the possibility of silent bugs is worrisome, but this looks localized enough that it doesn't affect a lot of code (e.g. is there a lot of packages that would be broken if this PR got merged?)

@Sacha0
Copy link
Member

Sacha0 commented Jun 11, 2017

(Ref. #22200 (comment))

@tkelman
Copy link
Contributor

tkelman commented Aug 27, 2017

Any package that accesses .data and/or sends a Hermitian matrix to a C library would silently do the wrong thing here. Cholmod doesn't follow this convention, other C libraries and user code aren't all going to expect this special casing either. There's room for usability improvement with a convenience or mutating helper so roundoff errors on the diagonal don't get in users' way, but Hermitian(im*eye(n)) working and equalling zero (but preserving the nonzero imaginary diagonal so that you need to do extra work to get the behavior right) isn't desirable in my opinion.

@andreasnoack
Copy link
Member Author

The current situation is problematic and there hasn't been a better proposal. It is true that some care is required when calling out to C but the work is already done SuiteSparse, for other libraries it is most likely isolated to a single conversion function, if the conversion is written incorrectly it is not unlikely that the library will just throw an error (that is what SuiteSparse does), and it is pretty likely that you'll catch such an error now that you are aware of it. You are the only person objecting to this so unless new arguments show up or a better proposal is presented, I'll go ahead and merge this tomorrow.

@tkelman
Copy link
Contributor

tkelman commented Aug 28, 2017

There are multiple proposals above, I'd encourage you to consider them first, since none have been tried (and none would be breaking). Mutating wrapper constructors or keyword arguments or a separate zero-the-imaginary-diagonals function would avoid introducing a new special case that we do nowhere else in terms of partially ignoring imaginary components of scalar complex values.

@Sacha0
Copy link
Member

Sacha0 commented Aug 28, 2017

You are the only person objecting to this

Though my opinion has not crystallized, I do share some measure of Tony's ill ease with the behavior implemented in this pull request, and find aspects of the alternative proposals above appealing. Best!

@fredrikekre
Copy link
Member

Next best thing would be Hermitian!(A) which removes the imaginary component on the diagonal. But as Stefan said:

then I would expect Hermitian(A) = Hermitian!(copy(A))

but I think its OK if we document the behaviour clearly.

@andreasnoack
Copy link
Member Author

Just realized that accessing elements in the constructor also makes it impossible to use the current constructor with GPU arrays.

@tkelman
Copy link
Contributor

tkelman commented Nov 2, 2017

How so? Why don't GPU arrays allow getindex in the constructor? They could implement their own specialization if really needed.

@andreasnoack
Copy link
Member Author

The status of this is that the current behavior is problematic and this proposed change will fix the problem. Hypothetically, the fix could cause a bug if the underlying buffer is directly accessed either from Julia or if passed to a C library. That risk is present for any Julia abstraction so it shouldn't cause special caution in this case here. Furthermore, the main external library that Hermitians will be sent to is BLAS which already has the behavior introduced by this PR. The other main library we call with a Hermitian is CHOLMOD. That case is taken care of in this PR where the buffer is adjusted when passed to CHOLMOD. If it should happen that a matrix with non-zero imaginary parts was passed to CHOLMOD anyway then it would simply throw so there is no risk of silent errors. I'd assume that other libraries would follow what either BLAS or CHOLMOD do but so far no specific examples have been provided. Regarding the alternative proposals then we cannot assume that we can modify the wrapped matrix since it could be immutable. Symmetrizing by (A + A')/2 would promote and allocate which is not desirable. This has been going on for a while now so I'll merge as is and we'll have to see what the consequences are.

@andreasnoack andreasnoack merged commit 7c2a42a into master Nov 7, 2017
@andreasnoack andreasnoack deleted the anj/herm branch November 7, 2017 08:23
@tkelman
Copy link
Contributor

tkelman commented Nov 7, 2017

This special casing of only Hermitian to ignore the imaginary part of only the diagonal elements is, in my opinion, a bad semantic choice and more problematic than the behavior that had been on master for multiple years and releases, and the multiple alternative options that could have been implemented without introducing a special case were not investigated as repeatedly suggested over time.

That risk is present for any Julia abstraction so it shouldn't cause special caution in this case here

The previously existing abstraction was much safer - since complex scalar elements are treated atomically everywhere else in julia arrays, enforcing that the diagonal should have a zero imaginary component is safer - those libraries like BLAS that happen to follow an internal convention of ignoring imaginary on the diagonal will still behave correctly. This PR makes any library (or Julia code) that doesn't follow that convention behave incorrectly.

CHOLMOD is far from the only other library in the world that may need to deal with Hermitian matrices. There are the similar libraries Pardiso, MUMPS, Harwell ma57 etc, Spral, just to name a few. Writing bindings for those libraries now has a subtle, not obvious to test for special case hiding that could lead to silently giving wrong results if there were nonzero imaginary components on the diagonal.

I don't think consensus was respectfully sought here or alternatives ever given due consideration, this was merged despite repeated objections and semantic concerns that were shared by at least a few others than myself. The problems this may introduce can potentially be the worst kind, silently incorrect results. Making correct implementation of operations on this type more difficult in order to avoid inconvenience errors does not strike me as a good tradeoff when there are much less harmful ways of avoiding the same errors without modifying semantics or adding non-generic legacy special case behaviors from the Fortran world.

@Sacha0
Copy link
Member

Sacha0 commented Nov 7, 2017

I don't think consensus was respectfully sought here or alternatives ever given due consideration, this was merged despite repeated objections and semantic concerns that were shared by at least a few others than myself.

I must agree with Tony here. Suddenly merging this pull request without further discussion leaves something to be desired. Best!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants