-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid checking the diagonal for non-real elements when constructing Hermitians #17367
Conversation
I really dislike this and think it's a bad holdover from Fortran / Lapack conventions that we don't need to follow just because they did things this way. We treat complex numbers as scalars, and a "view" into just the real part of that scalar isn't something we do anywhere else. It interacts badly with sending the memory layout to libraries (e.g. complex semidefinite programming solvers) that look more literally at the contents of the memory without treating diagonals in a special way. This is an invariant of the structure tag that we should respect and verify. |
In practice, this check makes it difficult, if not impossible, to benefit from the Hermitian type. See e.g. #17199 (comment), which is among many other examples including my own work so in a world with floating point rounding I think that BLAS/LAPACK made the right choice. We should just be careful when calling out to libraries but the user experience will be so much better with this PR and I think that is more important. I don't understand the "We treat complex numbers as scalars, and a "view" into just the real part of that scalar" argument. What do scalars have to do with the Hermitian view of a Matrix? |
I'm talking about the diagonal entries. They are (most often) scalars, and this ignores part of their contents. Why not add a mutating convenience function specifically for the purpose of zeroing the imaginary diagonal components? |
base/linalg/symmetric.jl
Outdated
@@ -82,9 +92,16 @@ end | |||
|
|||
# Conversion | |||
convert(::Type{Matrix}, A::Symmetric) = copytri!(convert(Matrix, copy(A.data)), A.uplo) | |||
convert(::Type{Matrix}, A::Hermitian) = copytri!(convert(Matrix, copy(A.data)), A.uplo, true) | |||
function convert(::Type{Matrix}, A::Hermitian) | |||
B = copytri!(copy(A.data), A.uplo, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't convert to dense any more
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Rebase error. I'll fix that.
|
The difference is that if an element of |
An alternative would be a |
I'd be okay with that. Then we could add this suggestion to the error message in the check. @tkelman thoughts? |
Yes I like the idea of a mutating wrapper constructor for this. It's just a more concise implementation of my proposed convenience function. |
68aef5d
to
aba1bad
Compare
I think the original solution is the best here. On top of that, we could also have a symmetrizing |
selectively ignoring complex components only on the diagonal would make it extremely cumbersome to write generic code here. |
Please be more specific. |
I don't see why ignoring the complex part of the diagonal is any worse than ignoring the upper/lower part of the data. Arguably, indexing could just unconditionally compute |
All the Fortran code that follows this convention has to selectively call real on all the diagonal elements, but that Fortran code copy-pastes most of the rest of the implementation and uses different function names for different array types. The precedent here is Lapack conventions which don't do generic programming at all. It would be inconsistent with all other handling of complex array elements everywhere else in Julia. If you don't have to remember this special case and take the real part of the diagonal in every method that looks at |
Regarding the promotion for integer arrays if we define |
Something like function Hermitian!(A::AbstractMatrix)
m, n = size(A)
m == n || throw(DimensionMismatch("nonsquare $m×$n matrix cannot be symmetrized"))
@inbounds for i = 1:m, j = i:m
a = (A[i,j]+A[j,i]')/2
A[i,j] = a
A[j,i] = a'
end
return Hermitian(A)
end is type-stable (throwing an |
If that's named |
Or work like the existing current The Lapack convention for Hermitian is to also ignore the imaginary component of elements on the diagonal and consider those inactive so they can contain arbitrary values and be ignored, that's the part of the original proposal here that I'm against because it's not something we usually do in Julia with complex array elements. The fact that Cholmod doesn't do this, and that it doesn't work for matrices of matrices, are interesting points to think about here. |
I would make the following argument for
This way all the LAPACK functions the data gets passed to will see an actual Hermitian matrix. |
So basically @stevengj's code but named |
Thinking a little more about this: often when using these types, only one of the triangles is actually initialized so |
Do other wrapper types mutate upon construction? It's normal for them to share data, but mutation on construction by default seems less common. Could make it a keyword argument? |
It's not very common, but in this case the user is passing in what they claim is a Hermitian matrix anyway – making it actually be one seems unlikely to be a particularly nasty surprise. |
It wouldn't be nasty, but it would certainly be surprising, as it would go against the naming conventions. |
I don't get the point against the original proposal.
How is that different than having to remember to only look at a specific triangle? Complex symmetric matrices don't seem to be very used (and are for instance not supported in LAPACK). The point wrt interaction with C libraries is a good one, but that seems to me like a smaller price to pay compared to having a Hermitian! constructor or modifying the input (and therefore having a different API than in the Symmetric case) |
Looking at the active triangle, you can still index a valid component of |
But in the end the tradeoff basically amounts to an additional if in specialized code whose authors already have to figure out the data storage of symmetric types (and therefore read the doc where this could be highlighted prominently), vs awkwardness in user-facing code. I agree that the possibility of silent bugs is worrisome, but this looks localized enough that it doesn't affect a lot of code (e.g. is there a lot of packages that would be broken if this PR got merged?) |
(Ref. #22200 (comment)) |
4a5cbbb
to
c45a65a
Compare
Any package that accesses |
The current situation is problematic and there hasn't been a better proposal. It is true that some care is required when calling out to C but the work is already done SuiteSparse, for other libraries it is most likely isolated to a single conversion function, if the conversion is written incorrectly it is not unlikely that the library will just throw an error (that is what SuiteSparse does), and it is pretty likely that you'll catch such an error now that you are aware of it. You are the only person objecting to this so unless new arguments show up or a better proposal is presented, I'll go ahead and merge this tomorrow. |
There are multiple proposals above, I'd encourage you to consider them first, since none have been tried (and none would be breaking). Mutating wrapper constructors or keyword arguments or a separate zero-the-imaginary-diagonals function would avoid introducing a new special case that we do nowhere else in terms of partially ignoring imaginary components of scalar complex values. |
Though my opinion has not crystallized, I do share some measure of Tony's ill ease with the behavior implemented in this pull request, and find aspects of the alternative proposals above appealing. Best! |
Next best thing would be
but I think its OK if we document the behaviour clearly. |
Just realized that accessing elements in the constructor also makes it impossible to use the current constructor with GPU arrays. |
c45a65a
to
7ed2e2d
Compare
How so? Why don't GPU arrays allow getindex in the constructor? They could implement their own specialization if really needed. |
7ed2e2d
to
a17d21d
Compare
The status of this is that the current behavior is problematic and this proposed change will fix the problem. Hypothetically, the fix could cause a bug if the underlying buffer is directly accessed either from Julia or if passed to a C library. That risk is present for any Julia abstraction so it shouldn't cause special caution in this case here. Furthermore, the main external library that |
This special casing of only Hermitian to ignore the imaginary part of only the diagonal elements is, in my opinion, a bad semantic choice and more problematic than the behavior that had been on master for multiple years and releases, and the multiple alternative options that could have been implemented without introducing a special case were not investigated as repeatedly suggested over time.
The previously existing abstraction was much safer - since complex scalar elements are treated atomically everywhere else in julia arrays, enforcing that the diagonal should have a zero imaginary component is safer - those libraries like BLAS that happen to follow an internal convention of ignoring imaginary on the diagonal will still behave correctly. This PR makes any library (or Julia code) that doesn't follow that convention behave incorrectly. CHOLMOD is far from the only other library in the world that may need to deal with Hermitian matrices. There are the similar libraries Pardiso, MUMPS, Harwell ma57 etc, Spral, just to name a few. Writing bindings for those libraries now has a subtle, not obvious to test for special case hiding that could lead to silently giving wrong results if there were nonzero imaginary components on the diagonal. I don't think consensus was respectfully sought here or alternatives ever given due consideration, this was merged despite repeated objections and semantic concerns that were shared by at least a few others than myself. The problems this may introduce can potentially be the worst kind, silently incorrect results. Making correct implementation of operations on this type more difficult in order to avoid inconvenience errors does not strike me as a good tradeoff when there are much less harmful ways of avoiding the same errors without modifying semantics or adding non-generic legacy special case behaviors from the Fortran world. |
I must agree with Tony here. Suddenly merging this pull request without further discussion leaves something to be desired. Best! |
The point in
Hermitian
is to create a Hermitian view of a matrix. Therefore, the error below, which is thrown on 0.4 and master, is causing frustration and has been reported several times.This PR partly reverts #10672 by removing the check of the diagonal on construction.