
[Request for comments] Properly solving the *TRTRS singularity detection problem #946

Open
TiborGY opened this issue Nov 26, 2023 · 2 comments

TiborGY commented Nov 26, 2023

As described in #918, the *TRTRS/*TBTRS/*TPTRS family of routines includes only a rudimentary check for detecting singularity: an error is signaled if and only if one of the diagonal elements of the triangular matrix A is exactly equal to zero (as in 0.0D+0).

An obvious weakness of this approach is that matrices with one or more diagonal elements subnormally close to zero go uncaught. For example, you could pass DTRTRS an A matrix whose diagonal is filled with 2^-1025 and get no error from it, even though the matrix would in all likelihood be numerically singular. In my opinion this is a problem, as it greatly weakens the numerical reliability guarantees of *TRTRS and all of its callers (*GELS, *GELST, *GETSLS, *GGLSE, *GGGLM), since things like rank-deficiency checks rely on *TRTRS to signal an error.
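To make the escape concrete, here is a toy back-substitution in Python that mimics the check *TRTRS performs (the solver itself is my illustration, not LAPACK code): a subnormal diagonal passes the exact-zero test, yet the "solution" overflows to infinity.

```python
import math

def back_substitute(a, b):
    """Solve the upper-triangular system A x = b, mimicking the *TRTRS
    singularity check: INFO > 0 only when a diagonal entry is exactly 0.0."""
    n = len(b)
    for i in range(n):
        if a[i][i] == 0.0:          # the check *TRTRS actually performs
            return None, i + 1      # INFO = i + 1 signals singularity
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x, 0

# A diagonal of 2**-1025 (a subnormal double) is nonzero, so INFO == 0,
# yet the division by it immediately overflows.
tiny_diag = 2.0 ** -1025
x, info = back_substitute([[tiny_diag, 1.0], [0.0, tiny_diag]], [1.0, 1.0])
print(info)                 # 0 -- no error reported
print(math.isinf(x[1]))     # True -- the "solution" is garbage
```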

As far as I can see there are two possible ways to fix this, and neither is without drawbacks:

Behind door #1: Behaviour change

Alter the singularity check in *TRTRS et al. to signal an error when the absolute value of any diagonal element is .LT. TINY(0.0D+0), instead of comparing it with .EQ. 0.0D+0. This would fix the worst-case behaviour of *TRTRS, as well as of all the LAPACK subroutines that depend on it, in one fell swoop, without any API changes at all, and would be minimal work to implement.
There are two problems with this approach, though.

Firstly, this changes the behaviour of longstanding LAPACK subroutines. Some users could end up faced with runtime failures (INFO > 0) after such hypothetical changes to LAPACK, although it could be argued that it is better for them to become aware that their matrices are potentially numerically unstable.

Secondly, it would still not let users customize the threshold at which an error is signaled. Contrast this with *GELSY: it does pivoted QR and asks the user to specify RCOND, which is used to determine the effective rank of A. Yet there is no argument that could be used to tell *GELS when to consider A rank-deficient.
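The proposed change amounts to swapping one comparison. A Python sketch of the two checks, where `sys.float_info.min` plays the role of TINY(0.0D+0):

```python
import sys

TINY = sys.float_info.min  # smallest positive *normal* double, ~2.2e-308

def current_check(diag):
    """Mimics the existing *TRTRS test: flag only exact zeros."""
    for i, d in enumerate(diag):
        if d == 0.0:
            return i + 1  # INFO > 0
    return 0

def proposed_check(diag):
    """Option 1: flag any diagonal entry smaller than TINY in magnitude."""
    for i, d in enumerate(diag):
        if abs(d) < TINY:
            return i + 1
    return 0

diag = [1.0, 2.0 ** -1025, 3.0]   # contains a subnormal entry
print(current_check(diag))   # 0 -- the subnormal slips through
print(proposed_check(diag))  # 2 -- caught
```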

Behind door #2: API extension

Add a new sibling for every affected subroutine, and have it take the singularity check threshold as an argument. I am not sure what naming convention would be best, but one option would look something like this: *TRTRSCT/*TBTRSCT/*TPTRSCT/*GELSCT/... where CT stands for Custom Threshold. Anyway, the naming can always be debated later...

This would allow the behaviour of the well-established subroutines to stay exactly the same, while letting users on the latest LAPACK enjoy the new subroutines. It could easily be arranged that passing a negative number as the singularity threshold makes *TRTRSCT default to the sane choice of TINY(0.0D+0) as the threshold for the absolute values of the diagonal elements.

Users could even recover the behaviour of the old subroutines by setting the new threshold argument to zero. In fact, this is how I would eliminate the code duplication of having both *TRTRSCT and *TRTRS implemented: I would just turn *TRTRS into a shim that calls *TRTRSCT with a zero threshold.
Unfortunately, this approach also suffers from a couple of problems.
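A sketch of the shim idea in Python (the function names stand in for the hypothetical *TRTRSCT and the legacy *TRTRS; note the comparison uses `<=` rather than `<` so that a zero threshold reproduces the old exact-zero behaviour):

```python
import sys

def trtrs_ct_check(diag, threshold):
    """Hypothetical *TRTRSCT singularity check with a caller-supplied
    threshold. A negative threshold selects the default TINY(0.0D+0)."""
    if threshold < 0.0:
        threshold = sys.float_info.min  # TINY(0.0D+0) equivalent
    for i, d in enumerate(diag):
        if abs(d) <= threshold:
            return i + 1  # INFO > 0: (near-)singular
    return 0

def trtrs_check(diag):
    """The legacy *TRTRS check becomes a shim: a zero threshold makes
    abs(d) <= 0.0 equivalent to the old d == 0.0 test."""
    return trtrs_ct_check(diag, 0.0)

print(trtrs_check([1.0, 2.0 ** -1025]))          # 0 -- old behaviour kept
print(trtrs_ct_check([1.0, 2.0 ** -1025], -1.0)) # 2 -- default catches it
```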

Firstly, it would be considerably more work to implement, although I might volunteer to do it myself if community feedback is favorable.
Secondly, it would further inflate the already substantial LAPACK API. I would also have to implement the corresponding LAPACKE bindings.
Thirdly, LAPACK implementations other than Netlib's may choose not to implement these extensions, or only do so after an unreasonably long delay.

Request for comments

Thanks for reading my rant. Please let me know what you think the best option would be for a proper fix of the rudimentary singularity check in *TRTRS/*TBTRS/*TPTRS.

@TiborGY TiborGY changed the title [Request for comments] Properly solving the *TRTRS with singularity detection problem [Request for comments] Properly solving the *TRTRS singularity detection problem Nov 26, 2023
ilayn commented Nov 26, 2023

Option #1 is, for all intents and purposes, impossible. That ship sailed a few decades ago.

> Some users could end up faced with runtime failures (INFO > 0) after such hypothetical changes to LAPACK, although it could be argued that it is better for them to become aware that their matrices are potentially numerically unstable.

This one is easy: that's a clean no (we did this in SciPy; it wasn't fun). They survived with potentially singular arrays all the way here. You are just going to annoy the whole world with a no-op warning and get a bad rep for no particularly strong reason.

> Secondly, it would still not let users customize the threshold at which an error is signaled. Contrast this with *GELSY: it does pivoted QR and asks the user to specify RCOND, which is used to determine the effective rank of A.

Correct, and one of the biggest complaints about this is where you get the RCOND from. So it won't solve your problems; in fact, you now have to guess an RCOND for your array, or perform another painfully long call, since ?##con wants the ?##trf result to compute, and the function signatures are unnerving just to solve a single AX = B equation.

For #2

I always wanted to have this anyway, so don't get me wrong. But the question is how you are going to do it. Just checking the absolute value is not enough, so you should get the condition number for inversion. The typical application is that you do one ?##trf, then squeeze in a ?##con, and then continue with ?##trs if the con result is OK (I don't know any other way to implement it). But note that the rcond computation is not exact and not really that fast (it calls ?latrs), so you are approximating it. Hence there will be some art to it, and hence false positives.

In my opinion, if there is any chance to touch these algorithms, it has to implement the MATLAB backslash behavior. Check out the flowchart here. This is not really a complicated algorithm, and it allows for quick for loops over the array.

That is to say, it discovers the structure of the array by running down the entries, checking for bidiagonal, tridiagonal, upper/lower triangular, symmetric/Hermitian, and positive definite matrices, and it decides on the algorithm and the ?xxcon check as it goes, choosing the easy way out.
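The structure-discovery pass could be sketched like this (a toy Python classifier over a dense array; the categories and their precedence are illustrative, not MATLAB's actual dispatch order):

```python
def classify(a):
    """Toy structure detector: scan the entries and pick the cheapest
    applicable solver family."""
    n = len(a)
    upper = all(a[i][j] == 0.0 for i in range(n) for j in range(i))
    lower = all(a[i][j] == 0.0 for i in range(n) for j in range(i + 1, n))
    if upper and lower:
        return "diagonal"                # elementwise divide
    if upper or lower:
        return "triangular"              # -> *TRTRS-style substitution
    if all(a[i][j] == a[j][i] for i in range(n) for j in range(i)):
        return "symmetric"               # -> try Cholesky, fall back to LDL^T
    return "general"                     # -> LU with partial pivoting

print(classify([[2.0, 1.0], [0.0, 3.0]]))  # triangular
print(classify([[2.0, 1.0], [1.0, 3.0]]))  # symmetric
```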

The rest is, it seems to me, just unnecessary labor for the users, a legacy of 40-year-old habits: 5 different calls with their respective LWORKs, aligning memory for the results, and so on. It should get easier, not more laborious, by the decade.

That would be my door #3.


TiborGY commented Nov 28, 2023

> Option #1 is, for all intents and purposes, impossible. That ship sailed a few decades ago.
> You are just going to annoy the whole world with a no-op warning and get a bad rep for no particularly strong reason.

I thought so as well, and initially rejected the idea myself. But still, I am curious to know what the community's attitude towards the idea is. Who knows, some communities are a lot more gung-ho about minor changes to function behaviour, so I might as well put it on the table.
But I agree, my first instinct is also to avoid option #1.

> Secondly, it would still not let users customize the threshold at which an error is signaled. Contrast this with *GELSY: it does pivoted QR and asks the user to specify RCOND, which is used to determine the effective rank of A.

> Correct, and one of the biggest complaints about this is where you get the RCOND from. So it won't solve your problems; in fact, you now have to guess an RCOND for your array, or perform another painfully long call, since ?##con wants the ?##trf result to compute, and the function signatures are unnerving just to solve a single AX = B equation.

I used *GELSY just as a precedent for having some kind of decision threshold. I 100% agree that having users sit there scratching their heads, trying to figure out where to get RCOND from to call *GELSY, is a frustrating experience.
But I think the biggest reason for that is the lack of a sane default. This is why I suggested putting in a default that ought to produce very few false positives.

> But the question is how you are going to do it. Just checking the absolute value is not enough, so you should get the condition number for inversion.

I am not convinced that going through full condition number estimation is necessary for triangular solvers like *TRTRS. For a triangular matrix the eigenvalues are just the diagonal elements, and the determinant is just their product. And there is a convenient "natural" default to check against in the form of subnormality. Beyond the individual elements, one could also check the determinant, as it is so easy to compute for triangular matrices. If any of the diagonal elements, or the determinant, is subnormal, trigger the error exit.
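The combined check could look like this Python sketch (the log-space accumulation is my own addition, to keep the determinant comparable against the subnormal boundary without the product itself underflowing; the error codes are illustrative):

```python
import math
import sys

def diag_and_det_check(diag):
    """Flag a triangular matrix as numerically singular when any diagonal
    entry, or the determinant (the product of the diagonal entries), is
    subnormal. The product is accumulated in log space so it can be tested
    without actually underflowing to zero first."""
    tiny = sys.float_info.min                 # smallest positive normal double
    for i, d in enumerate(diag):
        if abs(d) < tiny:
            return i + 1                      # INFO > 0: subnormal (or zero) pivot
    log_det = sum(math.log2(abs(d)) for d in diag)
    if log_det < math.log2(tiny):             # determinant would be subnormal
        return len(diag) + 1                  # illustrative error code
    return 0

print(diag_and_det_check([1.0, 2.0, 3.0]))    # 0 -- well scaled
print(diag_and_det_check([1e-200, 1e-200]))   # 3 -- det = 1e-400 is subnormal
```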
