-
Notifications
You must be signed in to change notification settings - Fork 444
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add level-3 BLAS triangular Sylvester equation solver #651
Conversation
2e1e51f
to
f08506b
Compare
Codecov ReportBase: 0.00% // Head: 0.00% // No change to project coverage 👍
Additional details and impacted files@@ Coverage Diff @@
## master #651 +/- ##
=========================================
Coverage 0.00% 0.00%
=========================================
Files 1894 1904 +10
Lines 184078 186552 +2474
=========================================
- Misses 184078 186552 +2474
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Force compatibility with [ds]trsyl. Use two floating-point scaling factors (rather than integer scaling factors). This does not eliminate the problem that scalings can be flushed, making any result useless. That problem could be eliminated by replacing the floating-point scale factor with an integer scale factor.
The tests of ztrsyl via zget35 are on tiny matrices that fall into the unblocked section of ztrsyl3. Add a new test file that checks ztrsyl(3) for larger matrices and their compatibility: Every problem that is solvable by ztrsyl must be solvable by ztrsyl3.
An extensions of LATRS solving T*x = s*x to many right-hand sides, allowing the usage of BLAS-3. The new algorithm tends to use less aggressive scaling, in particular for larger matrices.
DLATRS3 DTRSYL3 The two solvers are a drop-in replacement of the existing BLAS-2 routines and dedicate quite a bit of code to computing a similar result as the BLAS-2 counterpart.The question on whether or not the scale factor should be of the same data type as the other data structures remains. The advantage of moving to the integer scale factor alpha = 2**j is that flushing of the scale factor to zero can no longer happen as a result of limitations of the floating-point range. I would keep the representation by the BLAS-2 and the BLAS-3 version compatible. Then the disadvantage is a change of the function signature of LATRS and TRSYL. One option is to revisit the code later when an upgrade to LAPACK 4.0 is due and then switch the representation of the solution from (1/scale) * x to (1/(2**scale)) * x,where scale is int, later. |
Description
Currently the only way to solve the triangular Sylvester equation AX + XB = C is via dtrsyl, which solely relies in level-2 BLAS. This PR intends to contribute a level-3 BLAS version.
The triangular Sylvester equation has been recognized to be prone to overflow. For that purpose, dtrsyl utilizes a scaling factor to represent the solution as (scale^{-1} * X) and solve the scaled equation AX + XB = scale * C. Due to the scaling factor, there is some flexibility in the representation of the solution. The proposed level-3 BLAS version computes the scaling factors based on upper bounds of blocks to enable level-3 BLAS. The scaling is typically slightly more aggressive so that an alternatively scaled final solution is computed. This is no problem as long as the scaling factor does not get flushed to zero. Indeed, the assumption in Anderson's original paper on solving scaled systems is that scale is in (0,1] and not [0,1].
In experiments the flushing occasionally happens for both dtrsyl and the proposed solver dtrsyl3, but dtrsyl3 reaches the stage earlier. In view of the intended changes of the handling of NaN, inf etc. I would be grateful to get some feedback on how to address the problem.
Our paper proposes to use integer scaling factors of the form 2^{(intscale} and solve AX + XB = 2^{intscale} * C. The vastly increased representational range rules out any flushing during the computation. The problem with the scaling factors affects dlatrs and dtrevc, too. There may be more routines. Is there an interest in using integer scaling factors and changing the function signature/documentation?
Alternatively, I would revise the PR and add a workaround for the flushing problem present in the attached code.
Once the decision on the scalings is taken, I will fix the other data types.
Checklist