Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply ROUNDUP_LWORK function in lapack #904

Merged

Conversation

kleineLi
Copy link
Contributor

@kleineLi kleineLi commented Aug 31, 2023

Apply ROUNDUP_LWORK function in lapack

This work is continuation of PR #605. Applied only for s|c precisions. Do we need these changes to be applied for d|z precisions or it is redundant?

Please review!
/cc @weslleyspereira

@codecov
Copy link

codecov bot commented Aug 31, 2023

Codecov Report

Attention: 413 lines in your changes are missing coverage. Please review.

Comparison is base (4174d8d) 0.00% compared to head (51f2ec3) 0.00%.

❗ Current head 51f2ec3 differs from pull request most recent head 88810e4. Consider uploading reports for the commit 88810e4 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##           master     #904   +/-   ##
=======================================
  Coverage    0.00%    0.00%           
=======================================
  Files        1918     1918           
  Lines      188614   188617    +3     
=======================================
- Misses     188614   188617    +3     
Files Coverage Δ
SRC/chseqr.f 0.00% <ø> (ø)
SRC/cggglm.f 0.00% <0.00%> (ø)
SRC/cgglse.f 0.00% <0.00%> (ø)
SRC/cggqrf.f 0.00% <0.00%> (ø)
SRC/cggrqf.f 0.00% <0.00%> (ø)
SRC/chesv_aa_2stage.f 0.00% <0.00%> (ø)
SRC/chetrs_aa.f 0.00% <0.00%> (ø)
SRC/csysv_aa_2stage.f 0.00% <0.00%> (ø)
SRC/csytrf_aa_2stage.f 0.00% <0.00%> (ø)
SRC/csytrs_aa.f 0.00% <0.00%> (ø)
... and 203 more

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@kleineLi kleineLi force-pushed the apply-roundup_lwork-in-lapack branch 2 times, most recently from d7accd3 to cc850c0 Compare September 1, 2023 15:29
@weslleyspereira
Copy link
Collaborator

Great that you took the time to work on that. Thanks!

I believe we should not think to much and just apply the round ups for single and double precision. If it is easy to do, I would include it in this PR. It adds minimal overhead to the workspace calculation, and the benefits can be big.

@weslleyspereira
Copy link
Collaborator

I'll ping @mgates3 here as well, which might have something to add to the discussion.

@mgates3
Copy link
Contributor

mgates3 commented Sep 1, 2023

Fine to add it to double-precision files as well. Keeps single & double consistent and avoids issues if single is generated from double somehow.

double has 53 bits of precision, so lwork is highly unlikely to exceed that. If lwork = 4n^2, it would need n ≥ 47,453,132 = (2^53 / 4)^0.5 to encounter an issue, right?

Another way to look at it is lwork ≥ 8 bytes * 2^53 / 1024^5 = 64 PiB of memory before there was an issue for double, right?

In MAGMA, the equivalent to roundup_lwork is applied in all files due to precision generation, but it does nothing for double or double-complex, just returns the original value.
Cf. https://github.com/icl-utk-edu/magma/blob/606169ca033e50fb1853ceec0bc60250617ea6a7/control/magma_zauxiliary.cpp#L44

@AlexanderZotkevich
Copy link

Let me suggest to apply only for s|c precisions. As it was mentioned by @mgates3 , 64 PiB of memory before there was an issue for double.

@kleineLi
Copy link
Contributor Author

kleineLi commented Sep 4, 2023

Thank you @mgates3 for the comprehensive comment.

I think that adding functionality for d|z precision unnecessary at the moment, since the probability of working with matrix of size 47mn is extremely small.

But for single precision I've met the rounding issue with matrix size 9000.

SRC/sgetsls.f Outdated Show resolved Hide resolved
@kleineLi kleineLi force-pushed the apply-roundup_lwork-in-lapack branch 2 times, most recently from 465dd08 to c453054 Compare September 8, 2023 15:48
@weslleyspereira
Copy link
Collaborator

Hi @kleineLi. Would you be able to merge to/rebase with the current master? The script for the Github Actions needed to be updated in a recent commit. Thanks

@kleineLi kleineLi force-pushed the apply-roundup_lwork-in-lapack branch 2 times, most recently from d3d3d39 to 095aca0 Compare September 10, 2023 11:57
@kleineLi
Copy link
Contributor Author

kleineLi commented Sep 10, 2023

Hi @weslleyspereira. I have rebased my branch on the current master and completed all changes in this PR

SRC/ssytrd.f Outdated Show resolved Hide resolved
SRC/sgghd3.f Outdated Show resolved Hide resolved
@angsch
Copy link
Collaborator

angsch commented Oct 12, 2023

@kleineLi Could you be so kind and apply the fix also to the alternative implementation of the QR decomposition (SRC/VARIANTS/qr/LL)?

@kleineLi kleineLi force-pushed the apply-roundup_lwork-in-lapack branch from 39f7097 to 689795f Compare October 31, 2023 12:29
@kleineLi
Copy link
Contributor Author

@kleineLi Could you be so kind and apply the fix also to the alternative implementation of the QR decomposition (SRC/VARIANTS/qr/LL)?

Done

@kleineLi kleineLi force-pushed the apply-roundup_lwork-in-lapack branch from 689795f to 28bf3f1 Compare October 31, 2023 14:43
@kleineLi kleineLi force-pushed the apply-roundup_lwork-in-lapack branch from 28bf3f1 to 88810e4 Compare October 31, 2023 14:53
@weslleyspereira
Copy link
Collaborator

weslleyspereira commented Oct 31, 2023

We can probably merge this one. I don't think the failure in AppVeyor has anything to do with this PR. The failure has to do with AppVeyor not finding the flang compiler.

@langou langou merged commit 1638202 into Reference-LAPACK:master Oct 31, 2023
9 of 10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants