-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sptrsv supernode #552
Sptrsv supernode #552
Conversation
…tor, give run-time optitions for merge, metis)
…supernodal block with zeros)
create sptrsv_aux.hpp for some common function between SuperLU and Cholmod
…_superlu tester NOTE: we store L in CSC (calling cusparse with UPPER+TRANSPOSE) and U in CSR (calling cusparse with UPPER+NoTRANSPOSE)
Running spot-check on kokkos-dev-2 |
Spot-check on kokkos-dev-2 passes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't get through the last few files, but I think I said what I wanted to say. I didn't click "Request changes" because I don't want to block the scheduled kokkos-kernels release and promotion. Here are my main concerns about this PR:
-
It introduces new TPLs for only three functions that already exist in the BLAS and LAPACK TPLs. kokkos-kernels already has code for bringing in BLAS functions, that the author could have imitated.
-
The code only works for
double
andcomplex<double>
. For other types, the code may run to completion but produce incorrect results with no diagnostic. kokkos-kernels functions should support at least the four types that the BLAS, LAPACK, cuSPARSE, etc. support, or stop with an error if they do not. -
The code reimplements existing
KokkosBlas::*
functionality.
@@ -24,6 +24,10 @@ KOKKOSKERNELS_INTERNAL_ENABLE_MKL := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE | |||
KOKKOSKERNELS_INTERNAL_ENABLE_CUSPARSE := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE_TPLS) | grep "cusparse" | wc -l)) | |||
KOKKOSKERNELS_INTERNAL_ENABLE_CUBLAS := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE_TPLS) | grep "cublas" | wc -l)) | |||
KOKKOSKERNELS_INTERNAL_ENABLE_MAGMA := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE_TPLS) | grep "magma" | wc -l)) | |||
KOKKOSKERNELS_INTERNAL_ENABLE_CHOLMOD := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE_TPLS) | grep "cholmod" | wc -l)) | |||
KOKKOSKERNELS_INTERNAL_ENABLE_SUPERLU := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE_TPLS) | grep "superlu" | wc -l)) | |||
KOKKOSKERNELS_INTERNAL_ENABLE_LAPACKE := $(strip $(shell echo $(KOKKOSKERNELS_ENABLE_TPLS) | grep "lapacke" | wc -l)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ndellingwood @iyamazaki just to summarize my concerns about adding CBLAS
and LAPACKE
TPLs in yesterday's phone call:
Trilinos historically let people add optional TPLs freely. However, as a practical matter, downstream users (like Clark) will not enable them unless they are enabled by default or unless they add a lot of benefit. PR tests do not enable CBLAS and LAPACKE, so you (either you personally, or kokkos-kernels) must do all the downstream testing yourself. Furthermore, you've also added work with no benefit in terms of features, since the CBLAS and LAPACKE features you use already exist in the BLAS and LAPACK.
I see this as a temporary solution, made necessary because of the release time frame. The long-term solution is to wrap the BLAS and LAPACK in kokkos-kernels, not to introduce new TPL dependencies. I don't mind temporary solutions at all and I'll be happy to help with long-term solutions, but please just be aware of the aforementioned concerns. I have no personal stake in this -- I'm just explaining the practical reality that new TPLs create a lot of trouble for downstream users.
@@ -1,5 +1,5 @@ | |||
TRIBITS_PACKAGE_DEFINE_DEPENDENCIES( | |||
LIB_REQUIRED_PACKAGES KokkosCore KokkosContainers KokkosAlgorithms | |||
LIB_OPTIONAL_TPLS quadmath MKL BLAS LAPACK CUSPARSE MAGMA | |||
LIB_OPTIONAL_TPLS quadmath MKL BLAS LAPACK CUSPARSE MAGMA SUPERLU CHOLMOD LAPACKE CBLAS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we actually test the MAGMA TPL? Just curious -- I'm asking for another project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not that I know of.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might have a bootstrap issue here for LAPACKE and CBLAS. The tpls don't exist in Trilinos (AFAIK) so adding the TPLs to Dependencies.cmake
would also require adding Tribits find modules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trilinos does have find modules for SuperLU and Cholmod, but with different capitalization.
SUPERLU -> SuperLU
and CHOLMOD -> Cholmod
.
This problem of capitalize vs dont' capitalize is going to follow us forever...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jjwilke I'm not sure how easy it would be to change these in a backwards-compatible way so that Trilinos could accept either capitalization, even for specific TPLs. @bartlettroscoe might have some ideas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how easy it would be to change these in a backwards-compatible way so that Trilinos could accept either capitalization, even for specific TPLs.
This is an ugly problem. The names and capitalization of these TPLs should have been considered very carefully from the beginning. But with people breaking backward compatiblity left and right right now (because of the anticipated Trilinos 13.0 release) this is a great time to fix a bunch of these issues.
@jjwilke, can we get a list of TPLs defined in {{Trilinos/TPLsList.cmake}} that should be renamed or recapitalized to match standard CMake names for these TPLs?
I think this is why find_package()
can also find <lower_case_name>_config.cmake
filkes in addition to <UpperCaseName>Config.cmake
files as this solves the capitalization problem, at least for installed packages that provide config.cmake files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bartlettroscoe wrote:
But with people breaking backward compatiblity left and right right now (because of the anticipated Trilinos 13.0 release) this is a great time to fix a bunch of these issues.
Is there a way to add "aliases" for TPL names? Could I set up a TPL Foo
so that users could say TPL_ENABLE_Bar=ON
and TPL_ENABLE_Foo=ON
gets set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a way to add "aliases" for TPL names? Could I set up a TPL Foo so that users could say TPL_ENABLE_Bar=ON and TPL_ENABLE_Foo=ON gets set?
@mhoemmen, not sure. Would have to look at all of the use cases that would need to work (and it is easy to leave out use cases that don't occur to one right away).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bartlettroscoe The main reason would be to support changes in capitalization. On the other hand, CMake already tells you about unused options, so perhaps that's good enough to support a change.
template<typename scalar_t> | ||
void forwardP_supernode(int n, int *perm_r, int nrhs, scalar_t *B, int ldb, scalar_t *X, int ldx) { | ||
|
||
/* Permute right hand sides to form Pr*B */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is almost certainly duplicated code. Many operations in kokkos-kernels and downstream code need to permute vectors or multivectors. If kokkos-kernels has an existing permutation function, please use it. Otherwise, consider adding one to kokkos-kernels.
} | ||
|
||
template<typename scalar_t> | ||
void backwardP_supernode(int n, int *perm_c, int nrhs, scalar_t *B, int ldb, scalar_t *X, int ldx) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See above comment on permutations and duplicated code.
for (int i = 0; i <= j; i++) { | ||
hv(hr(j1 + j) + i) = Lx[psx + i + j*nsrow]; | ||
} | ||
#if 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this code commented out?
int psx = colptrL[j1]; | ||
if (invert_diag) { | ||
if (std::is_same<scalar_t, double>::value == true) { | ||
LAPACKE_dtrtri (LAPACK_COL_MAJOR, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a fan of this. You could have just copied and pasted the extern "C"
declaration format for BLAS functions that kokkos-kernels already uses, to get calls to LAPACK's _TRTRI
functions. It's bad encapsulation to have this "if scalar type is this, call this, else call that, else ..." code pasted directly into the SuperLU code here. You should just make this into a function, and change it to call regular LAPACK inside instead of LAPACKE. You don't have to make it a public function if you don't intend to support it yet.
LAPACKE_ztrtri (LAPACK_COL_MAJOR, | ||
'U', 'N', nscol, | ||
reinterpret_cast <lapack_complex_double*> (&Lx[psx]), nsrow); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about float
and complex<float>
? kokkos-kernels is supposed to support at least the four types that the BLAS supports. Please at least static_assert
or throw here.
if (invert_diag) { | ||
if (offset > 0 && invert_offdiag) { | ||
if (std::is_same<scalar_t, double>::value == true) { | ||
cblas_dtrmm (CblasColMajor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comments about LAPACKE apply here to CBLAS.
reinterpret_cast <double*> (&one), | ||
reinterpret_cast <double*> (&hv(nnzD)), nscol+offset, | ||
reinterpret_cast <double*> (&hv(nnzD+nscol)), nscol+offset); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about float
and complex<float>
?
@mhoemmen thanks for thorough review! I skimmed some comments and can provided a couple points of clarification I'm aware of, and let @iyamazaki add other responses. Magma - we do not have nightly testing running with magma, though the tpls and scripts are setup to handle this on apollo (where development and PR testing was done of the capability). I'll get this setup soon, dropped the ball on it. Experimental - the code hasn't been used yet outside of development and the API is finalized yet, hope is to get some feedback and review. Longer term, I think the plan is to expose the solver through Amesos2. |
@ndellingwood wrote:
No worries! I can explain in person why MAGMA might be more important for us in the future, but there's no rush. |
* remove superfluous "== true" in "std::is_same<,>::value == true"
Conflicts: src/sparse/KokkosSparse_sptrsv.hpp src/sparse/KokkosSparse_sptrsv_handle.hpp src/sparse/impl/KokkosSparse_sptrsv_solve_impl.hpp src/sparse/impl/KokkosSparse_sptrsv_solve_spec.hpp src/sparse/impl/KokkosSparse_sptrsv_symbolic_impl.hpp
@iyamazaki I put in a PR against this branch on your fork iyamazaki#1 that merges develop back in and resolves conflicts. If it passes testing go ahead and merge it into your branch, then push your updated sptrsv-supernode branch back up to this PR. |
Spoke with @srajama1, this PR will go in for the promotion and allow for development; we'll resolve the TPL issues for the next promotion, the goal being to provide proper kokkos-kernels implementations for the missing BLAS and LAPACK routines needed (called from cblas and lapacke currently). |
Merging in for release 3.0 Note the spot-check on kokkos-dev-2 has results posted in a PR to @iyamazaki branch when the merge conflicts were resolved. |
This PR adds the options for supernodal-based sparse-triangular solve (with SuperLU or Cholmod) to the existing level-based scheduling sparse-triangular solve.