TPL Support for BLAS functions (nrm2, axpy, dot, gemm) using CuBLAS (Issue #247) #262
Changes from 11 commits
eb0eab0
2704c2b
0d670ff
29cd214
8cc6c91
5adcfcd
23c7e71
578c409
864c6fd
5f7b843
31d3489
6e1821d
0d55406
@@ -71,11 +71,47 @@ Kokkos::View<const SCALAR*, LAYOUT, Kokkos::Device<ExecSpace, MEMSPACE>, \
Kokkos::MemoryTraits<Kokkos::Unmanaged> >, \
1,1> { enum : bool { value = true }; };

#if defined (KOKKOSKERNELS_INST_DOUBLE)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_BLAS( double, Kokkos::LayoutLeft, Kokkos::HostSpace)
#endif
#if defined (KOKKOSKERNELS_INST_FLOAT)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_BLAS( float, Kokkos::LayoutLeft, Kokkos::HostSpace)
#endif
#if defined (KOKKOSKERNELS_INST_KOKKOS_COMPLEX_DOUBLE_)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_BLAS( Kokkos::complex<double>, Kokkos::LayoutLeft, Kokkos::HostSpace)
#endif
#if defined (KOKKOSKERNELS_INST_KOKKOS_COMPLEX_FLOAT_)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_BLAS( Kokkos::complex<float>, Kokkos::LayoutLeft, Kokkos::HostSpace)
#endif

#endif

// cuBLAS
#ifdef KOKKOSKERNELS_ENABLE_TPL_CUBLAS
// double
#define KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_CUBLAS( SCALAR, LAYOUT, MEMSPACE ) \
template<class ExecSpace> \
struct dot_tpl_spec_avail< \
Kokkos::View<SCALAR, LAYOUT, Kokkos::HostSpace, \
Kokkos::MemoryTraits<Kokkos::Unmanaged> >, \
Kokkos::View<const SCALAR*, LAYOUT, Kokkos::Device<ExecSpace, MEMSPACE>, \
Kokkos::MemoryTraits<Kokkos::Unmanaged> >, \
Kokkos::View<const SCALAR*, LAYOUT, Kokkos::Device<ExecSpace, MEMSPACE>, \
Kokkos::MemoryTraits<Kokkos::Unmanaged> >, \
1,1> { enum : bool { value = true }; };

#if defined (KOKKOSKERNELS_INST_DOUBLE)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_CUBLAS( double, Kokkos::LayoutLeft, Kokkos::CudaSpace)
#endif
#if defined (KOKKOSKERNELS_INST_FLOAT)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_CUBLAS( float, Kokkos::LayoutLeft, Kokkos::CudaSpace)
#endif
#if defined (KOKKOSKERNELS_INST_KOKKOS_COMPLEX_DOUBLE_)
How come the complex double and complex float macros end with underscores, but the double and float ones don't?

I think we can use either without or with underscore, since in the
but I just followed the convention in hpp files in

OK, just checking :)

KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_CUBLAS( Kokkos::complex<double>, Kokkos::LayoutLeft, Kokkos::CudaSpace)
#endif
#if defined (KOKKOSKERNELS_INST_KOKKOS_COMPLEX_FLOAT_)
KOKKOSBLAS1_DOT_TPL_SPEC_AVAIL_CUBLAS( Kokkos::complex<float>, Kokkos::LayoutLeft, Kokkos::CudaSpace)
#endif

#endif

Shouldn't you use appropriate kokkos-kernels macros to detect whether those Scalar types (e.g., float) are enabled? Otherwise, you'll be instantiating for types for which the user did not want to instantiate. This will increase build time and library size.
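
To illustrate the point about guarding instantiations, here is a minimal, Kokkos-free sketch of the pattern. The names `demo_tpl_spec_avail`, `DEMO_TPL_SPEC_AVAIL`, `DEMO_INST_DOUBLE`, and `DEMO_INST_FLOAT` are hypothetical stand-ins for the trait, helper macro, and `KOKKOSKERNELS_INST_*` switches in this PR. They are not part of kokkos-kernels. The assumption is that the primary trait defaults to `false`, so a scalar type is marked as TPL-backed only when its switch is defined.

```cpp
#include <cassert>

// Hypothetical build-configuration switches (assumption: they play the
// role of KOKKOSKERNELS_INST_DOUBLE / KOKKOSKERNELS_INST_FLOAT).
// Only double is "enabled" in this sketch.
#define DEMO_INST_DOUBLE
// DEMO_INST_FLOAT is intentionally left undefined.

// Primary template: no TPL specialization is available by default.
template <class Scalar>
struct demo_tpl_spec_avail {
  enum : bool { value = false };
};

// Hypothetical helper macro: marks one scalar type as TPL-backed.
#define DEMO_TPL_SPEC_AVAIL(SCALAR)      \
  template <>                            \
  struct demo_tpl_spec_avail<SCALAR> {   \
    enum : bool { value = true };        \
  };

// Each specialization is emitted only if the user enabled that scalar type,
// so disabled types cost neither build time nor library size.
#if defined(DEMO_INST_DOUBLE)
DEMO_TPL_SPEC_AVAIL(double)
#endif
#if defined(DEMO_INST_FLOAT)
DEMO_TPL_SPEC_AVAIL(float)
#endif

// Small query helper so callers can branch on availability.
template <class Scalar>
constexpr bool demo_uses_tpl() {
  return demo_tpl_spec_avail<Scalar>::value;
}

static_assert(demo_tpl_spec_avail<double>::value,
              "double is enabled, so the TPL path is marked available");
static_assert(!demo_tpl_spec_avail<float>::value,
              "float is disabled, so it falls back to the primary template");
```

With `DEMO_INST_FLOAT` undefined, `demo_uses_tpl<float>()` evaluates to `false` and no `float` specialization is generated. Defining the switch would flip it to `true` without touching the rest of the code.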