KokkosBlas: limited vector size in norm calculation? #9856
Comments
I was too quick to say it was a Tpetra issue. The problem seems to be in KokkosBlas. I wrote the following reproducer:
With Node::device_type == Kokkos::Cuda, this code fails with the error that @jennloe saw.
Brian says this is a bug with Kokkos' reduce; see kokkos/kokkos#4461
@kddevin @jennloe FYI kokkos/kokkos-kernels#1204 will take care of this. You can have any number of vectors on any backend, and for most of the BLAS-1 functions (dot, norms other than nrminf, sum) you'll get a ~3x speedup on Cuda compared to before.
…s:develop' (7f1c956). * trilinos-develop: Nightlies on Geminga: Fix build names Tempus: Work around Piro adjoint sensitivity test failure Tempus: Fix mistake in merge Anasazi: Add another epsilon compare that I missed Allow epsilon tolerance in MultiVecTraits norms Patch in Kokkos Kernels trilinos#1204 to fix trilinos#9856 Tempus: Reset computation of dg/dx, df/dx, df/dx_dot every integration Tempus: Add method to get response value from pseudo-transient adjoint integrator Tempus: Allow 2nd adjoint ME for pseudo-transient adjoint integrator Tempus: Add more needed methods to adjoint, psuedo-transient adjoint integrators Tempus: Add new sensitivity functions to adjoint, pseudo-transient adjoint integrators Tempus: Changes to sensitivity integrators in support of SPARC
Automatically Merged using Trilinos Pull Request AutoTester PR Title: Tpetra: reproducer for #9856 PR Author: kddevin
…Trilinos:develop' (615b98f). * trilinos/develop: (25 commits) Drivers: Changes for sems update Drivers: Changes for sems update Drivers: lightsaber update MueLu: Add parameter translation for ML Maxwell1 Tpetra: remove unused device_type typedef in functor Geminga: Set CMAKE_CXX_USE_RESPONSE_FILE_FOR_OBJECTS:BOOL=OFF for Cuda builds Geminga: Fix cron driver ShyLU_DD/FROSch: fix counting nnz Ifpack2 FastILU: Fix MV access ShyLU_DD/FROSch: fix data access without UVM Nightlies on Geminga: Fix build names ShyLU_DDD/FROSch: add access patter for building without Tpetra deprecated code Attempt to fix nightlies on geminga Tempus: Work around Piro adjoint sensitivity test failure Finish transition from sems-xyz to sems-archive-xyz (SEHELPD-2963) Tempus: Fix mistake in merge Anasazi: Add another epsilon compare that I missed Allow epsilon tolerance in MultiVecTraits norms Patch in Kokkos Kernels trilinos#1204 to fix trilinos#9856 Tempus: Reset computation of dg/dx, df/dx, df/dx_dot every integration ...
Bug Report
@trilinos/tpetra
Description
@jennloe reports:
I have this Trilinos (Belos + Tpetra) code which is crashing, and I don't understand why.
And the output is:
At first glance, the problem appears to be a mismatch between the norms argument (on host) and the vector argument (on device).
Steps to Reproduce
Reproduced with a simple test program on ascicgpu, built without UVM.