forked from QMCPACK/miniqmc
Home
rcclay edited this page Jul 13, 2018
Gotchas with Kokkos:
- Handling legacy C code takes some care. Once a C-style struct becomes a C++ struct, its constructors and destructors really should be called, so it should be allocated with "new" and "delete" instead of malloc and free. I think skipping them causes problems with reference counting when Kokkos data types are dropped into legacy C code. It may be worth looking into "kokkos_malloc" and "kokkos_free".
- Kokkos seems to require copy constructors with the same signature as the default copy constructor. Where a copy constructor is declared "= delete", change it to "= default".
- Care should definitely be taken with static class members. They are delicate in CUDA, and Kokkos reflects this. For now, move MultiBSplineData into MultiBSpline to circumvent the problem.
- All calls to C++ standard library functions inside Kokkos parallel regions should be looked at very carefully ("looked at" = purged). They do not play well with CUDA. See "std::fill" in src/Numerics/Spline2/MultiBspline.hpp:evaluate_v(...). It chugs along fine on the CPU and the GPU code compiles, but the resulting runtime error is frustratingly difficult to find.
- In einspline_spo, parallelizing evaluate_v needs access to the "psi" and "einspline" arrays. Using the einspline_spo class itself as the functor (by overloading its operator()) gives access to these views, whereas the provided KOKKOS_LAMBDA macro assumes the lambda captures by value, NOT by reference, which is what is required here.
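
The first gotcha (malloc/free versus constructors and destructors) can be sketched in plain C++ without Kokkos. This is only an illustration under assumptions: the struct name SplineData, its fields, and the helper functions are hypothetical and not taken from miniqmc, and std::shared_ptr stands in for a reference-counted Kokkos data type. It shows one way to keep a malloc-based allocation path while still running the constructor (placement new) and the destructor (explicit call), so that the reference count stays correct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <new>
#include <vector>

// Hypothetical legacy C struct that has gained a reference-counted member.
// std::shared_ptr is a stand-in for a Kokkos data type (assumption).
struct SplineData {
    int num_splines = 0;
    std::shared_ptr<std::vector<double>> coefs;
};

// Keep the malloc-based allocation, but construct the object in place so
// that the reference-counted member starts out in a valid (empty) state.
SplineData* create_spline_data(int n) {
    void* raw = std::malloc(sizeof(SplineData));
    if (raw == nullptr) return nullptr;
    SplineData* s = new (raw) SplineData();  // placement new runs the constructor
    s->num_splines = n;
    s->coefs = std::make_shared<std::vector<double>>(static_cast<std::size_t>(n), 0.0);
    return s;
}

// Run the destructor explicitly before free, so the member releases its
// reference. Calling free alone would leave the count one too high.
void destroy_spline_data(SplineData* s) {
    if (s == nullptr) return;
    s->~SplineData();
    std::free(s);
}

// Returns how many owners the coefficient storage has after the struct
// holding it has been destroyed while an outside handle is still alive.
long owners_after_destroy() {
    SplineData* s = create_spline_data(8);
    assert(s != nullptr);
    std::shared_ptr<std::vector<double>> outside = s->coefs;  // second owner
    destroy_spline_data(s);                                   // struct drops its reference
    return outside.use_count();                               // only "outside" remains
}
```

If the struct does not have to stay on a malloc path, plain "new SplineData()" and "delete" do the same job with less ceremony. Whether the same pattern is the right fix for the Kokkos types in miniqmc is untested here.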