Add hipSYCL curand support #226
Conversation
aelizaro
left a comment
Hi @nilsfriess, thank you for the PR, looks good to me! @mkrainiuk, could you please also take a look?
mkrainiuk
left a comment
Thank you for the PR, it also looks good to me. Does it make sense to also update our documentation on the compilers supported for RNG?
I think there are multiple places where the documentation is not quite up to date with regard to hipSYCL support (e.g. here and here it is not mentioned that hipSYCL can also be used for the BLAS backend with CUDA).
Good catch, we need to update them too. I'm fine with either option; if you prefer a separate PR, we can merge this PR as is.
2a17e78 to 2f79496
…ration` instead of `handler::host_task` when compiling with hipSYCL
2f79496 to 0a7db36
There weren't too many places where the documentation was not up to date, so I just added the changes to this PR.
Thank you! Documentation looks good to me. Please let me know if there is anything else you want to add, or whether we can merge it now.
Thanks, you can merge it :)
Description
This PR adds support for using the cuRAND backend with hipSYCL.
The approach is similar to, though simpler than, #144: Since hipSYCL does not support `host_task` but instead implements the extension `hipSYCL_enqueue_custom_operation`, a wrapper function `onemkl_curand_host_task` is introduced that calls either `host_task` or `hipSYCL_enqueue_custom_operation` depending on which compiler is used.
Further, calls to `curandSetStream` were added before the calls to the cuRAND generate functions. This is important since previously, the call to `wait_and_throw` after submitting the CUDA calls might not wait for the cuRAND random number generation to finish if it was started on a different stream. In fact, increasing the amount of random numbers generated in the unit tests (and thereby increasing the time spent generating numbers) made some tests fail (both when using hipSYCL and dpc++), since the result might be read before random number generation had finished. All tests now pass, even when the number of random samples to be generated is increased.
Test logs
The tests were executed on Ubuntu 20.04 using CUDA 11.6.
curand_dpcpp_test.log
curand_hipsycl_test.log
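For readers who want to see the shape of the wrapper mentioned in the Description, here is a minimal sketch of the compile-time dispatch. It is an illustration only, not the code in this PR: `mock_handler` and `curand_host_task_sketch` are hypothetical names, the callable takes no interop handle (the real one would), and the sketch assumes `__HIPSYCL__` is the macro that distinguishes hipSYCL from other compilers.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a SYCL command-group handler. It only records
// which submission path was taken; it is NOT the SYCL or oneMKL API.
struct mock_handler {
    std::vector<std::string> calls;

    template <typename F>
    void host_task(F&& f) {
        calls.push_back("host_task");
        std::forward<F>(f)();  // run the callable immediately
    }

    template <typename F>
    void hipSYCL_enqueue_custom_operation(F&& f) {
        calls.push_back("hipSYCL_enqueue_custom_operation");
        std::forward<F>(f)();  // run the callable immediately
    }
};

// Hypothetical wrapper in the spirit of onemkl_curand_host_task: a single
// call site, with the submission mechanism selected at compile time.
template <typename Handler, typename F>
void curand_host_task_sketch(Handler& cgh, F&& f) {
#ifdef __HIPSYCL__  // assumption: defined only when compiling with hipSYCL
    cgh.hipSYCL_enqueue_custom_operation(std::forward<F>(f));
#else
    cgh.host_task(std::forward<F>(f));
#endif
}
```

With a compiler that does not define `__HIPSYCL__`, a call such as `curand_host_task_sketch(cgh, [] { /* cuRAND calls */ })` takes the `host_task` path. Under hipSYCL the same call site would go through `hipSYCL_enqueue_custom_operation`, so the backend code does not need per-compiler branches at every submission.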
Checklist