Make Sapporo2 work with more device types #5
Comments
One bit of trouble on macOS is that OpenCL is deprecated there, in favour of Metal. OpenCL 1.2 still works, but newer versions are not supported. Not sure if there is a workaround for this.
I'm on Linux and having trouble with Sapporo on OpenCL; the output is:
I checked the offending line, and the kernel launch that fails has
I'm not familiar with that device, nor with what the optimum settings are, but it looks like too many blocks are launched. What you can try is to change this line to look as follows: Or change the NTHREAD values here. It might take some trial and error to get that right for your device.
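To make the trial and error above a bit more systematic, here is a minimal sketch of clamping a requested launch to device limits before launching. It is not Sapporo2 code: `LaunchConfig`, `clamp_launch` and all parameter names are hypothetical. It assumes the per-group thread limit comes from the device (in OpenCL, for example, by querying `CL_DEVICE_MAX_WORK_GROUP_SIZE` with `clGetDeviceInfo`) and that the block cap is a tuning value chosen by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper type, not part of Sapporo2.
struct LaunchConfig {
    std::size_t threads_per_block;  // local work size (threads per work-group)
    std::size_t num_blocks;         // number of blocks / work-groups to launch
};

// Clamp a requested launch so that it stays within the given limits.
// max_threads_per_block: assumed to be queried from the device.
// max_blocks: assumed to be a tuning cap picked by the caller.
inline LaunchConfig clamp_launch(std::size_t n_items,
                                 std::size_t requested_threads,
                                 std::size_t max_threads_per_block,
                                 std::size_t max_blocks) {
    assert(requested_threads > 0);
    assert(max_threads_per_block > 0);
    assert(max_blocks > 0);

    // Never ask for more threads per block than the device allows.
    const std::size_t threads =
        std::min(requested_threads, max_threads_per_block);

    // Blocks needed to cover every item once, rounded up.
    const std::size_t needed = (n_items + threads - 1) / threads;

    // Launch at least one block and at most max_blocks.
    const std::size_t blocks =
        std::min(std::max(needed, std::size_t{1}), max_blocks);

    return LaunchConfig{threads, blocks};
}
```

If the block count ends up capped below what is needed, each thread has to loop over more than one item inside the kernel, so this only helps when the kernel is written that way.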
This issue seems increasingly relevant, with GPUs other than Nvidia-built ones becoming more prominent (e.g. Apple's M1 series processors).
Renamed the issue - I think adding Vulkan support would be a great goal, since it is the most widely supported GPU API (it also runs on macOS via MoltenVK, which translates it to Metal).
Maybe SYCL is the way to go these days? https://sycl.tech
It is for sure if Sapporo is to take advantage of upcoming Intel HPC GPUs. There's also SYCLomatic, which is supposed to help with converting CUDA to SYCL, but I bet it won't be too easy for codes like Sapporo that (if I remember correctly) use the CUDA driver (as opposed to runtime) API.
Probably not easy, no. But I think it's essential if we want to use Sapporo in the future.
I was discussing migrating Sapporo to SYCL with Kentaro Nomura (now at Intel, formerly at RIKEN); perhaps he can help us with this.
AMD now has HIP, which is essentially a clone of the CUDA API backed by either CUDA (if you have NVIDIA hardware) or ROCm (if you have AMD). It is supposedly easy to port to, but support for other platforms is an open question. Kokkos also looks interesting. It takes a pure C++ approach and has a variety of backends, although I can't find one for Metal. It apparently gives you less low-level control than SYCL, but with the resources we have, that's probably fine. It also involves writing modern C++, which is a good idea but may require some learning. Still doesn't look like there's a clear winner...
Tracker issue - this currently doesn't always seem to work (at least on macOS, for which PR #4 is an initial fix). It would be nice if it did.
I will report on progress / problems here.