Convert from HIP to hipper #1063
Comments
When we do this, we should make sure to look at the CMake setup too. Right now, there is a lot of complicated code needed to trick CMake into using HIP as CUDA. It would be nice if we could take this path only when HIP is actually being used for the compilation.
Yes. We may also need to refactor the HIP/CUDA CMake code considerably for this issue and #1101. This code was written for very early ROCm/HIP tools and has not been maintained over the years. I don't have access to an AMD GPU system for testing yet. OLCF's test system is not yet open to INCITE projects, and NCSA Delta (which will have only one AMD GPU node) has been delayed again. If these are still not available when I start working on this, I will do the conversion testing only on CUDA and then make the changes needed for AMD support later.
I support this. I also do not have an AMD system to test on, and ROCm/HIP has been unstable, at least in the past.
Revisiting this: it will be a significant effort to port HOOMD to hipper, and it may require updates to hipper itself. I need to look into this more before proceeding. However, the alternative is to continue using only HIP. Current versions of HIP are no longer header-only and require a build and install step. I find AMD's compilation documentation severely lacking; I cannot expect the majority of HOOMD users to follow it. Additionally, HIP has not been updated on conda-forge for several years. This alternative would therefore require that we learn how to build and install HIP, document it for our users, and maintain a conda-forge package.
I admittedly have not been trying to keep hipper up to date because (1) the features we actually use are pretty minimal and (2) I don't have any AMD GPUs for testing. We could be more active about this if we want to pursue using it throughout HOOMD. If a conversion is going to be made, have you given any thought to Intel oneAPI? I haven't tried it, so I'm not sure what use/performance is actually like. That is probably even more work, though, I would imagine. Using only HIP is a little more palatable now that the CMake build system is fixed, but I agree that the documentation is generally very poor.
I also have not tried oneAPI, but I have spoken with people who have. It is a vastly different programming model from CUDA/HIP. In addition to rewrites of all kernels, it would require a complete overhaul of the memory management system, as oneAPI requires the use of its provided memory management classes. I got conflicting answers on whether there was any possibility of oneAPI/CUDA interoperability. One knowledgeable individual indicated that it was not possible at all, which would require us to do a complete port or none at all. oneAPI also has no support for zero-copy interoperability with Python at this time, which is one of the most popular features of v3.

For Intel GPU support, there is a third-party package that implements HIP on Intel, and the large DOE centers have an interest in supporting projects like that. If oneAPI gains traction in the long run and replaces HIP in the community, we will need to consider a port then. At present, a oneAPI port would require a massive time investment and would remove functionality.

Switching to hipper in the meantime will be time consuming, but not unduly so. If I convert to hipper, I can remove the outdated HIP and hipCUB submodules. It is only a matter of time before one of those runs into compatibility issues with a new CUDA version.
OK! That all makes sense. oneAPI would be an enormous amount of work, then, and it's unclear how much traction it will have. I favor the hipper approach, since it will allow true CUDA builds without any dependencies we have to maintain. It also means we don't have to teach users how to compile HIP.
Please let me know if you need me to test any code on AMD.
While SYCL (oneAPI) was originally designed around its own memory management model (the buffer/accessor model), SYCL 2020 has full support for USM, meaning memory management in SYCL can be identical to CUDA/HIP (using malloc_device, memcpy, etc.). There are tools to automatically port CUDA code to SYCL; see https://www.intel.com/content/www/us/en/developer/articles/technical/syclomatic-new-cuda-to-sycl-code-migration-tool.html. SYCL/CUDA interoperability is fully supported. Performance of SYCL vs. native APIs is extremely competitive; perhaps @zjin-lcf can comment on the latest oneAPI vs. HIP performance. SYCL also gives the advantage that single-source SYCL code can be compiled to run on CUDA, HIP, and Intel platforms (including OpenCL CPU platforms).
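To illustrate the USM point above: here is a minimal, self-contained SYCL 2020 sketch where allocation and explicit copies mirror the CUDA/HIP style. Nothing HOOMD-specific is assumed; device selection and error handling are omitted for brevity.

```cpp
#include <sycl/sycl.hpp>  // SYCL 2020 header
#include <vector>

int main() {
    sycl::queue q;  // queue on the default device
    const size_t n = 1024;
    std::vector<float> host(n, 1.0f);

    // USM device allocation, analogous to cudaMalloc/hipMalloc
    float* dev = sycl::malloc_device<float>(n, q);

    // explicit copy, analogous to cudaMemcpy/hipMemcpy
    q.memcpy(dev, host.data(), n * sizeof(float)).wait();

    // simple kernel operating directly on the raw USM pointer
    q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        dev[i] *= 2.0f;
    }).wait();

    q.memcpy(host.data(), dev, n * sizeof(float)).wait();
    sycl::free(dev, q);
    return 0;
}
```

With USM, kernels capture raw pointers rather than accessors, which is the property that would let a CUDA/HIP-style memory management layer carry over largely unchanged.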
I spoke with Teja Alaghari in September 2022 and discussed the possibility of Intel developers providing a prototype port. I have not seen nor heard any progress on this since. I am open to pull requests that add SYCL as an alternate code path to begin exploring the possibilities; if anyone does so, please base the work on the latest development branch.

However, I am not interested in fully converting HOOMD-blue to SYCL at this time. The complete rewrite would require a massive amount of effort in porting and testing, while there are currently no national HPC centers with Intel GPUs, the longevity and stability of SYCL is unknown, there is no zero-copy interface to interact with SYCL memory buffers from Python, SYCL is not available on the conda-forge build system, and users on currently supported platforms would need to install additional dependencies to build and/or use HOOMD-blue. In other words, I do not have the free time available to invest in such a port. Even if I did, doing so would remove popular features and require users to make drastic changes in order to continue using HOOMD-blue.
I'd like to bring your comments forward as a feature request. Can you please explain "no zero-copy interface to interact with SYCL memory buffers in Python"? Thank you for your comments about SYCL.
CuPy (https://cupy.dev/) provides the __cuda_array_interface__ protocol, which is what allows zero-copy access to HOOMD-blue's GPU arrays from Python. There is no equivalent for SYCL memory buffers.
There are some Python projects that sit on top of the SYCL runtime and plugins. See https://github.com/IntelPython/dpctl and https://intelpython.github.io/dpnp/index.html. Both seem to be available on conda-forge. I am not sure if they have this zero-copy feature that you use, but I can investigate. Note that GROMACS has chosen to use SYCL over HIP as the API to target AMD GPUs.
I see that dpctl and dpnp are indeed available on conda-forge. The GROMACS and NAMD developers are free to do as they choose. They both have a much larger group of developers than HOOMD-blue, thus I would presume that they are more willing and able to continually rewrite their code. Until a plurality of HPC systems with Intel GPUs are available and all HOOMD-blue features can be supported via SYCL, there is no reason to waste our limited effort on a complete port.
Description
Replace all HIP calls with hipper calls. Continues #427.

Motivation and context

HIP is complex, out of our control, gets in the way, and often breaks things. There is no need to use it for CUDA builds. hipper is a thinner translation layer developed by @mphoward that works around these issues: it uses CUDA more directly than HIP for CUDA builds and falls back to HIP for AMD builds.
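To make the approach concrete, here is a minimal sketch of how a translation header of this kind can work. The gpu* names and the ENABLE_HIP flag are hypothetical placeholders for illustration only, not hipper's actual API:

```cpp
// gpu_runtime.h -- hypothetical translation header (illustrative sketch,
// not hipper's actual API). A common gpu* spelling resolves directly to
// the CUDA runtime on NVIDIA builds and falls back to HIP on AMD builds.
#pragma once

#ifdef ENABLE_HIP  // hypothetical build flag for AMD builds
#include <hip/hip_runtime.h>
typedef hipError_t gpuError_t;
#define gpuSuccess hipSuccess
#define gpuMalloc hipMalloc
#define gpuMemcpy hipMemcpy
#define gpuMemcpyHostToDevice hipMemcpyHostToDevice
#define gpuFree hipFree
#else  // CUDA build: no HIP dependency at all
#include <cuda_runtime.h>
typedef cudaError_t gpuError_t;
#define gpuSuccess cudaSuccess
#define gpuMalloc cudaMalloc
#define gpuMemcpy cudaMemcpy
#define gpuMemcpyHostToDevice cudaMemcpyHostToDevice
#define gpuFree cudaFree
#endif
```

Host and kernel code then call the common gpu* spellings, and a pure CUDA build never pulls in any HIP headers at all, which is exactly the property this issue is after.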