
Convert from HIP to hipper #1063

Open
joaander opened this issue Jul 22, 2021 · 14 comments
Labels
- breaking: Changes that will break API.
- complex: A particularly complex or large project that involves significant amount of effort.
- essential: Work that must be completed.
- refactor: Refactoring existing code.
- task: Something needs to be done.

Comments

@joaander (Member)

Description

Replace all HIP calls with hipper calls. Continues #427.

Motivation and context

HIP is complex, out of our control, gets in the way, and often breaks things. There is no need to use it for CUDA builds. hipper is a thinner translation layer that @mphoward developed that works around these issues by using CUDA more directly than HIP for CUDA builds and falls back to HIP for AMD builds.
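
For illustration, here is a minimal sketch of the translation-layer pattern. The `HIPPER_CUDA_PLATFORM` flag and `hipper*` names below are illustrative, not necessarily hipper's actual API; the point is that a CUDA build forwards straight to the CUDA runtime, and only an AMD build goes through HIP.

```cpp
// Sketch of a thin CUDA/HIP translation layer. Names (HIPPER_CUDA_PLATFORM,
// hipperMalloc, ...) are illustrative, not necessarily hipper's real API.
#include <cstddef>

#if defined(HIPPER_CUDA_PLATFORM)
#include <cuda_runtime.h>
typedef cudaError_t hipperError_t;
// On NVIDIA hardware, forward directly to the CUDA runtime; no HIP involved.
inline hipperError_t hipperMalloc(void** ptr, size_t size) {
    return cudaMalloc(ptr, size);
}
#else
#include <hip/hip_runtime.h>
typedef hipError_t hipperError_t;
// On AMD hardware, fall back to HIP.
inline hipperError_t hipperMalloc(void** ptr, size_t size) {
    return hipMalloc(ptr, size);
}
#endif
```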

@joaander added the refactor, task, and complex labels on Jul 22, 2021
@joaander added this to the future milestone on Jan 13, 2022
@joaander added the essential label on Feb 4, 2022
@joaander removed this from the future milestone on Mar 8, 2022
@mphoward (Collaborator) commented Jun 2, 2022

When we do this, we should make sure to look at the CMake setup too. Right now, there is a lot of complicated code needed to trick CMake into using HIP as CUDA. It would be nice if we could only take this path when HIP is being used to do the compilation.

@joaander (Member, Author) commented Jun 2, 2022

> When we do this, we should make sure to look at the CMake setup too. Right now, there is a lot of complicated code needed to trick CMake into using HIP as CUDA. It would be nice if we could only take this path when HIP is being used to do the compilation.

Yes. We may also need to refactor the HIP/CUDA CMake code considerably for this issue and #1101. This code was written for very early ROCm/HIP tools and has not been maintained over the years.

I don't have access to an AMD GPU system for testing yet. OLCF's test system is not yet open to INCITE projects, and NCSA Delta (which will have only one AMD GPU node) has been delayed again. If these are still not available when I start working on this, I will make the conversion, test only on CUDA, and then make the changes needed for AMD support later.

@mphoward (Collaborator) commented Jun 2, 2022

> I will make the conversion, test only on CUDA, and then make the changes needed for AMD support later.

I support this. I also do not have an AMD system to test on, and ROCm/HIP has been unstable, at least in the past.

@joaander added the breaking label on Aug 12, 2022
@joaander (Member, Author)

Revisiting this: It will be a significant effort to port HOOMD to hipper and may require updates to hipper itself. I need to look more into this before proceeding.

However, the alternative is to continue using only HIP. Current versions of HIP are no longer header-only and require a build and install step. I find AMD's compilation documentation severely lacking, and I cannot expect the majority of HOOMD users to follow it. Additionally, HIP has not been updated on conda-forge for several years. This alternative would therefore require that we learn how to build and install HIP, document it for our users, and maintain a conda-forge package.

@mphoward (Collaborator)

I admittedly have not been trying to keep hipper up to date because (1) the features we are actually using are pretty minimal and (2) I don’t have any AMD GPUs for testing. We could be more active in this if we want to pursue using it throughout HOOMD.

If a conversion is going to be made, have you given any thought to Intel oneAPI? I haven't tried it, so I'm not sure what usability or performance is actually like. That is probably even more work, though, I would imagine.

Using only HIP is a little more palatable now that the CMake build system is fixed, but I agree that the documentation is generally very poor.

@joaander (Member, Author)

I also have not tried oneAPI, but I have spoken with people who have. It is a vastly different programming model from CUDA/HIP. In addition to rewrites of all kernels, it would require a complete overhaul of the memory management system, as oneAPI requires the use of its provided memory management classes. I got conflicting answers on whether there was any possibility of oneAPI/CUDA interoperability. One knowledgeable individual indicated that it was not possible at all, requiring us to do a complete port or none at all.

oneAPI also has no support for zero-copy interoperability with Python at this time, which is one of the most popular features of v3.

For Intel GPU support, there is a third-party package that implements HIP on Intel. The large DOE centers have an interest in supporting projects like that.

If oneAPI gains traction in the long run and replaces HIP in the community, we will need to consider a port then. At present, a oneAPI port would require a massive time investment and would remove functionality.

Switching to hipper in the meantime will be time-consuming, but not unduly so. If I convert to hipper, I can remove the outdated HIP and hipCUB submodules. It is only a matter of time before one of those runs into compatibility issues with a new CUDA version.

@mphoward (Collaborator)

OK! That all makes sense. oneAPI would be an enormous amount of work then, and it’s unclear how much traction it will have.

I favor the hipper approach, then, since it will allow true CUDA builds without any dependencies we have to maintain. It also means we don't have to teach users how to compile HIP.

@jglaser (Contributor) commented Nov 30, 2022

Please let me know if you need me to test any code on AMD.

@hdelan commented Feb 9, 2023

While SYCL (oneAPI) was originally designed around its own memory management model (the buffer/accessor model), SYCL 2020 has full support for USM, meaning memory management in SYCL can be identical to CUDA/HIP (using malloc_device, memcpy, etc.), as sketched below.
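
For concreteness, here is a minimal USM sketch using only standard SYCL 2020 API; an in-order queue is used so operations serialize like work submitted to a CUDA stream.

```cpp
// SYCL 2020 unified shared memory (USM): malloc/memcpy-style management,
// closely mirroring cudaMalloc/cudaMemcpy. Standard SYCL 2020 API.
#include <sycl/sycl.hpp>
#include <vector>

int main() {
    // In-order queue so operations serialize like work on a CUDA stream.
    sycl::queue q{sycl::property::queue::in_order{}};
    const size_t n = 1024;
    std::vector<float> host(n, 1.0f);

    float* dev = sycl::malloc_device<float>(n, q);            // ~cudaMalloc
    q.memcpy(dev, host.data(), n * sizeof(float));            // ~cudaMemcpy H2D
    q.parallel_for(sycl::range<1>(n),
                   [=](sycl::id<1> i) { dev[i] *= 2.0f; });   // device kernel
    q.memcpy(host.data(), dev, n * sizeof(float)).wait();     // ~cudaMemcpy D2H
    sycl::free(dev, q);                                       // ~cudaFree
    return 0;
}
```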

There are tools to automatically port CUDA code to SYCL, see https://www.intel.com/content/www/us/en/developer/articles/technical/syclomatic-new-cuda-to-sycl-code-migration-tool.html

SYCL/CUDA interoperability is fully supported through the use of host_task. See here for more: https://github.com/codeplaysoftware/SYCL-For-CUDA-Examples/tree/master/examples/cuda_interop
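
A hedged sketch of that interop pattern, based on the linked examples: the `ext_oneapi_cuda` backend enum and the native handle type are specific to DPC++'s CUDA backend and may differ across compiler versions.

```cpp
// SYCL/CUDA interop via host_task: recover the native CUDA stream backing a
// SYCL queue and hand it to a CUDA library (cuBLAS here). Backend names follow
// DPC++'s CUDA backend and may vary by compiler version.
#include <sycl/sycl.hpp>
#include <cublas_v2.h>

void scale_with_cublas(sycl::queue& q, float* dev_x, int n, float alpha) {
    q.submit([&](sycl::handler& cgh) {
        cgh.host_task([=](sycl::interop_handle ih) {
            // Native CUDA stream for this queue (DPC++ CUDA backend).
            auto stream = ih.get_native_queue<sycl::backend::ext_oneapi_cuda>();
            cublasHandle_t handle;
            cublasCreate(&handle);
            cublasSetStream(handle, reinterpret_cast<cudaStream_t>(stream));
            cublasSscal(handle, n, &alpha, dev_x, 1);  // ordinary CUDA library call
            cublasDestroy(handle);
        });
    });
}
```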

Performance of SYCL vs. native APIs is extremely competitive; perhaps @zjin-lcf can comment on the latest oneAPI vs. HIP performance. SYCL also gives the advantage that single-source SYCL code can be compiled to run on CUDA, HIP, and Intel platforms (including OpenCL CPU platforms).

@joaander (Member, Author) commented Feb 9, 2023

I spoke with Teja Alaghari in September 2022 and discussed the possibility of Intel developers providing a prototype port. I have not seen nor heard any progress on this. I am open to pull requests that add SYCL as an alternate code path to begin exploring the possibilities. If anyone does so, please base work on the trunk-major branch.

However, I am not interested in fully converting HOOMD-blue to SYCL at this time. The complete rewrite would require a massive amount of effort in porting and testing, while:

- there are currently no national HPC centers with Intel GPUs,
- the longevity and stability of SYCL is unknown,
- there is no zero-copy interface to interact with SYCL memory buffers in Python,
- SYCL is not available on the conda-forge build system, and
- users on currently supported platforms would need to install additional dependencies to build and/or use HOOMD-blue.

In other words, I do not have the free time available to invest in such a port. Even if I did, doing so would remove popular features and require users to make drastic changes in order to continue using HOOMD-blue.

@zjin-lcf commented Feb 9, 2023

@joaander

I'd like to bring your comments forward as feature requests. Can you please explain "no zero-copy interface to interact with SYCL memory buffers in Python"? Thank you for your comments about SYCL.

@joaander (Member, Author) commented Feb 9, 2023

CuPy (https://cupy.dev/) provides the __cuda_array_interface__ which allows Python C extensions to directly access the GPU memory buffers without copying the data. We use this to provide users with direct access to particle and force data (e.g. https://hoomd-blue.readthedocs.io/en/v3.8.1/module-hoomd-data.html#hoomd.data.LocalSnapshotGPU) so that they can write Python extensions that customize their simulations with minimal overhead. This is popular because users prefer to write Python code over a compiled C++ extension.
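
For reference, here is a minimal pybind11 sketch (a hypothetical helper, with error handling and dtype/stride parsing omitted) of how a C++ extension can pull the raw device pointer out of `__cuda_array_interface__` without copying:

```cpp
// Read the device pointer from an object implementing the
// __cuda_array_interface__ protocol (e.g. a CuPy array). Hypothetical helper;
// dtype/stride/stream handling omitted for brevity.
#include <pybind11/pybind11.h>
#include <cstdint>

namespace py = pybind11;

void* device_pointer(const py::object& array) {
    py::object iface = array.attr("__cuda_array_interface__");
    // Per the protocol, 'data' is a (pointer, read_only_flag) tuple.
    py::tuple data = iface["data"].cast<py::tuple>();
    return reinterpret_cast<void*>(data[0].cast<std::uintptr_t>());
}
```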

@hdelan commented Feb 9, 2023

There are some Python projects that sit on the SYCL runtime and plugins. See https://github.com/IntelPython/dpctl and https://intelpython.github.io/dpnp/index.html . Both seem to be available on conda-forge. I am not sure if they have this zero-copy feature that you use, but I can investigate.

Note that GROMACS has chosen SYCL over HIP as the API for targeting AMD GPUs.

@joaander (Member, Author) commented Feb 9, 2023

I see that dpctl and dpnp are available via the intel and anaconda channels.
https://anaconda.org/search?q=dpnp
https://anaconda.org/search?q=dpctl

Even though these are accessed with the conda package manager, these channels are not the same as the community driven conda-forge project where I distribute HOOMD-blue: https://conda-forge.org/ . The conda-forge ecosystem supports CUDA, but not HIP and not SYCL: https://conda-forge.org/docs/maintainer/knowledge_base.html#cuda-builds.

The GROMACS and NAMD developers are free to do as they choose. They both have a much larger group of developers than HOOMD-blue, so I would presume that they are more willing and able to continually rewrite their code. Until a plurality of HPC systems with Intel GPUs are available and 100% of HOOMD-blue features can be supported via SYCL, there is no reason to waste our limited effort on a complete port.
