Skip to content

Latest commit

 

History

History
141 lines (108 loc) · 6.11 KB

README.md

File metadata and controls

141 lines (108 loc) · 6.11 KB

Hippo

A Julia Set fractal generated with HIP

Minimal HIP Example

  • IMPORTANT: requires at least nim 2.1.9 for gpu usage, and must compile with nim cpp
    • nim 2.1.9 added --cc:nvcc and --cc:hipcc
    • nim cpp is required for both HIP and CUDA.
import hippo

proc addKernel*(a, b: cint; c: ptr[cint]) {.hippoGlobal.} =
  c[] = a + b

var
  c: int32
  dev_c = hippoMalloc(sizeof(int32))
hippoLaunchKernel(addKernel,args = (2,7,dev_c.p))
hippoMemcpy(addr c, dev_c, sizeof(int32), hipMemcpyDeviceToHost)
echo "2 + 7 = ", c

Functions

  • There are 3 sets of function prefixes.
  • hippo* prefixed functions are friendly nim interfaces for either HIP or CUDA
    • This is the recommended way to use this library, as it is the most nim-like
    • These functions check for errors and raise them as exceptions
  • hip* prefixed functions are the raw HIP C++ functions
  • cuda* prefixed functions are the raw CUDA C functions

Pragmas

  • all pragmas are prefixed with hippo and can be used for hip or cuda to assign attributes like __global__ or __shared__

basic kernel example:

proc add(a,b: int; c: ptr[int]): {.hippoGlobal.} =
  c[] = a + b

Notes

  • work in progress!

  • very basic kernels work that use block/thread indices

  • shared memory works

  • there are examples for several different targets in the examples directory.

  • building for HIP requires that hipcc is in your PATH

  • vector_sum_cuda.nims example .nims for building with hipcc

  • vector_sum_hippo.nims example .nims for building with nvcc for cuda

  • vector_sum_hip_nvidia.nims example .nims for building with hipcc with the cuda backend

    • tell nim that you want the 'nvcc' compiler settings to get the right arguments, but swap out the compiler executable for hipcc
    • you might also need to have HIP_PLATFORM=nvidia set in your environment
  • vector_sum_cpu.nims example .nims for building with the HIP-CPU backend (no hipcc required)

  • HIP-CPU is supported with the -d:HippoRuntime=HIP_CPU flag

    • note: you need to pull the HIP-CPU git submodule to use this feature
    • does not require hipcc. works with gcc.
    • you can write kernels and test them on cpu with echos and breakpoints and everything!
  • hippo requires that you use cpp. it will not work with c

    • nim cpp -r tests/hip/vector_sum.nim
    • The HIP Runtime is a C++ library, and so is the CUDA Runtime. As such, we can't do anything about this.
    • You can make separate projects for the C and C++ parts and link them together. (I have not tested this yet, would love examples if you do this!)

Motivation

  • I want GPU compute (HIP / CUDA) to be a first-class citizen in Nim.

  • This library is built around using HIP, which supports compiling for both AMD and NVIDIA GPUs.

    • for CUDA, hipcc is basically a wrapper around nvcc.
  • examples for each platform can be found in the examples directory

    • assuming the hipcc compiler is in your PATH, you can run the example with nim cpp -r vector_sum.nim

Compiling

  • by default, hipcc will build for the GPU detected in your system.

  • If you need to build for various GPUs, you can use --passC:"--offload-arch=gfx1100" to specify the GPU target

    • for example, to build for a 7900 xtx, you would use --passC:"--offload-arch=gfx1100"
    • for a radeon w7500, you would use --passC:"--offload-arch=gfx1102"
    • for a geforce 1650 ti, you would use --passC:"--gpu-architecture=sm_75"
      • You also have to set the HIP_PLATFORM=nvidia environment variable to build for nvidia GPUs if you don't have an nvidia GPU in your system
      • hipcc will pick nvcc by default if you have nvcc but do not have the amd rocm stack installed
  • If you want to build for CPU with -d:"HippoRuntime:HIP_CPU", you must have intel tbb libraries installed. yum install -y tbb-devel

Required flags for HIP

  • You must compile with nim cpp for c++
  • Please refer to the examples for the nim compiler settings needed for each target

Optional flags

  • -d:HippoRuntime=HIP (default) (requires hipcc)
    • If you want to build HIP for nvidia, you might need to set the environment variable HIP_PLATFORM=nvidia as well
  • -d:HippoRuntime=HIP_CPU for cpu-only usage (does not require hipcc)
    • you must pull the HIP-CPU submodule to use this feature
  • -d:HippoRuntime=CUDA (requires nvcc)

Testing

  • nimble test to run CPU-only test with HIP_CPU
    • This is what is currently ran on Github CI
  • nimble test_amd to compile with hipcc and run on your AMD GPU
  • nimble test_cuda to compile with nvcc and run on your NVIDIA GPU

Feature todo list

  • support c++ attributes like __global__, __device__
  • support kernel execution syntax <<<1,1>>> somehow
  • figure out how to handle built-ins like block/thread indices
  • Add a compiler option to use HIP-CPU to run HIP code on the CPU
    • will be useful for debugging and testing
    • also useful for running on systems without a GPU
  • helper functions to make hip/cuda calls more nim-y
  • setup automated testing build for HIP
  • setup automated testing with hip-cpu
  • support __shared__ w/ a dot product test
  • add pictures to project
  • Ensure that every example from the book "CUDA by Example" can be run with this library
    • chapter 4 vector + julia set
    • chapter 5 shared memory + thread syncing dot product
    • chapter 6 constant memory + events
    • chapter 7 texture memory (this API changed in recent cuda, might skip)
    • chapter 8 graphics interoperability with boxy
    • chapter 9 atomics
    • chapter 10 streams
    • chapter 11 multi-gpu
    • Advanced Atomics
  • setup CI runners for both nvidia & amd GPUs for testing binaries

Stretch goals

  • Test with chipStar verify it can run with openCL
  • Figure out some way to get this to work with WEBGPU