
DALES is currently being ported to GPUs with OpenACC in a collaboration with the Netherlands eScience Center. This page provides some instructions on compiling DALES for GPUs. Keep in mind that only a subset of DALES has been ported so far, so not all cases work.

Source code

A reasonably stable version of the ported code can be found in the dev branch.

Requirements

  • Nvidia GPU
  • CMake
  • Nvidia HPC SDK version 22.7 or later
  • NetCDF-Fortran, which generally has to be compiled with Nvidia's nvfortran compiler

The Nvidia HPC SDK comes bundled with both OpenMPI 3 and OpenMPI 4; version 3 is used by default. It is recommended to switch to version 4 by replacing the symlink:

# Adjust the path to match your HPC SDK installation directory and release year
cd <install-dir>/nvidia/hpc_sdk/Linux_x86_64/2023/comm_libs
rm -rf mpi
ln -s openmpi4/openmpi-4.0.5 mpi
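
To verify that the bundled MPI now resolves to OpenMPI 4 (assuming the SDK's compilers and MPI wrappers are on your PATH):

# Should report something like "mpirun (Open MPI) 4.0.5"
mpirun --version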

Once the Nvidia compilers are on your PATH, set the SYST environment variable to NV-OpenACC and compile with CMake as usual.
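
For a generic build outside Snellius, this amounts to something like the following sketch, assuming nvfortran, MPI and NetCDF-Fortran are already available in your environment:

export SYST=NV-OpenACC
mkdir -p build && cd build
cmake ..
make -j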

Compilation on SURF Snellius

NetCDF-Fortran

First, compile NetCDF-Fortran with nvfortran. You can use the following EasyBuild script:

name = 'netCDF-Fortran'
version = '4.6.1'

homepage = 'https://www.unidata.ucar.edu/software/netcdf/'
description = """NetCDF (network Common Data Form) is a set of software libraries
 and machine-independent data formats that support the creation, access, and sharing of array-oriented
 scientific data."""

toolchain = {'name': 'NVHPC', 'version': '22.7-CUDA-11.7.0'}
toolchainopts = {'pic': True}

source_urls = ['https://github.com/Unidata/netcdf-fortran/archive/']
sources = ['v%(version)s.tar.gz']
checksums = ['40b534e0c81b853081c67ccde095367bd8a5eead2ee883431331674e7aa9509f']

builddependencies = [
    ('M4', '1.4.19'),
]

dependencies = [
    ('netCDF', '4.9.0', '', ('gompi', '2022a')),
    ('bzip2', '1.0.8'),
]

# (too) parallel build fails, but single-core build is fairly quick anyway (~1min)
parallel = 1

moduleclass = 'data'

Save the script above as netCDF-Fortran-4.6.1-nvompi-2022.07.eb, then install it:

eblocalinstall netCDF-Fortran-4.6.1-nvompi-2022.07.eb

You can then load it with the module command.
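
For example, assuming EasyBuild generates a module name matching the toolchain above (the same name is used in the build and job scripts below):

module load netCDF-Fortran/4.6.1-NVHPC-22.7-CUDA-11.7.0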

Compiling DALES

The nvfortran compiler can detect your GPU and select the correct compute capability automatically, but a GPU has to be present at compile time for this to work. If you have access to GPU resources on Snellius, you can log in to the gcn1 node and compile there. Alternatively, you can set the compute capability manually in CMakeLists.txt with the -gpu flag; for the Nvidia A100, use -gpu=cc80.
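
If you are unsure which compute capability your GPU has, one way to check is the nvaccelinfo utility that ships with the HPC SDK, run on a node that actually has the GPU:

# The default compute capability is reported on the "Default Target" line,
# e.g. "Default Target: cc80" for an A100.
nvaccelinfo | grep -i target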

To build the executable, execute the following script in the DALES root directory:

#!/bin/bash
module load 2022 
module load foss/2022a 
module load NVHPC/22.7 
module load netCDF-Fortran/4.6.1-NVHPC-22.7-CUDA-11.7.0 
module load CMake/3.23.1-GCCcore-11.3.0

export NVHPC_HOME=${EBROOTNVHPC}/Linux_x86_64/2022
export LD_LIBRARY_PATH=${NVHPC_HOME}/math_libs/lib64:${NVHPC_HOME}/cuda/lib64:$LD_LIBRARY_PATH
source ${NVHPC_HOME}/comm_libs/hpcx/latest/hpcx-init.sh
hpcx_load
export OMPI_MCA_coll_hcoll_enable=0

mkdir -p build
cd build
export SYST=NV-OpenACC
cmake ..
# Optionally, if you want to profile with NSIGHT Systems:
# cmake .. -DUSE_NVTX=True
make -j

Submitting a job

Below is a sample job script for running the included BOMEX case. First, make sure you select the cuFFT-based Poisson solver by adding the following to your namoptions file:

&SOLVER
solver_id = 200
/

You also need to enable the fast thermodynamics option introduced in commit 51f183:

&PHYSICS
lfast_thermo = .true.
/

Then, you can use the following as a template job script:

#!/bin/bash
#SBATCH --time=01:00:00
#SBATCH --partition=gpu
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=18
#SBATCH --gpus=1

module load 2022
module load foss/2022a
module load NVHPC/22.7
module load netCDF/4.9.0-gompi-2022a
module load netCDF-Fortran/4.6.1-NVHPC-22.7-CUDA-11.7.0

export NVHPC_HOME=${EBROOTNVHPC}/Linux_x86_64/2022
export LD_LIBRARY_PATH=${NVHPC_HOME}/math_libs/lib64:${NVHPC_HOME}/cuda/lib64:$LD_LIBRARY_PATH

source ${NVHPC_HOME}/comm_libs/hpcx/latest/hpcx-init.sh
hpcx_load

export OMPI_MCA_coll_hcoll_enable=0

# Adjust these paths to your own DALES build and case directories
export DALES=/home/cjungbacker/dales/build/dp/src/dales4.4
export CASE=/home/cjungbacker/dales/cases/bomex

cp $CASE/{namoptions.001,lscale.inp.001,prof.inp.001} "$TMPDIR"
cd "$TMPDIR"
srun --mpi=pmix $DALES namoptions.001 | tee output.txt
cp "$TMPDIR"/output.txt /home/cjungbacker/output.txt