AOMP Release 15.0-0

@estewart08 estewart08 released this 04 Apr 14:20

These are the release notes for AOMP 15.0-0. This release is built from the "amd-stg-open" branch, which carries AMD's modifications to the LLVM development trunk. The branch is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges the upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and of the supporting repositories needed to build its various components.

For AOMP 15.0, the last trunk commit is 6ec79a15cbe9539faf121b5ad39f195dc611fc09 on Mar 29, 2022. This is the first AOMP release for LLVM 15 development. The last amd-only commit is 7eb00e23dd0bd034c4b502a4a99e32b49ac010eb on Mar 26. This forms a frozen branch now called "aomp-15.0-0". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-0

AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non-llvm-project components for this release were built with ROCm 5.0.x sources.

The changes from 14.0-3 to 15.0-0 include:

  • New development infrastructure supports source builds of supplemental components.
    There are two types of supplemental components: prerequisite and post-build. All supplemental components are built and installed in subdirectories of the directory specified by the AOMP_SUPP environment variable, which defaults to $HOME/local.
    Supplemental components are not included in the aomp installation package because they are for development, build, and test only. Prerequisite components are cmake, hwloc, and rocmsmilib. These are built with the build_prereq.sh script. Post-build components are for testing and require the AOMP or ROCm compiler to be installed. The current post-build supplemental components are openmpi, hdf5, silo, and fftw; they are built with the build_supp.sh script. build_prereq.sh is a symbolic link to build_supp.sh. For each component, the script fetches the source, builds the component, and then installs it.

  • Enhanced support for CU masking in the openmp_set_cu_mask wedge script. This now supports multiple devices and no longer requires that the number of CUs be a multiple of the number of ranks. If the total number of CUs is not a multiple of the number of ranks, appropriate controls ensure each rank gets an equal set of CUs on some device.

  • Added new scripts to build and run the GenASiS and GESTS applications. These scripts scan for the dependent libraries installed by build_supp.sh and use them to build and run the applications.
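
The CU masking behavior described above can be illustrated with a small Python sketch. This is not the logic of openmp_set_cu_mask itself; the function names, the round-robin placement of ranks on devices, and the bitmask layout are all assumptions chosen for illustration. It only shows one way to give every rank an equally sized, non-overlapping set of CUs when the CU count is not a multiple of the rank count.

```python
def partition_cus(num_devices, cus_per_device, num_ranks):
    """Hypothetical helper: return one (device, first_cu, cu_count) per rank."""
    if num_devices < 1 or cus_per_device < 1 or num_ranks < 1:
        raise ValueError("all arguments must be positive")
    # Assumption: ranks are dealt to devices round-robin, so the busiest
    # device hosts ceil(num_ranks / num_devices) ranks.
    max_ranks_per_device = -(-num_ranks // num_devices)
    # Every rank gets the same CU count; leftover CUs on a device stay unused.
    cus_per_rank = cus_per_device // max_ranks_per_device
    if cus_per_rank == 0:
        raise ValueError("more ranks on one device than CUs")
    plan = []
    for rank in range(num_ranks):
        device = rank % num_devices   # which device this rank uses
        slot = rank // num_devices    # position of the rank on that device
        plan.append((device, slot * cus_per_rank, cus_per_rank))
    return plan


def cu_mask(first_cu, cu_count):
    """Hypothetical helper: bitmask with cu_count bits set from bit first_cu."""
    return ((1 << cu_count) - 1) << first_cu


if __name__ == "__main__":
    # 7 ranks on 2 devices with 60 CUs each: 60 is not a multiple of 7.
    for rank, (dev, first, count) in enumerate(partition_cus(2, 60, 7)):
        print(f"rank {rank}: device {dev}, mask {cu_mask(first, count):#x}")
```

In this sketch the CUs that cannot be divided evenly are simply left idle, which trades a little capacity for equal shares per rank.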

Performance Improvements

  • A new type of GPU kernel called "no-loop" is created for simple target regions. Currently this is an opt-in feature because it ignores runtime environment variables that require additional loop logic.
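
To show why a no-loop kernel must ignore settings that need extra loop logic, here is a conceptual Python model. It is not AOMP's generated code. The function names are hypothetical, and the "threads" are simulated sequentially. The assumption is that a conventional kernel lets each thread stride through the iteration space, while a no-loop kernel gives each thread at most one iteration and therefore needs at least as many threads as iterations.

```python
def run_with_loop(n_iterations, n_threads, body):
    """Simulated conventional kernel: correct for any thread count."""
    for tid in range(n_threads):          # stands in for parallel GPU threads
        i = tid
        while i < n_iterations:           # per-thread loop over its share
            body(i)
            i += n_threads


def run_no_loop(n_iterations, n_threads, body):
    """Simulated no-loop kernel: one iteration per thread, bounds check only."""
    if n_threads < n_iterations:
        # A runtime cap on the thread count would break this kernel form.
        raise ValueError("no-loop form needs n_threads >= n_iterations")
    for tid in range(n_threads):          # stands in for parallel GPU threads
        if tid < n_iterations:
            body(tid)


if __name__ == "__main__":
    seen = []
    run_no_loop(8, 16, seen.append)
    print(sorted(seen))
```

If a runtime setting reduced the thread count below the iteration count, only the looped form would still cover every iteration, which is consistent with the feature being opt-in.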

Reliability Improvements

  • Increased the maximum number of captured variables in a target region from 32 to 48. Future plans are to remove this maximum completely.