AOMP Release 17.0-0
These are the release notes for AOMP 17.0-0. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These modifications are managed in a branch called "amd-stg-open", which lives in a mirror of upstream LLVM at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch changes constantly as AMD merges the upstream development trunk with its internal open development efforts. The AMD modifications are experimental and/or contributions under review for the upstream trunk. AOMP uses a snapshot of amd-stg-open at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build because it does not use or require ROCm, with the exception of the kernel module (dkms) and libdrm, which are often part of the Linux distribution. AOMP is isolated from any ROCm installation by installing into /usr/lib/aomp and by using RPATH on its runtime libraries.
For AOMP 17.0-0, the last trunk commit is bd1f7c417fc04f93de6b7bbf8740351e58a90613 on March 5th, 2023. The last amd-only commit is f16add4badfa0f16d62ba025f0565a9e4475e37e on March 4th, 2023. These commits form a frozen branch now called "aomp-17.0-0". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-17.0-0.
The integrated ROCm components for this AOMP release were built with ROCm 5.4.0 sources.
This is the first AOMP release based on LLVM 17 development.
The changes from 16.0-3 to 17.0-0 include:
- Add support for amdclang, amdclang++, and amdflang
- Updated build scripts for Kokkos (and updated to Kokkos v3.7.00)
- Support for multiple blocksizes in Xteam reduction (1024 limit).
- A new execution mode BigJumpLoop for SPMD non-reduction kernels
- Additional support for OMPT function "translate_time"
- Added CentOS 9 and SLES 15 SP4 rpms.
- SLES 15 SP1 is no longer supported.
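
For readers unfamiliar with the kind of kernel the Xteam reduction item above refers to, here is a minimal sketch of an OpenMP target reduction in C. It is only an illustration: the function name `sum_range` and the `thread_limit(512)` value are assumptions chosen for this example, and the notes do not state which clauses or blocksizes actually select the Xteam code path. If the file is compiled without OpenMP offloading enabled, the pragma is ignored and the loop runs serially on the host with the same result.

```c
#include <stdio.h>

/* Hypothetical helper for illustration: sums the integers 0..n-1.
 * The combined construct below is a typical cross-team sum reduction.
 * thread_limit(512) is an assumed, illustrative blocksize request that
 * stays under the 1024 limit mentioned in the notes. */
double sum_range(int n) {
    double sum = 0.0;
    #pragma omp target teams distribute parallel for \
        reduction(+:sum) thread_limit(512) map(tofrom: sum)
    for (int i = 0; i < n; i++) {
        sum += (double)i;
    }
    return sum;
}
```

Calling `sum_range(1000)` from a small `main` and printing the result is enough to compare a host-only build against an offloaded build of the same source.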
Errata:
Smoke test failures:
- managed_memory: segfault, when 2+ devices are present
Hip example failure:
- device-lib
OvO:
- cpp/hierarchical_parallelism/reduction_add-complex_double/target__teams (timeout on gfx908)
- cpp/hierarchical_parallelism/reduction_add-float/target_teams (timeout on gfx908)