AOMP Release 20.0-1
These are the release notes for AOMP 20.0-1. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These differences are managed in a branch called "amd-staging", which lives in a mirror of upstream LLVM at https://github.com/ROCm/llvm-project. The amd-staging branch changes constantly as it merges the upstream development trunk with its downstream development updates. The AMD modifications are experimental while under review for the upstream trunk. AOMP uses a snapshot of amd-staging at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build because it does not use or require ROCm, with the exception of the kernel module (amdgpu-dkms) and libdrm, which are often part of the Linux distribution. AOMP is isolated from any ROCm installation by installing into /usr/lib/aomp and by using RPATH for runtime libraries.
For AOMP 20.0-1, the last LLVM trunk commit is 151901c762b724ef6ffe6f3db163475071e7b215 on December 11, 2024. The last amd-only commit is e82d86c7c81631754d1af5cb72ceef2385d215e3 on December 12, 2024. These commits form a frozen branch now called "aomp-20.0-1". See https://github.com/ROCm/llvm-project/tree/aomp-20.0-1.
The integrated ROCm components for this AOMP release were built with ROCm 6.3.0 sources.
This is the 2nd AOMP release based on upstream LLVM 20 development.
While Linux distros usually have the amdgpu kernel module, we strongly recommend using the ROCm 6.3 amdgpu-dkms and amdgpu-dkms-firmware packages, which resolve a long-standing SDMA firmware issue.
In this release of AOMP, we disabled the OpenMP workaround of the SDMA firmware issue. The OpenMP workaround for the SDMA issue was to not chain automatic asynchronous data transfers to the kernel completion signal. The workaround synchronously initiated data transfers after kernel completion was detected by the host CPU. This resulted in some loss of performance.
The environment variable LIBOMPTARGET_SYNC_COPY_BACK is the trigger to use the workaround. Before AOMP 20.0-1 it had a default value of true to force synchronous copy backs. In this release we set the default to false which will improve performance for kernels with lots of return maps. But if your machine does not have the ROCm 6.3 firmware, you should set LIBOMPTARGET_SYNC_COPY_BACK=true to avoid potential errors.
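
For machines without the ROCm 6.3 firmware, the workaround described above can be re-enabled from the shell before launching an offload application. This is a minimal sketch; `./my_offload_app` is a hypothetical binary name used only for illustration and is left commented out.

```shell
# Re-enable the synchronous copy-back workaround for this shell session.
export LIBOMPTARGET_SYNC_COPY_BACK=true

# Confirm the value that child processes will inherit.
echo "LIBOMPTARGET_SYNC_COPY_BACK=${LIBOMPTARGET_SYNC_COPY_BACK}"

# Hypothetical OpenMP offload binary built with AOMP; uncomment to run it.
# ./my_offload_app
```

The variable can also be set for a single command by prefixing it, as in `LIBOMPTARGET_SYNC_COPY_BACK=true ./my_offload_app`, which avoids changing the default for the rest of the session.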
Changes since AOMP 20.0-0:
- Changed default LIBOMPTARGET_SYNC_COPY_BACK=false
- Dropped support for CentOS 7/8/9, Ubuntu 20.04, SLES15-SP4
- Added support for RHEL 8/9, Ubuntu 24.04, SLES15-SP5
- Updated to ROCm 6.3 sources
- Added a new component, SPIRV-LLVM-Translator. This is initial support for SPIR-V JIT offloading. It includes a SPIR-V to LLVM IR translation tool installed in the compiler bin directory as lib/llvm/bin/amd-llvm-spirv. Toolchain support for SPIR-V is still in development.
- Added a new release file showing the summary of relevant git commits since the last release. See llvm-project-20-0-1-gitlog-summary.txt
- Upgraded cmake to 3.25.2
- Changed the commands for OpenMP offload linking to use the clang-linker-wrapper command. The old method was a set of intermediate commands that passed files between the various steps of the heterogeneous linking process. The default command line option before 20.0-1 was --opaque-offload-linker. The default is now --no-opaque-offload-linker. While both methods performed similar GPU linking, IR optimizations, and backend steps, there were minor differences in the final offloading image that caused issues; these have been resolved. One can still see the commands from the old method with the command line options "-v -save-temps --opaque-offload-linker".
- Corrected the installation lib-debug directories to contain debug builds of various runtime libraries. The sources of all debug runtimes are also installed so that gdbtui will automatically find the sources.
- Merged roct and rocr into a single aomp build component.
- Renamed the flang-legacy binary to **flang-classic**, as it is better known by the flang community. Yes, this will be deprecated in the future in favor of the new llvm flang. Currently "flang" is a symbolic link to the flang-classic binary.
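
The offload linking change listed above can be inspected by compiling the same source with the old-method options. The sketch below is illustrative only: the source file `vecadd.c`, the offload architecture `gfx90a`, and the `$AOMP/bin/clang` path are assumptions, not values taken from these notes (only the /usr/lib/aomp install location and the "-v -save-temps --opaque-offload-linker" options are). The command is run only if the compiler and source file are present; otherwise it is printed.

```shell
# Install location named in these notes; override AOMP if installed elsewhere.
AOMP="${AOMP:-/usr/lib/aomp}"
# Hypothetical source file and GPU architecture, for illustration only.
SRC="${SRC:-vecadd.c}"
GPU="${GPU:-gfx90a}"

# Default in 20.0-1: offload linking goes through clang-linker-wrapper.
NEW_CMD="$AOMP/bin/clang -O2 -fopenmp --offload-arch=$GPU $SRC -o vecadd"
# Old method: show the intermediate commands and keep the temporary files.
OLD_CMD="$NEW_CMD -v -save-temps --opaque-offload-linker"

if [ -x "$AOMP/bin/clang" ] && [ -f "$SRC" ]; then
  $OLD_CMD
else
  echo "would run: $OLD_CMD"
fi
```

Comparing the verbose output of the two command lines is one way to see which intermediate steps the old method performed.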
Errata:
- Potential data corruption as a result of an SDMA issue when AOMP-generated binaries are run without the ROCm 6.3 amdgpu-dkms-firmware package. Set LIBOMPTARGET_SYNC_COPY_BACK=true to avoid this problem with OpenMP.
- THIS RELEASE CANNOT BE BUILT FROM SOURCE EXTERNALLY. This is because there is a new AMD repository that is not yet available. In the next release this repository will be made public and put in the aomp manifest for cloning to support source build of aomp.