AOMP Release 17.0-3
These are the release notes for AOMP 17.0-3. AOMP uses AMD developer modifications to the upstream LLVM development trunk. These differences are managed in a branch called "amd-stg-open", which lives in a mirror of upstream LLVM at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch changes constantly as AMD merges the upstream development trunk with its internal open development efforts. The AMD modifications are either experimental or contributions under review for the upstream trunk. AOMP uses a snapshot of amd-stg-open at the commit ids and dates listed below. AOMP also includes builds of related ROCm components. We call AOMP a "standalone" build because it does not use or require ROCm, with the exception of the kernel module (dkms) and libdrm, which are often part of the Linux distribution. AOMP is isolated from any ROCm installations by installing into /usr/lib/aomp and by its use of RPATH on runtime libraries.
For AOMP 17.0-3, the last trunk commit is ec6b40ab9b577e6e9bf000ccd19d85a9753b6ca8 on July 13, 2023. The last amd-only commit is f959ea5d8d1e5aef4b6d06727a9698316d3d33cd on July 14, 2023. These commits form a frozen branch now called "aomp-17.0-3". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-17.0-3.
The integrated ROCm components for this AOMP release were built with ROCm 5.6.0 sources.
This is the 4th AOMP release based on LLVM 17 development.
The changes from 17.0-2 to 17.0-3 include:
- Non-compiler components are built with ROCm 5.6.0 sources
- Support code object version 5. The libomptarget device library is now generated for both code object version 4 and code object version 5.
- flang is no longer a symbolic link to clang, because the clang driver support for flang is going away. A new driver binary called flang-legacy now provides the driver support for flang. It uses a frozen set of driver support from ROCm 5.6, now found in the flang repository.
- Enabled Big Jump Loop by default.
- Improved target teams loop transform.
- Removed the symbolic link from flang to clang and replaced it with flang-legacy.
- Implemented dynamic LDS accesses from non-kernel functions.
- Performance improvements for small kernels via lazy HSA queue creation and tracking of busy queues.
- Restored GPU_MAX_HW_QUEUES in AMDGPU nextgen plugin.
- Extended environment variable ompx_apu_maps to MI200.
- Added --archive to the clang-offload-packager, which repackages the extracted files into a new static library. This allows a fat binary static library to become a static library for a single architecture.
- Disabled PIE in LLVM until build issues on CentOS and SLES are resolved.
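For readers unfamiliar with the construct named in the "target teams loop" item above, the following is a minimal sketch of an OpenMP `target teams loop` region in C. The function name `saxpy_target` and the compile line in the comment are illustrative assumptions, not taken from these release notes, and the sketch does not show what the improved transform does internally.

```c
#include <stddef.h>

/*
 * Hypothetical example: y[i] = a * x[i] + y[i].
 *
 * When built with an offloading compiler, for example (assumed command line):
 *   clang -O2 -fopenmp --offload-arch=gfx90a -c saxpy.c
 * the loop is eligible to run on the GPU. Without -fopenmp the pragma is
 * ignored and the loop runs sequentially on the host with the same result.
 */
void saxpy_target(size_t n, float a, const float *x, float *y)
{
    /* map(to:) copies x to the device; map(tofrom:) copies y in and back out. */
    #pragma omp target teams loop map(to: x[0:n]) map(tofrom: y[0:n])
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Calling `saxpy_target(n, a, x, y)` from ordinary host code is enough to exercise the region. The offload architecture (gfx90a here) is an assumption and should match the installed GPU.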
Errata:
- A bug in the HIP 5.6.0 sources causes programs to crash when using code object v5 and -O0.
- flang compilations require -fPIC (a fix is needed in flang-legacy for 17.0-4).
- Smoke test failures:
  - fprintf (non-deterministic)
  - complex_reduction (non-deterministic)
  - schedule (non-deterministic)
  - flang-274983
  - flang-274983-2
  - xteamr