Releases: ROCm/aomp
AOMP Release 16.0-0
These are the release notes for AOMP 16.0-0. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 16.0-0, the last trunk commit is 1b56b2b2678cde21f7c20e83f881ded9b96518e4 on Sep 14 2022. This is the first AOMP release for LLVM 16 development. The last amd-only commit is 0018e8ab17297453e971ea1867d085eba5ea3f9d on Sep 14 2022. This forms a frozen branch now called "aomp-16.0-0". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-16.0-0
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 5.2.x sources.
The changes from 15.0-3 to 16.0-0 include:
- Adds a new flag, -fopenmp-target-fast, that enables a group of OpenMP target optimizations together.
- Enhancements and bug fixes for No-Loop and cross-team reduction support.
AOMP Release 15.0-3
These are the release notes for AOMP 15.0-3. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 15.0-3, the last trunk commit is 1f8ae9d7e7e4afcc4e76728b28e64941660ca3eb on Jul 26 2022. This is the fourth AOMP release for LLVM 15 development. The last amd-only commit is b745843ebcb77f55de887b5741197184e7d0dcbd on Aug 01 2022. This forms a frozen branch now called "aomp-15.0-3". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-3
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 5.2.x sources.
The changes from 15.0-2 to 15.0-3 include:
- Use the new openmp DeviceRTL by default.
- New DeviceRTL APIs for optimized cross-team reduction.
- Clang codegen changes to use the optimized cross-team reduction APIs for a reduction clause in a device construct.
- Added support for classic flang to use the new DeviceRTL.
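As an illustration of the kind of code the reduction changes above apply to, here is a minimal C sketch of a reduction clause on a device construct. The function name `dot` and the map clauses are illustrative assumptions, not taken from the AOMP sources. Whether a given compiler lowers this loop to the optimized cross-team reduction APIs depends on the compiler version and options.

```c
/* Hypothetical example: dot product with a reduction clause on a
 * combined target construct. Partial sums from all teams are combined
 * into `sum`, which is the cross-team reduction case. */
double dot(const double *x, const double *y, int n) {
    double sum = 0.0;
    #pragma omp target teams distribute parallel for reduction(+:sum) \
        map(to: x[0:n], y[0:n]) map(tofrom: sum)
    for (int i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

If OpenMP is not enabled at compile time, the pragma is ignored and the loop runs sequentially on the host, so the function stays usable for quick checks.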
Known Issues:
- Flang has issues at -O0 when using the new DeviceRTL on GPUs other than gfx90a.
rocm-5.2.1
ROCm release v5.2.1
rocm-5.2.0
ROCm release v5.2.0
AOMP Release 15.0-2
These are the release notes for AOMP 15.0-2. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 15.0-2, the last trunk commit is 3bef90dff64fc717c5d5e33a4d5fb47a4566d04a on May 15, 2022. This is the third AOMP release for LLVM 15 development. The last amd-only commit is 651deba7aa2805d1fe19ace427548f42f2c7a29f on May 16. This forms a frozen branch now called "aomp-15.0-2". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-2
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 5.1.x sources.
The changes from 15.0-1 to 15.0-2 include:
- Add user requested hint value AMD_unsafe_fp_atomics to match AMD_fast_fp_atomics.
- Fixes to compile SPEC CPU with A + A options.
- Add implementation of omp_is_initial_device to the new OpenMP runtime.
- Add Fortran-specific functions to the new OpenMP runtime. The classic flang compiler does not use the same OpenMP API as Clang and does not use __kmpc_parallel_51, the function responsible for thread parallelization: it increases the parallel level and launches parallel code.
- Update cloc.sh in aomp-extras to pass bitcode for the ABI version.
- Fix timing accuracy for OMPT target data transfer and kernel dispatch trace records.
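A minimal C sketch of how omp_is_initial_device, mentioned in the list above, can be used to check where a target region actually ran. The helper name `ran_on_host` and the non-OpenMP fallback stub are assumptions added for illustration. omp_is_initial_device itself is the standard OpenMP API routine.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Assumed fallback so the sketch also builds without OpenMP:
 * with no offloading, everything runs on the initial (host) device. */
static int omp_is_initial_device(void) { return 1; }
#endif

/* Hypothetical helper: returns 1 if the target region executed on the
 * host, 0 if it executed on an offload device. */
int ran_on_host(void) {
    int on_host = 1;
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
    return on_host;
}
```

A return value of 1 from a build that was meant to offload usually indicates that the region fell back to host execution.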
AOMP Release 15.0-1
These are the release notes for AOMP 15.0-1. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 15.0-1, the last trunk commit is 6ec79a15cbe9539faf121b5ad39f195dc611fc09 on Mar 29, 2022. This is the second AOMP release for LLVM 15 development. The last amd-only commit is 7eb00e23dd0bd034c4b502a4a99e32b49ac010eb on Mar 26. This forms a frozen branch now called "aomp-15.0-1". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-1
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 5.1.x sources.
The changes from 15.0-0 to 15.0-1 include:
- Switch to ROCm 5.1.x sources
AOMP Release 15.0-0
These are the release notes for AOMP 15.0-0. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM upstream mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 15.0, the last trunk commit is 6ec79a15cbe9539faf121b5ad39f195dc611fc09 on Mar 29, 2022. This is the first AOMP release for LLVM 15 development. The last amd-only commit is 7eb00e23dd0bd034c4b502a4a99e32b49ac010eb on Mar 26. This forms a frozen branch now called "aomp-15.0-0". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-15.0-0
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 5.0.x sources.
The changes from 14.0-3 to 15.0-0 include:
- New development infrastructure supports source builds of supplemental components.
  There are two types of supplemental components: prerequisite and post-build components. All supplemental components are built and installed in subdirectories of the directory specified by the AOMP_SUPP environment variable, which defaults to $HOME/local.
  Supplemental components are not included in the aomp installation package because they are for development, build, and test only. The prerequisite components are cmake, hwloc, and rocmsmilib. These are built with the build_prereq.sh script. Post-build components are for testing and require the AOMP or ROCm compiler to be installed. The current post-build supplemental components are openmpi, hdf5, silo, and fftw. They are built with the build_supp.sh script. build_prereq.sh is a symbolic link to build_supp.sh. For each component, the script fetches the source, builds the component, and then installs it.
- Enhanced support for CU masking in the openmp_set_cu_mask wedge script. This now supports multiple devices and no longer requires that the number of CUs be a multiple of the number of ranks. If the total number of CUs is not a multiple of the number of ranks, appropriate controls ensure each rank gets an equal set of CUs on some device.
- Added new scripts to build and run the GenASiS and GESTS applications. The scripts scan for the dependent libraries built by build_supp.sh and use them to build and run the applications.
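The CU masking arithmetic described above can be illustrated with a small C sketch. This is not the logic of the openmp_set_cu_mask script itself. It assumes a single device with at most 64 CUs, gives each rank the same number of contiguous CUs, and leaves any remainder CUs unassigned. The function name `cu_mask_for_rank` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: returns a bitmask with one bit per CU assigned to
 * `rank`, or 0 for invalid input. Assumes one device with <= 64 CUs. */
uint64_t cu_mask_for_rank(int total_cus, int num_ranks, int rank) {
    if (total_cus <= 0 || total_cus > 64 || num_ranks <= 0 ||
        rank < 0 || rank >= num_ranks) {
        return 0;
    }
    /* Equal share per rank; CUs left over by the division stay unused. */
    int per_rank = total_cus / num_ranks;
    if (per_rank == 0) {
        return 0;
    }
    /* Build `per_rank` low bits, avoiding an undefined 64-bit shift. */
    uint64_t block = (per_rank == 64) ? UINT64_MAX
                                      : ((UINT64_C(1) << per_rank) - 1);
    /* Slide the block to this rank's slot. */
    return block << (rank * per_rank);
}
```

For example, with 60 CUs and 8 ranks each rank gets 7 CUs and 4 CUs remain unassigned, so rank 0 receives mask 0x7F and rank 1 receives 0x3F80.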
Performance Improvements
- A new type of GPU kernel called "no-loop" is created for simple target regions. Currently this is an opt-in feature because it ignores runtime environment variables that require additional loop logic.
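For orientation, the following C sketch shows the sort of simple target region that a "no-loop" kernel could plausibly be generated for: a single loop with independent iterations and no reduction. The function name `saxpy` and the map clauses are illustrative assumptions, and the exact criteria the compiler uses to pick a no-loop kernel are not stated in these notes.

```c
/* Hypothetical example of a simple target region: y = a * x + y.
 * Each iteration is independent, so one GPU thread per iteration
 * would suffice if the compiler chooses to drop the loop logic. */
void saxpy(float a, const float *x, float *y, int n) {
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because the feature is opt-in and ignores some runtime environment variables, code that relies on those variables to shape the launch may behave differently once it is enabled.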
Reliability improvements
- Increase the maximum number of captured variables in a target region from 32 to 48. Future plans are to remove this maximum completely.
AOMP Release 14.0-3
These are the release notes for AOMP 14.0-3. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 14.0-3, the last trunk commit is db01b123d012df2f0e6acf7e90bf4ba63382587c on Feb 2, 2022. That was the last upstream trunk commit before the beginning of LLVM 15 development. The last amd-only commit is b566cb1cb8f1fc8fede66a8e3af258b95009d190 on Feb 11. This forms a frozen branch now called "aomp-14.0-3". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-3 .
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 5.0 sources.
The changes from 14.0-2 to 14.0-3 include:
- Update to ROCm 5.0 components.
- Fix to libompd cmake to support enabling HWLOC.
- Fix to mygpu when it is run from /usr/bin.
Known issues
- For badly formed custom mappers, host access to unmapped struct members causes segfault.
- Usage of flang -g results in differing debug_info_version and dwarf_version.
AOMP Release 14.0-2
These are the release notes for AOMP 14.0-2. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 14.0-2, the last trunk commit is db01b123d012df2f0e6acf7e90bf4ba63382587c on Feb 2, 2022. That was the last upstream trunk commit before the beginning of LLVM 15 development. The last amd-only commit is 4dcce9a16b1685dd87069abdf1274fa75b91a928 on Feb 8. This forms a frozen branch now called "aomp-14.0-2". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-2 .
AOMP is a "standalone" build of all necessary ROCm components with the exception of the kernel module. The non llvm-project components for this release were built with ROCM 4.5 sources.
The changes from 14.0-1 to 14.0-2 include:
- Device runtime performance improvements by overlapping multiple copies between host and device.
- Inclusion of a static build of hwloc in libomp.so. This supports the use of PLACES for CPU affinity.
- Fixes to support compilation for nvptx64.
- Fix to support uninitialized integers for hostrpc support.
- Support for managed memory allocations in OpenMP
- Fix to support optimization of constant indexes for shared memory arrays in a target region.
- Fix to resolve an unresolved global found in certain rocm-device-lib bitcode files.
Known issues
- For badly formed custom mappers, host access to unmapped struct members causes segfault.
- Usage of flang -g results in differing debug_info_version and dwarf_version.
AOMP Release 14.0-1
These are the release notes for AOMP 14.0-1. This release uses modifications to the LLVM development trunk called the "amd-stg-open" branch. This is found at https://github.com/RadeonOpenCompute/llvm-project. The amd-stg-open branch is constantly changing as AMD merges upstream development trunk with its internal open development efforts. Some AMD modifications are experimental and/or under review for the LLVM mono-repo. The AOMP release is a snapshot of amd-stg-open and supporting repositories to build various components.
For AOMP 14.0-1, the last trunk commit is 9be193bc58b356e2d2e0bddff59a404358e2c75e on Jan 11. The last amd-only commit is a4a503a2b65b37f4c8e4931d502cc6d53810b5f8 on Jan 13. This forms a frozen branch now called "aomp-14.0-1". See https://github.com/RadeonOpenCompute/llvm-project/tree/aomp-14.0-1 .
This is a major update from AOMP 14.0.0. The changes include:
- A restructuring of the clang driver to a) remove the clang-build-select-link tool and b) remove all "post" clang linking with mlink attributes on the clang -cc1 command. All device library linking is now done in the llvm-link step, which follows clang -cc1. Furthermore, libraries including the critical libomptarget-..-.bc library are internalized by the llvm-link step to avoid unnecessary bitcode for the backend.
- The construction of libomptarget-..-.bc library now includes rocm-device-lib functions, device libm functions, hostrpc stubs, and lastly the OpenMP deviceRTLs. This all-inclusive library simplifies the device toolchain and improves performance.
- Elimination of the need for the aomp-extras library.
Known issues:
- Compilation for NVIDIA GPUs is broken. We will fix this in 14.0-2.