Releases: ROCm/aomp
AOMP Release 0.6-2
This release uses the stable release_80 release of the clang, llvm, lld, and openmp repositories. The artifacts for this release include the patches to the release_80 repos that support OpenMP for amdgcn.
Here are the fixes for 0.6-2:
- Fixed issue with constant size teams and threads.
- Moved to the stable clang/llvm 8.0 code base
- Fixed code in deviceRTLs/amdgcn to set Max_Warp_Number to 16 (was 64)
- Enabled Float16 for 0.6-2; it was disabled by default in the release_80 merge
- Disabled the metadata optimization; set the environment variable AMDGPU_ENABLE_META_OPT_BUG to enable it
- Added archive handling for bc linking.
- For performance, rewrite select_outline_wrapper calls to be direct calls.
  Example: change the generated code from:
      @_HASHW_DeclareSharedMemory_cpp__omp_outlined___wrapper =
          local_unnamed_addr addrspace(4) constant i64 -4874776124079246075
      call void @select_outline_wrapper(i16 0, i32 %6, i64 -4874776124079246075)
  to:
      call void @DeclareSharedMemory_cpp__omp_outlined___wrapper(i16 0, i32 %6)
- In release_80, the Loop_tripcount API is now used, so we need to limit num_groups/teams
  to no more than Max_Teams. This fixes assertok_error and snap4.
  Also handle the num_teams clause inside the loop_tripcount logic.
- BALLOT_SYNC macro replaced with ACTIVEMASK in release_80
AOMP Release 0.6-1
Changes from 0.6-0 to 0.6-1:
- Disabled the SILoadStoreOptimizer pass to work around a 64-bit address calculation issue
- Added 6 new device APIs as extensions to the OpenMP device APIs
  - omp_ext_get_warp_id
  - omp_ext_get_lane_id
  - omp_ext_get_master_thread_id
  - omp_ext_get_smid
  - omp_ext_is_spmd_mode
  - omp_ext_get_active_threads_mask
- Added rtl get_launch_vals, an algorithm rewrite for the threads and teams computation
  - Throttle code for teams and threads is off by default; enable it with THREAD_TEAM_THROTTLE
- Added support for the LLC- and OPT-specific env-vars AOMP_LLC_ARGS and AOMP_OPT_ARGS
  - Allows adding compiler options to opt and llc via env-var; useful for triage, dumps, and debug.
- Added clang-unbundle-archive tool.
- Added support for device library archives in clang when using the -l flag.
- Updated llvm-link to work with archives of .bc components
- Added new method AddStaticDeviceLibs to CommonArgs.cpp that searches for static device
  libraries using the -l and -L command line options, in a way similar to the search method used for
  host libraries, including which directories to search. The differences from the host search are:
  - Searches look for names that specify the architecture and/or GPU
  - Searches look in the libdevice subdirectory of each host directory path
  - Searches look for filenames with .a suffix before searching for .bc suffix
- Cleanup of aomp build scripts, including split of the llvm component into llvm, clang, and lld.
- Fixed where llvm-config is found during build
- Added installed binaries from llvm to help with clang lit testing
- New build script for comgr. This is not part of the compiler build yet. Developers and those building from source can run build_comgr.sh
- Do not build the hip runtime for ppc and arm builds.
- Added two new smoke tests and improved automation of smoke tests
- Corrected mymcpu and mygpu for vega20
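The static device library search described for AddStaticDeviceLibs can be pictured with a small sketch. This is not the code in CommonArgs.cpp. The file naming scheme (`lib<name>-<arch>-<gpu>` and its shorter variants), the function names, and the choice to try `.a` before `.bc` for each name in turn are all assumptions made for illustration. Only the three differences listed above come from the release notes.

```python
from pathlib import Path


def candidate_names(lib, arch, gpu):
    """Return file names to try, most specific first.

    The name patterns are hypothetical; the notes only say that names
    specify the architecture and/or GPU.
    """
    stems = [f"lib{lib}-{arch}-{gpu}", f"lib{lib}-{arch}", f"lib{lib}-{gpu}"]
    # For each stem, the .a archive is tried before the .bc file.
    return [stem + suffix for stem in stems for suffix in (".a", ".bc")]


def find_device_lib(lib, lib_dirs, arch, gpu):
    """Search the libdevice subdirectory of each -L style directory, in order."""
    for directory in lib_dirs:
        device_dir = Path(directory) / "libdevice"
        for name in candidate_names(lib, arch, gpu):
            path = device_dir / name
            if path.is_file():
                return path
    return None  # nothing found in any search directory


if __name__ == "__main__":
    # Show the order in which names would be tried for a hypothetical -lfoo.
    for name in candidate_names("foo", "amdgcn", "gfx900"):
        print(name)
```

Under these assumptions, passing the directories given with -L as `lib_dirs` returns the first match, so an archive in an earlier directory wins over a `.bc` file in a later one.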
AOMP Release 0.6-0
This is the initial release of AOMP.
AOMP is the new name for HCC2. The last HCC2 release was HCC2 0.5-4.
Changes from HCC2 0.5-4
- AOMP is built from sources for ROCm 2.1.
- AOMP can build for Nvidia cards, so installing the CUDA 10 SDK is required.
- AOMP needs to build hcc for proper build of hip.
Two of the openmpapps are known to fail. We are working to fix this in 0.6-1.
If you built aomp from source, it installs by default into $HOME/rocm/aomp. This package installs into /opt/rocm/aomp. Many of the samples look first in $HOME/rocm/aomp. To override this, set AOMP:
export AOMP=/opt/rocm/aomp