Stop testing ROCm 5.4 in nightly tests #26643

Merged
merged 3 commits into from
Feb 4, 2025
7 changes: 4 additions & 3 deletions doc/rst/technotes/gpu.rst
@@ -167,7 +167,8 @@ The following are further requirements for GPU support:

* For ROCm 5.x, ``CHPL_LLVM`` must be set to ``system``. Note that, ROCm
installations come with LLVM. Setting ``CHPL_LLVM=system`` will allow you to
- use that LLVM.
+ use that LLVM. Note that ROCm 5.x is not actively tested and we recommend
+ using ROCm 6.x.

* For ROCm 6.x, only ``CHPL_LLVM=bundled`` is supported.
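
The two rules in this hunk (ROCm 5.x requires ``CHPL_LLVM=system``, ROCm 6.x supports only ``CHPL_LLVM=bundled``) can be sketched as a small shell helper. This is only an illustration: the function name `chpl_llvm_for_rocm` is hypothetical and is not part of Chapel or of this PR.

```shell
# Hypothetical helper (not part of Chapel): map a ROCm version string to the
# CHPL_LLVM value described in the documentation hunk above.
chpl_llvm_for_rocm() {
  version="$1"
  major="${version%%.*}"   # keep everything before the first dot, e.g. "5.4.3" -> "5"
  case "$major" in
    5) echo "system" ;;    # ROCm 5.x: use the LLVM that ships with ROCm
    6) echo "bundled" ;;   # ROCm 6.x: only the bundled LLVM is supported
    *) echo "unsupported ROCm version: $version" >&2; return 1 ;;
  esac
}

# Example: pick the setting for a ROCm 6.2 installation.
CHPL_LLVM="$(chpl_llvm_for_rocm 6.2)"
export CHPL_LLVM
```

Any version outside 5.x and 6.x makes the helper fail rather than guess, since the documentation states requirements only for those two series.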

@@ -704,9 +705,9 @@ marked with * are covered in our nightly testing configurations.

* AMD

- * Hardware: MI60*, MI100 and MI250X*
+ * Hardware: MI60, MI100 and MI250X*

- * Software:ROCm 5.4*, 6.0, 6.1, 6.2*
+ * Software:ROCm 5.4, 6.0, 6.1, 6.2*


GPU Support on Windows Subsystem for Linux
4 changes: 3 additions & 1 deletion test/ANNOTATIONS.yaml
@@ -612,7 +612,9 @@ all:
01/06/25:
- text: Partially revert Qthreads patch from #26328 (#26468)
config: chapcs, 16-node-apollo-hdr, 16-node-hpe-cray-ex, 1-node-hpe-cray-ex

+ 02/03/25:
+ - text: Started using ROCm 6.x
+ config: 1-node-mi250x

# End all

20 changes: 0 additions & 20 deletions util/cron/test-gpu-ex-rocm-54.bash

This file was deleted.

20 changes: 0 additions & 20 deletions util/cron/test-gpu-ex-rocm-54.ofi.bash

This file was deleted.

@@ -10,23 +10,17 @@ source $UTIL_CRON_DIR/common-native-gpu-perf.bash
# CONFIG_NAME
source $UTIL_CRON_DIR/common-perf.bash

- # everything we source above will end up sourcing `common.bash` which will then
- # source `load-base-deps.bash`. In the system we run this config,
- # `load-base-deps.bash` ends up exporting
- # `CHPL_LLVM_CONFIG=(which # llvm-config)`
- # If `rocm` module is loaded, rocm's llvm-config takes precedence over our LLVM
- # install. We don't want that in this system. So `module load rocm` should
- # appear after all the `source`s.
- module load rocm/5.4.3 # pin to rocm 5.4.3
+ module load rocm # load the default version of ROCm

export CHPL_COMM=none
- export CHPL_LLVM=system
+ export CHPL_LLVM=bundled
+ unset CHPL_LLVM_CONFIG # we need this to avoid warnings
export CHPL_LOCALE_MODEL=gpu
export CHPL_LAUNCHER_PARTITION=bardpeak # bardpeak is the default queue
export CHPL_GPU=amd # also detected by default
export CHPL_GPU_ARCH=gfx90a
- export CHPL_NIGHTLY_TEST_CONFIG_NAME="perf.gpu-ex-rocm-54"

+ export CHPL_NIGHTLY_TEST_CONFIG_NAME="perf.gpu-ex-rocm"

export CHPL_TEST_PERF_CONFIG_NAME="1-node-mi250x"
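
The script above switches to `CHPL_LLVM=bundled` and unsets `CHPL_LLVM_CONFIG`, which, per the script's own comments, an earlier sourced script may have exported. A minimal sketch of that pairing follows. It assumes a hypothetical helper name (`use_bundled_llvm`) and an illustrative `llvm-config` path, neither of which appears in the PR.

```shell
# Hypothetical helper (not from the PR): select the bundled LLVM and drop any
# CHPL_LLVM_CONFIG inherited from earlier setup scripts.
use_bundled_llvm() {
  export CHPL_LLVM=bundled
  unset CHPL_LLVM_CONFIG
}

# Simulate a value exported by an earlier sourced script.
# The path below is illustrative only.
export CHPL_LLVM_CONFIG=/opt/rocm/llvm/bin/llvm-config

use_bundled_llvm
echo "CHPL_LLVM=$CHPL_LLVM CHPL_LLVM_CONFIG=${CHPL_LLVM_CONFIG:-<unset>}"
# prints "CHPL_LLVM=bundled CHPL_LLVM_CONFIG=<unset>"
```

Keeping the two statements together makes it harder to switch to the bundled LLVM while leaving a system `llvm-config` setting behind.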
