Commit 079de6b

loadams, fabiendupont, tjruwase, Liangliang-Ma, and inkcherry authored
Update workflows to cuda 12.4 (#7000)
- Update existing workflows that use cu121 to cu124. Note that wherever we download the latest torch, we will now get torch 2.6 rather than torch 2.5, the latest release provided with CUDA 12.1.
- Note: nv-nightly is currently failing in master due to unrelated errors, so its failure can be ignored in this PR (nv-nightly was tested locally, where it passes with both 12.1 and 12.4).

---------

Signed-off-by: Fabien Dupont <fdupont@redhat.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>
Signed-off-by: Olatunji Ruwase <olruwase@microsoft.com>
Signed-off-by: inkcherry <mingzhi.liu@intel.com>
Signed-off-by: Omar Elayan <oelayan@habana.ai>
Co-authored-by: Fabien Dupont <fabiendupont@fabiendupont.fr>
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Co-authored-by: Liangliang Ma <1906710196@qq.com>
Co-authored-by: inkcherry <mingzhi.liu@intel.com>
Co-authored-by: Omar Elayan <142979319+oelayan7@users.noreply.github.com>
1 parent 549e11d commit 079de6b
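
The index URLs touched by this commit follow a visible pattern: the CUDA version with the dot removed, prefixed by `cu`, appended to the PyTorch wheel index (with a `nightly/` segment for pre-release builds). A minimal sketch of that mapping, assuming only the URL shapes that appear in the workflows; `cuda_tag` and `torch_index_url` are hypothetical helper names, not part of the repository:

```python
def cuda_tag(cuda_version: str) -> str:
    """Turn a CUDA version such as '12.4' into a wheel tag such as 'cu124'."""
    major, minor = cuda_version.split(".")[:2]
    return f"cu{major}{minor}"


def torch_index_url(cuda_version: str, nightly: bool = False) -> str:
    """Build the pip --index-url value used by the 'Install pytorch' steps.

    Hypothetical helper: it only reproduces the URL pattern seen in the diff.
    """
    base = "https://download.pytorch.org/whl"
    channel = "/nightly" if nightly else ""
    return f"{base}{channel}/{cuda_tag(cuda_version)}"


if __name__ == "__main__":
    print(torch_index_url("12.4"))
    print(torch_index_url("12.4", nightly=True))
```

Deriving the tag from a single version string would keep the runner label, the index URL, and the `--cuda_ver` flag in step, which is the kind of drift this commit fixes by hand across nine files.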

9 files changed, 24 additions and 24 deletions

.github/workflows/nv-accelerate-v100.yml

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ concurrency:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -29,7 +29,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

.github/workflows/nv-ds-chat.yml

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ permissions:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -37,7 +37,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

.github/workflows/nv-inference.yml

Lines changed: 5 additions & 5 deletions
@@ -22,7 +22,7 @@ concurrency:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -32,7 +32,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install -U --cache-dir $TORCH_CACHE torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch==2.1.2 torchvision==0.16.2 --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
 
@@ -58,8 +58,8 @@ jobs:
         run: |
           unset TORCH_CUDA_ARCH_LIST # only jit compile for current arch
           cd tests
-          #pytest $PYTEST_OPTS -m 'seq_inference' unit/ --torch_ver="2.1" --cuda_ver="12.1"
-          pytest $PYTEST_OPTS -m 'inference_ops' unit/ --torch_ver="2.1" --cuda_ver="12.1"
-          pytest $PYTEST_OPTS --forked -n 4 -m 'inference' unit/ --torch_ver="2.1" --cuda_ver="12.1"
+          #pytest $PYTEST_OPTS -m 'seq_inference' unit/ --torch_ver="2.1" --cuda_ver="12.4"
+          pytest $PYTEST_OPTS -m 'inference_ops' unit/ --torch_ver="2.1" --cuda_ver="12.4"
+          pytest $PYTEST_OPTS --forked -n 4 -m 'inference' unit/ --torch_ver="2.1" --cuda_ver="12.4"
           # run ds_report again to check updated op list
           ds_report

.github/workflows/nv-lightning-v100.yml

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ concurrency:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -29,7 +29,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

.github/workflows/nv-mii.yml

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ concurrency:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -37,7 +37,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip3 install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu121
+          pip3 install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

.github/workflows/nv-nightly.yml

Lines changed: 3 additions & 3 deletions
@@ -18,7 +18,7 @@ permissions:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -28,7 +28,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
 
@@ -58,7 +58,7 @@ jobs:
         run: |
           unset TORCH_CUDA_ARCH_LIST # only jit compile for current arch
           cd tests
-          pytest $PYTEST_OPTS --forked -m 'nightly' unit/ --torch_ver="2.5" --cuda_ver="12.1"
+          pytest $PYTEST_OPTS --forked -m 'nightly' unit/ --torch_ver="2.6" --cuda_ver="12.4"
 
       - name: Open GitHub issue if nightly CI fails
         if: ${{ failure() && (github.event_name == 'schedule') }}

.github/workflows/nv-torch-latest-v100.yml

Lines changed: 4 additions & 4 deletions
@@ -19,7 +19,7 @@ concurrency:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -29,7 +29,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch torchvision --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
 
@@ -55,5 +55,5 @@ jobs:
         run: |
           unset TORCH_CUDA_ARCH_LIST # only jit compile for current arch
           cd tests
-          pytest $PYTEST_OPTS --forked -n 4 unit/ --torch_ver="2.5" --cuda_ver="12.1"
-          pytest $PYTEST_OPTS --forked -m 'sequential' unit/ --torch_ver="2.5" --cuda_ver="12.1"
+          pytest $PYTEST_OPTS --forked -n 4 unit/ --torch_ver="2.6" --cuda_ver="12.4"
+          pytest $PYTEST_OPTS --forked -m 'sequential' unit/ --torch_ver="2.6" --cuda_ver="12.4"
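
The `--torch_ver` and `--cuda_ver` flags had to move together with the install step, which suggests the test suite compares them against the versions actually installed. A minimal sketch of such a check, assuming a prefix comparison to the depth of the expected string; `version_matches` is a hypothetical helper and is not claimed to be the repository's implementation:

```python
def version_matches(expected: str, found: str) -> bool:
    """Return True if `found` starts with the dotted components of `expected`.

    Hypothetical helper: compares only as many components as `expected`
    supplies, so "2.6" accepts "2.6.0+cu124" but rejects "2.5.1+cu121".
    """
    depth = expected.count(".") + 1
    found_prefix = ".".join(found.split(".")[:depth])
    return found_prefix == expected


if __name__ == "__main__":
    # In a real run the right-hand values would come from torch.__version__
    # and torch.version.cuda; literals are used here to stay self-contained.
    print(version_matches("2.6", "2.6.0+cu124"))
    print(version_matches("12.4", "12.4"))
    print(version_matches("2.5", "2.6.0+cu124"))
```

Under this assumption, leaving `--torch_ver="2.5"` in place after switching the index to cu124 would fail the check as soon as pip resolved torch 2.6.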

.github/workflows/nv-torch-nightly-v100.yml

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ permissions:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -28,7 +28,7 @@ jobs:
 
       - name: Install pytorch
         run: |
-          pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu121
+          pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"

.github/workflows/nv-transformers-v100.yml

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ concurrency:
 
 jobs:
   unit-tests:
-    runs-on: [self-hosted, nvidia, cu121, v100]
+    runs-on: [self-hosted, nvidia, cu124, v100]
 
     steps:
       - uses: actions/checkout@v4
@@ -29,7 +29,7 @@ jobs:
       - name: Install pytorch
         run: |
           # use the same pytorch version as transformers CI
-          pip install -U --cache-dir $TORCH_CACHE torch==2.0.1+cu121 --index-url https://download.pytorch.org/whl/cu121
+          pip install -U --cache-dir $TORCH_CACHE torch==2.0.1+cu124 --index-url https://download.pytorch.org/whl/cu124
           python -c "import torch; print('torch:', torch.__version__, torch)"
           python -c "import torch; print('CUDA available:', torch.cuda.is_available())"
