Commit 47c1a1b

ahadnagy authored and ArthurZucker committed
Benchmarking v2 GH workflows (#40716)
* WIP benchmark v2 workflow
* Container was missing
* Change to sandbox branch name
* Wrong place for image name
* Variable declarations
* Remove references to file logging
* Remove unnecessary step
* Fix deps install
* Syntax
* Add workdir
* Add upload feature
* typo
* No need for hf_transfer
* Pass in runner
* Runner config
* Runner config
* Runner config
* Runner config
* Runner config
* mi325 caller
* Name workflow runs properly
* Copy-paste error
* Add final repo IDs and schedule
* Review comments
* Remove wf params
* Remove parametrization from worfkflow files
* Fix callers
* Change push trigger to pull_request + label
* Add back schedule event
* Push to the same dataset
* Simplify parameter description
1 parent 5c2f566 commit 47c1a1b

7 files changed: +311 -10 lines changed

.github/workflows/benchmark_v2.yml

Lines changed: 82 additions & 0 deletions
@@ -0,0 +1,82 @@
+name: Benchmark v2 Framework
+
+on:
+  workflow_call:
+    inputs:
+      runner:
+        description: 'GH Actions runner group to use'
+        required: true
+        type: string
+      commit_sha:
+        description: 'Commit SHA to benchmark'
+        required: false
+        type: string
+        default: ''
+      upload_to_hub:
+        description: 'Uploading results to a HuggingFace Dataset'
+        required: false
+        type: string
+        default: 'false'
+      run_id:
+        description: 'Custom run ID for organizing results (auto-generated if not provided)'
+        required: false
+        type: string
+        default: ''
+      benchmark_repo_id:
+        description: 'HuggingFace Dataset to upload results to (e.g., "org/benchmark-results")'
+        required: false
+        type: string
+        default: ''
+
+env:
+  HF_HOME: /mnt/cache
+  TRANSFORMERS_IS_CI: yes
+  # For gated repositories, we still need to agree to share information on the Hub repo. page in order to get access.
+  # This token is created under the bot `hf-transformers-bot`.
+  HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
+
+jobs:
+  benchmark-v2:
+    name: Benchmark v2
+    runs-on: ${{ inputs.runner }}
+    if: |
+      (github.event_name == 'pull_request' && contains( github.event.pull_request.labels.*.name, 'run-benchmark')) ||
+      (github.event_name == 'schedule')
+    container:
+      image: huggingface/transformers-pytorch-gpu
+      options: --gpus all --privileged --ipc host --shm-size "16gb"
+    steps:
+      - name: Get repo
+        uses: actions/checkout@v4
+        with:
+          ref: ${{ inputs.commit_sha || github.sha }}
+
+      - name: Install benchmark dependencies
+        run: |
+          python3 -m pip install -r benchmark_v2/requirements.txt
+
+      - name: Reinstall transformers in edit mode
+        run: |
+          python3 -m pip uninstall -y transformers
+          python3 -m pip install -e ".[torch]"
+
+      - name: Show installed libraries and their versions
+        run: |
+          python3 -m pip list
+          python3 -c "import torch; print(f'PyTorch version: {torch.__version__}')"
+          python3 -c "import torch; print(f'CUDA available: {torch.cuda.is_available()}')"
+          python3 -c "import torch; print(f'CUDA device count: {torch.cuda.device_count()}')" || true
+          nvidia-smi || true
+
+      - name: Run benchmark v2
+        working-directory: benchmark_v2
+        run: |
+          echo "Running benchmarks"
+          python3 run_benchmarks.py \
+            --commit-id '${{ inputs.commit_sha || github.sha }}' \
+            --upload-to-hub '${{ inputs.upload_to_hub || false}}' \
+            --run-id '${{ inputs.run_id }}' \
+            --benchmark-repo-id '${{ inputs.benchmark_repo_id}}' \
+            --log-level INFO
+        env:
+          HF_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
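
Note on the "Run benchmark v2" step: `--upload-to-hub` is passed as a quoted string ('true' or 'false') rather than as a bare switch, and `--run-id` may arrive as an empty string, so the receiving script has to convert text to a boolean and handle the empty case. A minimal sketch of one way to parse these flags is shown here. The flag names come from the workflow; the parser, `str_to_bool`, `parse_cli`, the defaults, and the generated fallback ID are assumptions for illustration, not the actual contents of run_benchmarks.py.

```python
import argparse
import uuid


def str_to_bool(value: str) -> bool:
    """Convert the quoted 'true'/'false' text passed by the workflow into a bool."""
    normalized = value.strip().lower()
    if normalized in {"true", "1", "yes"}:
        return True
    if normalized in {"false", "0", "no", ""}:
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean string, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the flag names are taken from the workflow.
    parser = argparse.ArgumentParser(description="Benchmark CLI sketch")
    parser.add_argument("--commit-id", default="")
    parser.add_argument("--upload-to-hub", type=str_to_bool, default=False)
    parser.add_argument("--run-id", default="")
    parser.add_argument("--benchmark-repo-id", default="")
    parser.add_argument("--log-level", default="INFO")
    return parser


def parse_cli(argv: list[str]) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # The workflow passes '' when no run ID is supplied, so fall back to a
    # generated one (assumed behaviour, matching "auto-generated if not provided").
    if not args.run_id:
        args.run_id = uuid.uuid4().hex[:8]
    return args
```

Using a `type=` converter instead of `action="store_true"` is what lets the workflow pass an explicit 'false' value.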
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+name: Benchmark v2 Scheduled Runner - A10 Single-GPU
+
+on:
+  schedule:
+    # Run daily at 16:30 UTC
+    - cron: "30 16 * * *"
+  pull_request:
+    types: [ opened, labeled, reopened, synchronize ]
+
+jobs:
+  benchmark-v2-default:
+    name: Benchmark v2 - Default Models
+    uses: ./.github/workflows/benchmark_v2.yml
+    with:
+      runner: aws-g5-4xlarge-cache-use1-public-80
+      commit_sha: ${{ github.sha }}
+      upload_to_hub: true
+      run_id: ${{ github.run_id }}
+      benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
+    secrets: inherit
Lines changed: 20 additions & 0 deletions
@@ -0,0 +1,20 @@
+name: Benchmark v2 Scheduled Runner - MI325 Single-GPU
+
+on:
+  schedule:
+    # Run daily at 16:30 UTC
+    - cron: "30 16 * * *"
+  pull_request:
+    types: [ opened, labeled, reopened, synchronize ]
+
+jobs:
+  benchmark-v2-default:
+    name: Benchmark v2 - Default Models
+    uses: ./.github/workflows/benchmark_v2.yml
+    with:
+      runner: amd-mi325-ci-1gpu
+      commit_sha: ${{ github.sha }}
+      upload_to_hub: true
+      run_id: ${{ github.run_id }}
+      benchmark_repo_id: hf-internal-testing/transformers-daily-benchmarks
+    secrets: inherit
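
Note on triggers: both callers fire on the daily schedule and on every `pull_request` event of the listed types, while the job-level `if:` in benchmark_v2.yml decides whether the benchmark actually runs. The sketch below restates that condition in Python for clarity. `should_run_benchmark` and its arguments are illustrative names, not part of GitHub Actions; the lowercase comparison reflects the documented case-insensitivity of `contains` and `==` in workflow expressions.

```python
def should_run_benchmark(event_name: str, pr_labels: list[str]) -> bool:
    """Mirror the job-level `if:` from benchmark_v2.yml (illustrative only)."""
    event = event_name.lower()
    labels = {label.lower() for label in pr_labels}
    # A pull request only qualifies when it carries the 'run-benchmark' label.
    labeled_pr = event == "pull_request" and "run-benchmark" in labels
    # Scheduled (cron) runs always qualify.
    scheduled = event == "schedule"
    return labeled_pr or scheduled
```

Under this reading, an unlabeled pull request still starts the caller workflow, but the benchmark job itself is skipped.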

benchmark_v2/README.md

Lines changed: 30 additions & 0 deletions
@@ -21,6 +21,36 @@ python run_benchmarks.py \
     --num-tokens-to-generate 200
 ```
 
+### Uploading Results to HuggingFace Dataset
+
+You can automatically upload benchmark results to a HuggingFace Dataset for tracking and analysis:
+
+```bash
+# Upload to a public dataset with auto-generated run ID
+python run_benchmarks.py --upload-to-hf username/benchmark-results
+
+# Upload with a custom run ID for easy identification
+python run_benchmarks.py --upload-to-hf username/benchmark-results --run-id experiment_v1
+```
+
+**Dataset Directory Structure:**
+```
+dataset_name/
+├── 2025-01-15/
+│   ├── runs/                        # Non-scheduled runs (manual, PR, etc.)
+│   │   └── 123-1245151651/          # GitHub run number and ID
+│   │       └── benchmark_results/
+│   │           ├── benchmark_summary_20250115_143022.json
+│   │           └── model-name/
+│   │               └── model-name_benchmark_20250115_143022.json
+│   └── benchmark_results_abc123de/  # Scheduled runs (daily CI)
+│       ├── benchmark_summary_20250115_143022.json
+│       └── model-name/
+│           └── model-name_benchmark_20250115_143022.json
+└── 2025-01-16/
+    └── ...
+```
+
 ### Running Specific Benchmarks
 
 ```bash

benchmark_v2/benches/llama.py

Lines changed: 0 additions & 1 deletion
@@ -20,7 +20,6 @@
 from benchmark_framework import ModelBenchmark
 
 
-os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"
 os.environ["TOKENIZERS_PARALLELISM"] = "1"
 torch.set_float32_matmul_precision("high")
 

benchmark_v2/requirements.txt

Lines changed: 2 additions & 1 deletion
@@ -3,4 +3,5 @@ psutil>=5.8.0
 gpustat>=1.0.0
 torch>=2.0.0
 transformers>=4.30.0
-datasets>=2.10.0
+datasets>=2.10.0
+huggingface_hub>=0.16.0
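
Note on the upload path: the README hunk in this commit documents two layouts inside the dataset. Scheduled runs go under `<date>/benchmark_results_<id>/`, and all other runs go under `<date>/runs/<run number>-<run ID>/benchmark_results/`. A minimal sketch of building that path and pushing a results folder with the newly added `huggingface_hub` dependency is shown here. The function names, their parameters, and the reading of `abc123de` as an 8-character commit prefix are assumptions for illustration, not the framework's actual code.

```python
from datetime import date
from typing import Optional


def dataset_path(run_date: date, *, scheduled: bool, commit_sha: str = "",
                 run_number: Optional[int] = None,
                 run_id: Optional[int] = None) -> str:
    """Build the in-dataset folder following the layout in the README diff."""
    day = run_date.isoformat()  # e.g. "2025-01-15"
    if scheduled:
        # Assumption: the suffix is the first 8 characters of the commit SHA.
        return f"{day}/benchmark_results_{commit_sha[:8]}"
    return f"{day}/runs/{run_number}-{run_id}/benchmark_results"


def upload_results(local_dir: str, repo_id: str, path_in_repo: str, token: str) -> None:
    """Upload a local results folder to a dataset repo on the Hub."""
    # Imported lazily so dataset_path works without huggingface_hub installed.
    from huggingface_hub import HfApi

    HfApi().upload_folder(
        folder_path=local_dir,
        repo_id=repo_id,
        repo_type="dataset",
        path_in_repo=path_in_repo,
        token=token,
    )
```

Keeping the path logic separate from the upload call makes the layout checkable without network access or credentials.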
