Enable CPU benchmark for VLLM perf dashboard #44

huydhn · 2025-07-09T08:02:12Z

Pick up the work on #39 to support CPU benchmark. The PR is more involved than I expected, and the list of changes includes:

1. Put the vLLM benchmark suite into the appropriate platform folders for cuda, rocm, and cpu
2. Extend the logic in .github/scripts/generate_vllm_benchmark_matrix.py to read from the correct folder from (1)
3. Add .github/scripts/test_generate_vllm_benchmark_matrix.py for (2) because it's pretty complex now
4. Extend the logic in .github/scripts/setup_vllm_benchmark.py to copy from the correct folder from (1)
5. Use our existing linux.24xl.spr-metal to run the CPU benchmark until Intel's runner is ready
6. Incorporate the changes from "enable CPU benchmark for VLLM Perf Dashboard. #39" into the workflow:
   1. To use the vLLM CPU Docker image at public.ecr.aws/q9t5s3a7/vllm-ci-postmerge-repo:<HEAD_SHA>-cpu
   2. To pass ON_CPU to the vLLM benchmark script
7. Fix the use of torch.cuda.get_device_name() in .github/scripts/upload_benchmark_results.py because there is no CUDA device on CPU
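For the last item in the list above, here is a minimal sketch of one way to guard the device name lookup. It is not the code from upload_benchmark_results.py; the function name `get_device_name` and the fallback to `platform.processor()` are assumptions made for illustration.

```python
import platform


def get_device_name() -> str:
    """Return a device name for the benchmark record, falling back to the CPU."""
    try:
        import torch  # optional here so the sketch also runs without PyTorch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # Only query CUDA when a device is actually present.
        return torch.cuda.get_device_name()

    # Assumption: on CPU runners a processor or architecture string is an
    # acceptable stand-in for the device name.
    return platform.processor() or platform.machine() or "cpu"


print(get_device_name())
```

The point is simply that `torch.cuda.get_device_name()` is only called after `torch.cuda.is_available()` returns true, so CPU-only runners take the fallback path instead of raising.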

Testing

https://github.com/pytorch/pytorch-integration-testing/actions/runs/16231541112

cc @louie-tsai

Co-authored-by: Huy Do <huydhn@gmail.com>

louie-tsai · 2025-07-10T00:40:42Z

some path issue

I saw the serving-test.json file under the /workspace/.buildkite/nightly-benchmarks/tests folder.

Not sure whether we have the right path in the current workflow.
docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=all -e SCCACHE_SERVER_PORT=5228 -e SCCACHE_BUCKET -e SCCACHE_REGION -e DEVICE_NAME -e DEVICE_TYPE -e HF_TOKEN -e ENGINE_VERSION -e SAVE_TO_PYTORCH_BENCHMARK_FORMAT -e ON_CPU=0 --ipc=host --tty --security-opt seccomp=unconfined -v /home/bob/_work/pytorch-integration-testing/pytorch-integration-testing:/tmp/workspace -w /tmp/workspace public.ecr.aws/q9t5s3a7/vllm-ci-postmerge-repo:b6e7e3d58f57aee30a55b3160645ddb2f029d3c8 bash -xc 'cd vllm-benchmarks/vllm && bash .buildkite/nightly-benchmarks/scripts/run-performance-benchmarks.sh'

huydhn · 2025-07-10T01:31:09Z

some path issue

Yeah, the step that sets up the benchmark needs a tweak per my comment in #39 (comment). When the device is CPU, it looks for a file with the _cpu suffix and that's fine. However, for a CUDA or ROCm device, there is no _cuda or _rocm suffix. This looks like an easy tweak, so I could do it here if you prefer.
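The suffix rule described above could be sketched as follows. This is an illustration only: `benchmark_config_name` is a hypothetical helper, and the exact file names used by the vLLM benchmark suite may differ.

```python
def benchmark_config_name(test_name: str, device: str) -> str:
    """Build the config file name for a device (hypothetical helper).

    Only the CPU variant carries a device suffix; CUDA and ROCm use the
    unsuffixed file, as described in the comment above.
    """
    suffix = "_cpu" if device == "cpu" else ""
    return f"{test_name}{suffix}.json"


for device in ("cpu", "cuda", "rocm"):
    print(device, benchmark_config_name("serving-test", device))
```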

louie-tsai · 2025-07-13T01:25:13Z

.github/scripts/generate_vllm_benchmark_matrix.py

    2: [
        "linux.aws.h100.4",
        "linux.rocm.gpu.mi300.2",
+        "linux.24xl.spr-metal",


24xlarge has only 1 NUMA node, so we should not put it under TP=2


Co-authored-by: Louie Tsai <louie.tsai@intel.com>
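The review point above can be expressed as a small check over the TP-size-to-runner mapping. This is a sketch under stated assumptions, not the logic in generate_vllm_benchmark_matrix.py: the helper `misplaced_cpu_runners`, the `CPU_RUNNER_NUMA_NODES` table, and the rule that a CPU runner needs at least one NUMA node per TP rank are all assumptions for illustration. Only the runner labels and the single NUMA node on the 24xlarge instance come from this thread.

```python
# NUMA node counts for CPU runners. The value for the 24xlarge runner is from
# the review comment; the table itself is a hypothetical structure.
CPU_RUNNER_NUMA_NODES = {"linux.24xl.spr-metal": 1}


def misplaced_cpu_runners(tp_runners, numa_nodes):
    """Return (tp, runner) pairs where a CPU runner has fewer NUMA nodes than the TP size."""
    return [
        (tp, runner)
        for tp, runners in sorted(tp_runners.items())
        for runner in runners
        if runner in numa_nodes and numa_nodes[runner] < tp
    ]


# Mapping as first proposed in the diff: the CPU runner is listed under TP=2.
proposed = {
    2: ["linux.aws.h100.4", "linux.rocm.gpu.mi300.2", "linux.24xl.spr-metal"],
}
# Mapping after the review suggestion: the CPU runner only appears under TP=1.
revised = {
    1: ["linux.24xl.spr-metal"],
    2: ["linux.aws.h100.4", "linux.rocm.gpu.mi300.2"],
}

print(misplaced_cpu_runners(proposed, CPU_RUNNER_NUMA_NODES))
print(misplaced_cpu_runners(revised, CPU_RUNNER_NUMA_NODES))
```

GPU runners are skipped by the check because they have no entry in the NUMA table.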

fadara01 · 2025-09-24T13:35:55Z

Hi @huydhn - we would like to enable this for AArch64 too (linux.arm64.m7g.metal).
What's the best place to ask for write access to this repository so that one can test the changes?

huydhn · 2025-09-25T00:23:42Z

Hi @huydhn - we would like to enable this for AArch64 too (linux.arm64.m7g.metal). What's the best place to ask for write access to this repository so that one can test the changes?

I could grant you that permission, but I want to check what it is needed for. I thought that submitting a PR like this one would be sufficient? We do have a linux.arm64.m7g.metal runner ready to use.

cfRod · 2025-09-30T13:27:59Z

@huydhn I think we mean permission to trigger the workflow for the dashboard?
We have this for the TorchInductor HUD dashboard, i.e.

to raise PRs directly to PyTorch instead of from a fork
https://github.com/pytorch/pytorch/actions/workflows/inductor-perf-test-nightly-aarch64.yml to run the workflow. Would we need something like this for vLLM?

huydhn · 2025-10-02T03:56:56Z

@huydhn I think we mean permission to trigger the workflow for the dashboard? We have this for the TorchInductor HUD dashboard, i.e.

to raise PRs directly to PyTorch instead of from a fork

https://github.com/pytorch/pytorch/actions/workflows/inductor-perf-test-nightly-aarch64.yml to run the workflow. Would we need something like this for vLLM?

Ah ok, got it. Ping me on vLLM Slack with the usernames, and I can help grant the permissions that you need.

louie-tsai and others added 6 commits June 25, 2025 16:28

first draft to enable CPU benchmark

ec0ac36

Update .github/workflows/vllm-benchmark.yml

1201ea6

Co-authored-by: Huy Do <huydhn@gmail.com>

fix for ROCm changes

c253948

change to use public cpu vllm postmerge registry

1d0271a

target on 4 NUMA node EMR machine

9cffd0e

Merge branch 'main' into cpu_vllm_benchmark

aa55555

facebook-github-bot added the cla signed label Jul 9, 2025

huydhn had a problem deploying to pytorch-x-vllm July 9, 2025 08:02 — with GitHub Actions Failure

huydhn had a problem deploying to pytorch-x-vllm July 9, 2025 08:02 — with GitHub Actions Error

huydhn had a problem deploying to pytorch-x-vllm July 9, 2025 08:02 — with GitHub Actions Failure

huydhn had a problem deploying to pytorch-x-vllm July 9, 2025 08:02 — with GitHub Actions Error

louie-tsai added 2 commits July 10, 2025 13:34

Update vllm-benchmark.yml

caa0cf6

Update vllm-benchmark.yml

e12e2c1

louie-tsai had a problem deploying to pytorch-x-vllm July 10, 2025 20:34 — with GitHub Actions Error

louie-tsai temporarily deployed to pytorch-x-vllm July 10, 2025 20:34 — with GitHub Actions Inactive

louie-tsai had a problem deploying to pytorch-x-vllm July 10, 2025 20:34 — with GitHub Actions Error

huydhn temporarily deployed to pytorch-x-vllm July 12, 2025 01:26 — with GitHub Actions Inactive

huydhn had a problem deploying to pytorch-x-vllm July 12, 2025 01:26 — with GitHub Actions Error

huydhn temporarily deployed to pytorch-x-vllm July 12, 2025 01:26 — with GitHub Actions Inactive

louie-tsai approved these changes Jul 13, 2025


c7i.metal-24xl has only 1 NUMA node

90bce79

Co-authored-by: Louie Tsai <louie.tsai@intel.com>

huydhn temporarily deployed to pytorch-x-vllm July 13, 2025 08:29 — with GitHub Actions Inactive

huydhn had a problem deploying to pytorch-x-vllm July 13, 2025 08:29 — with GitHub Actions Error

huydhn temporarily deployed to pytorch-x-vllm July 13, 2025 08:29 — with GitHub Actions Inactive

huydhn had a problem deploying to pytorch-x-vllm July 13, 2025 08:29 — with GitHub Actions Error

huydhn temporarily deployed to pytorch-x-vllm July 13, 2025 08:29 — with GitHub Actions Inactive

louie-tsai mentioned this pull request Jul 15, 2025

enable CPU benchmark for VLLM Perf Dashboard. #39

Closed

huydhn merged commit 046a22c into main Jul 15, 2025
11 of 13 checks passed

huydhn deleted the 39 branch July 17, 2025 01:04

This was referenced Oct 1, 2025

[Feature]: CI workflow for building non-CUDA Aarch64 wheels vllm-project/vllm#26017

Closed

[Feature]: Add Aarch64 to PyTorch CI HUD vLLM dashboard vllm-project/vllm#26019

Open
