add models #42

BoyuanFeng · 2025-06-22T22:34:49Z

This PR adds more benchmarks to CI. After this PR, we cover the following settings:

huydhn · 2025-07-21T20:52:19Z

vllm-benchmarks/benchmarks/cuda/serving-tests.json

    },
    {
-        "test_name": "serving_llama4_maverick_fp8_tp8",
+        "test_name": "serving_llama4_scout_tp4_random_in200_out200",


Note: we need a better way to separate these cases with different input/output shapes on the dashboards. At the moment, only tensor parallel size and request rate are shown: https://hud.pytorch.org/benchmark/llms?repoName=vllm-project%2Fvllm

cc @yangw-dev if you have time to pick this up
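One way to separate these cases, sketched here only as an illustration: recover the shape fields from the test name itself. This assumes the `_tp<N>`, `_in<N>`, and `_out<N>` suffixes seen in `serving_llama4_scout_tp4_random_in200_out200` are a stable naming convention, which the PR does not guarantee. The helper `parse_test_name` and its field names are hypothetical and are not part of the benchmark or dashboard code.

```python
import re
from typing import Dict, Optional

# Assumed naming convention, inferred from a single test name in this PR:
#   ..._tp<N>_..._in<N>_out<N>
# Each token must be followed by "_" or the end of the name.
_PATTERNS = {
    "tensor_parallel_size": re.compile(r"_tp(\d+)(?:_|$)"),
    "input_len": re.compile(r"_in(\d+)(?:_|$)"),
    "output_len": re.compile(r"_out(\d+)(?:_|$)"),
}


def parse_test_name(test_name: str) -> Dict[str, Optional[int]]:
    """Extract shape fields from a test name (hypothetical helper).

    A field is None when its token is absent from the name.
    """
    fields: Dict[str, Optional[int]] = {}
    for key, pattern in _PATTERNS.items():
        match = pattern.search(test_name)
        fields[key] = int(match.group(1)) if match else None
    return fields


if __name__ == "__main__":
    print(parse_test_name("serving_llama4_scout_tp4_random_in200_out200"))
    print(parse_test_name("serving_llama4_maverick_fp8_tp8"))
```

Names without the `_in`/`_out` tokens, such as the older `serving_llama4_maverick_fp8_tp8`, yield `None` for both lengths, so a dashboard grouping built this way would need to handle missing values.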

huydhn

Thank you for adding these models!

Fixes #6974

Input and output lengths are new dimensions on the dashboard that need to be displayed after pytorch/pytorch-integration-testing#42. This PR also cleans up some old TODO code paths for the vLLM dashboard.

### Testing

Different input and output lengths now show up correctly with their benchmark results on [the preview](https://torchci-git-fork-huydhn-query-input-output-length-fbopensource.vercel.app/benchmark/llms?startTime=Sat%2C%2002%20Aug%202025%2001%3A35%3A55%20GMT&stopTime=Sat%2C%2009%20Aug%202025%2001%3A35%3A55%20GMT&granularity=day&lBranch=main&lCommit=0edaf752d7482a3c170c25376c466e730ab87ddd&rBranch=main&rCommit=e5ebeeba531755a78f68413e88a23d061404f3e3&repoName=vllm-project%2Fvllm&benchmarkName=&modelName=meta-llama%2FLlama-4-Scout-17B-16E-Instruct&backendName=All%20Backends&modeName=All%20Modes&dtypeName=All%20DType&deviceName=All%20Devices&archName=All%20Platforms)

---------

Signed-off-by: Huy Do <huydhn@gmail.com>

add gemma-3-27b-it and qwen3_30B-A3B

539b9fd

facebook-github-bot added the cla signed label Jun 22, 2025

BoyuanFeng temporarily deployed to pytorch-x-vllm June 22, 2025 22:35 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm June 22, 2025 22:35 — with GitHub Actions Error

BoyuanFeng had a problem deploying to pytorch-x-vllm June 22, 2025 22:35 — with GitHub Actions Failure

BoyuanFeng temporarily deployed to pytorch-x-vllm June 22, 2025 22:35 — with GitHub Actions Inactive

BoyuanFeng temporarily deployed to pytorch-x-vllm June 23, 2025 05:57 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm June 23, 2025 05:57 — with GitHub Actions Error

BoyuanFeng temporarily deployed to pytorch-x-vllm June 23, 2025 05:57 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm June 23, 2025 05:57 — with GitHub Actions Failure

BoyuanFeng temporarily deployed to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Failure

BoyuanFeng temporarily deployed to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Error

BoyuanFeng temporarily deployed to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Failure

BoyuanFeng temporarily deployed to pytorch-x-vllm July 20, 2025 04:29 — with GitHub Actions Inactive

BoyuanFeng had a problem deploying to pytorch-x-vllm July 21, 2025 20:46 — with GitHub Actions Failure

huydhn reviewed Jul 21, 2025


huydhn approved these changes Jul 21, 2025


BoyuanFeng merged commit 60beea6 into main Jul 21, 2025
42 of 48 checks passed

This was referenced Aug 6, 2025

Track input and output length on vLLM dashboard pytorch/test-infra#6974

Closed

Show input and output length on vLLM dashboard pytorch/test-infra#6992

Merged

huydhn mentioned this pull request Oct 1, 2025

[ez] Run latency and throughput benchmark for Qwen3 and Gemma3 #86

Merged

4 participants
