@BoyuanFeng BoyuanFeng commented Jun 22, 2025

This PR adds more benchmarks to CI. After this PR, we cover the following settings:

[image: benchmark settings covered after this PR]

},
{
"test_name": "serving_llama4_maverick_fp8_tp8",
"test_name": "serving_llama4_scout_tp4_random_in200_out200",

Note: we need a better way to separate cases with different input/output shapes on the dashboards. At the moment, the only dimensions shown are tensor parallel size and request rate: https://hud.pytorch.org/benchmark/llms?repoName=vllm-project%2Fvllm

cc @yangw-dev if you have time to pick this up
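
One possible starting point, as a rough sketch only: derive the extra dimensions from `test_name` itself. This assumes the `_tpN` and `_random_inX_outY` suffixes seen in the two names in this diff are a stable convention, which has not been confirmed. `parse_test_name` and the returned keys are hypothetical and not part of the dashboard or benchmark code.

```python
import re
from typing import Optional

# Assumed naming convention, inferred only from the two test names in this diff:
#   <prefix>_tp<N>                          -> tensor parallel size only
#   <prefix>_tp<N>_random_in<X>_out<Y>      -> plus random input/output lengths
_SUFFIX = re.compile(
    r"_tp(?P<tp>\d+)(?:_random_in(?P<inp>\d+)_out(?P<out>\d+))?$"
)


def _to_int(value: Optional[str]) -> Optional[int]:
    """Convert a captured group to int, keeping None for missing groups."""
    return int(value) if value is not None else None


def parse_test_name(test_name: str) -> dict:
    """Split a benchmark test name into dashboard dimensions (hypothetical helper).

    Returns None for any dimension that the name does not encode.
    """
    match = _SUFFIX.search(test_name)
    if match is None:
        return {"tp": None, "input_len": None, "output_len": None}
    return {
        "tp": _to_int(match.group("tp")),
        "input_len": _to_int(match.group("inp")),
        "output_len": _to_int(match.group("out")),
    }


if __name__ == "__main__":
    for name in (
        "serving_llama4_maverick_fp8_tp8",
        "serving_llama4_scout_tp4_random_in200_out200",
    ):
        print(name, parse_test_name(name))
```

Parsing names is fragile, so recording the input/output lengths as explicit fields in the benchmark results would likely be the more robust option if the schema allows it.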

@huydhn huydhn left a comment

Thank you for adding these models!

@BoyuanFeng BoyuanFeng merged commit 60beea6 into main Jul 21, 2025
42 of 48 checks passed