add models #42
| }, | ||
| { | ||
| "test_name": "serving_llama4_maverick_fp8_tp8", | ||
| "test_name": "serving_llama4_scout_tp4_random_in200_out200", |
Note: we need a better way to separate these cases with different input/output shapes on the dashboards. At the moment, the only dimensions there are tensor parallel size and request rate: https://hud.pytorch.org/benchmark/llms?repoName=vllm-project%2Fvllm
cc @yangw-dev if you have time to pick this up
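As a rough illustration of the separation asked for above, the sketch below pulls the tensor parallel size and the input/output lengths out of a `test_name`. It assumes the naming convention seen in the two names in this diff (`_tp<N>` and `_in<N>_out<N>` suffixes); the function `parse_test_name` and the returned keys are hypothetical and are not how the dashboard actually derives these dimensions.

```python
import re
from typing import Dict, Optional

# Assumed convention, inferred only from the test names in this diff:
#   "_tp<N>"          -> tensor parallel size
#   "_in<N>_out<N>"   -> input and output lengths
TP_RE = re.compile(r"_tp(\d+)(?:_|$)")
IO_RE = re.compile(r"_in(\d+)_out(\d+)(?:_|$)")


def parse_test_name(test_name: str) -> Dict[str, Optional[int]]:
    """Hypothetical helper: extract shape dimensions from a test name.

    Any dimension that is not encoded in the name is returned as None.
    """
    tp = TP_RE.search(test_name)
    io = IO_RE.search(test_name)
    return {
        "tensor_parallel_size": int(tp.group(1)) if tp else None,
        "input_len": int(io.group(1)) if io else None,
        "output_len": int(io.group(2)) if io else None,
    }


if __name__ == "__main__":
    for name in (
        "serving_llama4_maverick_fp8_tp8",
        "serving_llama4_scout_tp4_random_in200_out200",
    ):
        print(name, parse_test_name(name))
```

Parsing names is brittle compared with recording the lengths as explicit fields in the benchmark results, so this is best read as a stopgap for grouping existing runs rather than a recommended design.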
huydhn
left a comment
Thank you for adding these models!
Fixes #6974

Input and output lengths are new dimensions on the dashboard that need to be displayed after pytorch/pytorch-integration-testing#42. This PR also cleans up some old TODO code paths for the vLLM dashboard.

### Testing

Different input and output lengths now show up correctly with their benchmark results on [the preview](https://torchci-git-fork-huydhn-query-input-output-length-fbopensource.vercel.app/benchmark/llms?startTime=Sat%2C%2002%20Aug%202025%2001%3A35%3A55%20GMT&stopTime=Sat%2C%2009%20Aug%202025%2001%3A35%3A55%20GMT&granularity=day&lBranch=main&lCommit=0edaf752d7482a3c170c25376c466e730ab87ddd&rBranch=main&rCommit=e5ebeeba531755a78f68413e88a23d061404f3e3&repoName=vllm-project%2Fvllm&benchmarkName=&modelName=meta-llama%2FLlama-4-Scout-17B-16E-Instruct&backendName=All%20Backends&modeName=All%20Modes&dtypeName=All%20DType&deviceName=All%20Devices&archName=All%20Platforms)

---------

Signed-off-by: Huy Do <huydhn@gmail.com>
This PR adds more benchmarks to CI. After this PR, we cover the following settings: