-
-
Notifications
You must be signed in to change notification settings - Fork 11k
[Benchmark] Add plot utility for parameter sweep #27168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
Documentation preview: https://vllm--27168.org.readthedocs.build/en/27168/ |
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better code organization than our mainline code! 😂
A couple of thoughts:
- Should we move SLA into a separate entrypoint
serve_sla.py? To me it's different enough than the serve sweep, and we're already reusing utils. But up to you the current approach is fine - Could you post an example plot command with its output plot? Hard to imagine what the plots look like just from the code
- I'll try the script when I can and let you know if I have more thoughts, we can merge before that though
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
| ```bash | ||
| python vllm/benchmarks/serve_multi.py \ | ||
| python -m vllm.benchmarks.sweep.serve_sla \ | ||
| --serve-cmd 'vllm serve meta-llama/Llama-2-7b-chat-hf' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use a vllm serve api address, we should how to config?
Maybe we should add a --serve-host param, user can set a vllm online server, then this --serve-params param can be invalid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can set the server's host via --serve-cmd. And for resetting the server cache after each benchmark run, you can use --after-bench-cmd.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you mean that the benchmark should not be responsible for launching the server, you can just use a dummy command that sleeps infinitely and adjust --bench-cmd to access the real server. Of course, you should also set --after-bench-cmd in this case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, maybe i not need set --serve-cmd param, use --bench-cmd param to set vllm bench serve --model meta-llama/Llama-2-7b-chat-hf --backend openai is enough.
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Fixed now |
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>


Purpose
Follow-up to #27085
vllm/benchmarks/sweep, abstracting away common code.vllm/benchmarks/sweep/plot.py.cc @lengrongfu
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.