Update latency test script due to deprecation in vllm (#2973)
Summary:
To evaluate latency, we currently run `python benchmarks/benchmark_latency.py`, but that script was recently deprecated in vLLM and now prints:
```
DEPRECATED: This script has been moved to the vLLM CLI.
Please use the following command instead:
vllm bench latency
For help with the new command, run:
vllm bench latency --help
Alternatively, you can run the new command directly with:
python -m vllm.entrypoints.cli.main bench latency --help
```
So we updated the latency test script to use `vllm bench latency` instead.
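For reference, a minimal sketch of how a script could choose between the two invocations named in the deprecation notice. The helper name `latency_cmd` is hypothetical and is not taken from `eval.sh`. The `--model` flag is assumed to be accepted by `vllm bench latency`; confirm it with `vllm bench latency --help`.

```shell
# Hypothetical helper, not part of eval.sh: prints the latency benchmark
# command for a given model ID. It prefers the `vllm` CLI and falls back to
# the module entry point quoted in the deprecation notice.
latency_cmd() {
  model="$1"
  if command -v vllm >/dev/null 2>&1; then
    # `vllm` is on PATH: use the new CLI subcommand directly.
    echo "vllm bench latency --model $model"
  else
    # Assumption: the vllm package is importable even if the CLI is not on PATH.
    echo "python -m vllm.entrypoints.cli.main bench latency --model $model"
  fi
}
```

Calling `latency_cmd Qwen/Qwen3-8B` only prints the command, so the result can be inspected before it is run.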
Test Plan:
sh eval.sh --eval_type latency --model_ids Qwen/Qwen3-8B
Reviewers:
Subscribers:
Tasks:
Tags: