- Input length: 200 prompts randomly sampled from the ShareGPT dataset (with a fixed random seed).
- Output length: the corresponding output lengths of these 200 prompts.
- Batch size: dynamically determined by vLLM and the arrival pattern of the requests.
- **Average QPS (queries per second)**: 1, 4, 16, and inf. QPS = inf means all requests arrive at once. For other QPS values, the arrival time of each query is drawn from a Poisson process (with a fixed random seed).
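
The workload described above can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual code: the helper names `sample_prompts` and `arrival_times`, the default seed of 0, and the use of Python's standard `random` module are all assumptions. It relies on the fact that a Poisson process with rate QPS has exponentially distributed gaps between arrivals, with mean 1/QPS.

```python
import random


def sample_prompts(prompts, k=200, seed=0):
    """Pick k prompts without replacement, reproducibly (hypothetical helper)."""
    rng = random.Random(seed)  # a fixed seed yields the same sample on every run
    return rng.sample(prompts, k)


def arrival_times(num_requests, qps, seed=0):
    """Return arrival timestamps in seconds for a Poisson process (hypothetical helper).

    qps == inf is treated as "all requests arrive at once", i.e. at time 0.
    """
    if qps == float("inf"):
        return [0.0] * num_requests
    rng = random.Random(seed)
    times = []
    now = 0.0
    for _ in range(num_requests):
        # Inter-arrival gaps of a Poisson process are exponential with rate qps.
        now += rng.expovariate(qps)
        times.append(now)
    return times


if __name__ == "__main__":
    # Stand-in data; a real run would load prompts from the ShareGPT dataset.
    dataset = [f"prompt {i}" for i in range(1000)]
    prompts = sample_prompts(dataset)
    for qps in (1, 4, 16, float("inf")):
        times = arrival_times(len(prompts), qps)
        print(f"QPS={qps}: last request arrives at t={times[-1]:.1f}s")
```

Because both helpers create their own seeded generator, repeated runs see the same prompts and the same arrival pattern, which keeps comparisons across QPS settings reproducible.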