[Bench] Add "per-gpu-workload" mode #3068

MasterJH5574 · 2024-12-16T12:17:58Z

This PR introduces the per-gpu-workload mode to MLC bench. In this mode, the specified "num_concurrent_requests" and "request_rate" denote the workload per GPU, so the overall workload applied to the serving system during benchmarking is the specified value multiplied by the number of GPUs.

This PR also deprecates the argument --testset-name in favor of --dataset-path for the Loogle dataset.
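As a rough illustration of the per-GPU scaling described above, here is a minimal sketch. It is not the code from this PR: the `Workload` dataclass and the `scale_workload` helper are hypothetical names introduced only for this example, and only the two field names come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Workload:
    """Hypothetical container for the two workload knobs named in the PR."""
    num_concurrent_requests: Optional[int] = None
    request_rate: Optional[float] = None


def scale_workload(per_gpu: Workload, num_gpus: int) -> Workload:
    """Return the system-wide workload implied by a per-GPU specification.

    Assumption: each specified value is simply multiplied by the GPU count,
    and an unspecified value (None) stays unspecified.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    concurrency = per_gpu.num_concurrent_requests
    rate = per_gpu.request_rate
    return Workload(
        num_concurrent_requests=None if concurrency is None else concurrency * num_gpus,
        request_rate=None if rate is None else rate * num_gpus,
    )


if __name__ == "__main__":
    # 8 concurrent requests and 2.5 requests/s per GPU, on a 4-GPU system.
    total = scale_workload(Workload(num_concurrent_requests=8, request_rate=2.5), num_gpus=4)
    print(total)
```

Under these assumptions, a per-GPU setting of 8 concurrent requests on a 4-GPU system corresponds to 32 concurrent requests against the whole serving system.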


jinhongyii merged commit 88074ea into mlc-ai:main Dec 17, 2024
2 checks passed
