[Misc]: Add support for goodput on guided benchmarking + TPOT calculation refactor #13736
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
The relevant benchmarks on master vs. this branch (goodput doesn't change any calculations; the difference here is TPOT). On master:
Thanks @b8zhong for the contribution! Actually, this is something I had been thinking we should change in
Thanks for the contribution! @b8zhong
```diff
- outputs[i].tpot = sum(tpots) / len(tpots) if len(tpots) else 0
+ outputs[i].tpot = tpot
```
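To make the fix concrete, here is a minimal, self-contained sketch (not the actual benchmark code) of why the old assignment was wrong: the `tpots` list accumulates values across all requests, so averaging it inside the loop stores a running mean rather than each request's own TPOT.

```python
def assign_tpots(per_request_tpots):
    """Contrast the old (running-average) and new (per-request) assignment.

    `per_request_tpots` is a hypothetical list of per-request TPOT values.
    """
    buggy, fixed = [], []
    all_tpots = []  # accumulates across requests, like `tpots` in the script
    for tpot in per_request_tpots:
        all_tpots.append(tpot)
        # Old behaviour: running average over every request seen so far.
        buggy.append(sum(all_tpots) / len(all_tpots))
        # New behaviour: this request's own TPOT.
        fixed.append(tpot)
    return buggy, fixed

buggy, fixed = assign_tpots([10.0, 20.0, 30.0])
# buggy -> [10.0, 15.0, 20.0]  (running mean drifts away from per-request values)
# fixed -> [10.0, 20.0, 30.0]
```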
Thanks for the fix! This is probably an oversight, but at least I don't think we use outputs[i].tpot for the actual results anyway, since they're all based on the tpots variable. See lines 345-347.
In fact, I'm not sure where we actually use outputs[i].tpot, so we need to double-check.
I think it's used down here on L548:

```python
result = {
    # ... (other metrics)
    "tpot_description": pd.Series([output.tpot for output in outputs]).describe().to_dict(),
    # ... (other metrics)
}
```

Then `result` is also used.
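For reference, a small standalone sketch of what that `describe().to_dict()` call produces; the TPOT values here are illustrative, not from the actual benchmark run.

```python
import pandas as pd

# Hypothetical per-request TPOT values (milliseconds).
tpots = [10.0, 20.0, 30.0]

# pandas summarizes the series into count/mean/std/quantile statistics,
# which is what ends up under "tpot_description" in the result dict.
summary = pd.Series(tpots).describe().to_dict()
# summary has keys: "count", "mean", "std", "min", "25%", "50%", "75%", "max"
```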
Let's get this one merged, and I can follow up with the v1 benchmark scripts.
Implement Goodput Calculation + Fix TPOT Calc in Benchmark
This PR touches `benchmark_serving_guided.py` by adding goodput calculation and correcting the TPOT assignment.

Changes:

Added goodput calculation:
- Added a `--goodput` argument (consistent with `benchmark_serving.py`), accepting SLOs as "KEY:VALUE" pairs (e.g., `--goodput ttft:500 tpot:100`), where keys are metric names (`ttft`, `tpot`, `e2el`) and values are in milliseconds.
- Updated `calculate_metrics` to compute `good_completed`, counting requests that meet all specified SLOs.

Fixed TPOT calculation:
- Changed `outputs[i].tpot` to store the per-request TPOT value instead of a cumulative average. Not sure if the old behaviour was intended, as the per-request value is how it is done in `benchmark_serving.py`.

Minor:
- References `benchmark_serving_guided.py` instead of `benchmark_serving.py`.
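The goodput accounting described above can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: the SLO strings, the `parse_goodput`/`count_good_completed` helper names, and the request dicts are assumptions made for the example; only the "KEY:VALUE" format, the metric names, and the "meets all specified SLOs" rule come from the description.

```python
VALID_NAMES = {"ttft", "tpot", "e2el"}  # metric names accepted by --goodput

def parse_goodput(slo_pairs):
    """Parse "KEY:VALUE" SLO strings (values in milliseconds) into a dict."""
    slos = {}
    for pair in slo_pairs:
        name, _, value = pair.partition(":")
        if name not in VALID_NAMES:
            raise ValueError(f"Invalid goodput metric name: {name!r}")
        slos[name] = float(value)
    return slos

def count_good_completed(requests, slos):
    """A request counts toward goodput only if it meets every specified SLO."""
    return sum(
        1
        for req in requests
        if all(req[name] <= limit for name, limit in slos.items())
    )

slos = parse_goodput(["ttft:500", "tpot:100"])
requests = [
    {"ttft": 300.0, "tpot": 80.0, "e2el": 2000.0},  # meets both SLOs
    {"ttft": 600.0, "tpot": 90.0, "e2el": 2500.0},  # misses the ttft SLO
]
good = count_good_completed(requests, slos)
# good -> 1
```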