Fix more broken speculative decode tests #17450
The failures come from vllm-project#17084 Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com>
LGTM. @huydhn Thanks for fixing this!
Signed-off-by: Huy Do <huydhn@gmail.com>
Nice, now we just have these to worry about:
Oh darn, more failures are sneaking in I think. They weren't there before I rebased.
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Huy Do <huydhn@gmail.com> Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
Signed-off-by: Huy Do <huydhn@gmail.com> Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
A follow-up PR to fix some more speculative decode tests from #17084. There are 2 fixes:

- `sampler` objects: The latter doesn't have `include_gpu_probs_tensor` set to `True`, which causes a bunch of failures with `pytest -v spec_decode/e2e/test_mlp_correctness.py`. @WoosukKwon Please let me know if the fix makes sense to you. This feels like a quick patch to cover the underlying setup from #17084, but it kind of works.
- `block_size` update missed by "Fix some speculative decode tests with tl.dot" #17371
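For readers who want a picture of the first fix, here is a minimal sketch of the general idea: when a model ends up holding more than one sampler object, the `include_gpu_probs_tensor` flag has to be enabled on each of them, not just one. The diff itself is not reproduced in this conversation, so `ToySampler`, `ToyModel`, `extra_sampler`, and `enable_gpu_probs` are hypothetical stand-ins, not vLLM's real classes or the actual change in this PR.

```python
from dataclasses import dataclass


@dataclass
class ToySampler:
    """Hypothetical stand-in for a sampler object (not vLLM's real class)."""
    include_gpu_probs_tensor: bool = False


class ToyModel:
    """Hypothetical model that holds two separate sampler objects."""

    def __init__(self) -> None:
        self.sampler = ToySampler()
        self.extra_sampler = ToySampler()  # assumed second sampler


def enable_gpu_probs(model: object) -> int:
    """Set include_gpu_probs_tensor=True on every attribute exposing the flag.

    Returns the number of objects that were updated.
    """
    updated = 0
    for value in vars(model).values():
        if hasattr(value, "include_gpu_probs_tensor"):
            value.include_gpu_probs_tensor = True
            updated += 1
    return updated


model = ToyModel()
count = enable_gpu_probs(model)
```

Walking over every attribute, rather than naming a single sampler, is one way to avoid missing a second sampler object that was introduced later; whether the real fix takes this approach is an assumption.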