Conversation

@huydhn (Contributor) commented Apr 30, 2025

A follow-up PR to fix some more speculative decode tests from #17084. There are 2 fixes:

The latter doesn't have `include_gpu_probs_tensor` set to `True`, which causes a bunch of failures with `pytest -v spec_decode/e2e/test_mlp_correctness.py`. @WoosukKwon Please let me know if the fix makes sense to you. This feels like a quick patch over the underlying setup from #17084, but it kind of works.
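For context, a flag like this controls whether the sampler retains its probability tensors so that a speculative-decode verifier can later compare draft and target probabilities. The toy sketch below is not vLLM's actual Sampler API; `ToySampler`, `sampled_probs`, and `verify_draft` are invented names, and plain Python lists stand in for GPU tensors, purely to illustrate why verification fails when the flag is off:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToySampler:
    # Hypothetical stand-in for a sampler flag of the same name.
    include_gpu_probs_tensor: bool = False
    sampled_probs: Optional[list] = None  # retained only if the flag is set

    def sample(self, logits: list) -> int:
        # Greedy "sampling": normalize logits and pick the argmax.
        total = sum(logits)
        probs = [x / total for x in logits]
        if self.include_gpu_probs_tensor:
            self.sampled_probs = probs  # keep probs around for the verifier
        return max(range(len(logits)), key=lambda i: logits[i])

def verify_draft(sampler: ToySampler, draft_token: int) -> bool:
    # A verifier needs the target model's probs to accept/reject draft tokens.
    if sampler.sampled_probs is None:
        raise ValueError("probs tensor was not retained; "
                         "set include_gpu_probs_tensor=True")
    return sampler.sampled_probs[draft_token] > 0.0

sampler = ToySampler(include_gpu_probs_tensor=True)
token = sampler.sample([1.0, 3.0, 2.0])
print(token)                      # 1 (index of the largest logit)
print(verify_draft(sampler, token))
```

With the flag left at `False`, `verify_draft` raises, mirroring (very loosely) the kind of breakage the test suite surfaced.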

The failures come from vllm-project#17084

Signed-off-by: Huy Do <huydhn@gmail.com>
@github-actions commented

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, they only run the fastcheck CI, which runs a small and essential subset of tests to catch errors quickly. You can run additional CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

Signed-off-by: Huy Do <huydhn@gmail.com>
@huydhn huydhn marked this pull request as ready for review April 30, 2025 07:15
@WoosukKwon (Collaborator) left a comment

LGTM. @huydhn Thanks for fixing this!

@WoosukKwon added the ready label (ONLY add when PR is ready to merge/full CI is needed) Apr 30, 2025
Signed-off-by: Huy Do <huydhn@gmail.com>
@WoosukKwon WoosukKwon enabled auto-merge (squash) May 1, 2025 09:04
@DarkLight1337 (Member) commented

Nice, now we just have these to worry about:

FAILED spec_decode/e2e/test_eagle_correctness.py::test_eagle_e2e_greedy_correctness_with_preemption[1-4-128-test_llm_kwargs0-baseline_llm_kwargs0-per_test_common_llm_kwargs0-common_llm_kwargs0] - ValueError: 0 is not in list
FAILED spec_decode/e2e/test_medusa_correctness.py::test_medusa_e2e_greedy_correctness_with_preemption[-1-1-4-128-test_llm_kwargs0-baseline_llm_kwargs0-per_test_common_llm_kwargs0-common_llm_kwargs0] - ValueError: 0 is not in list
FAILED spec_decode/test_memory_usage.py::test_memory_usage_no_spec - TypeError: EngineArgs.__init__() got an unexpected keyword argument 'speculative_model'

@vllm-bot vllm-bot merged commit b74d888 into vllm-project:main May 1, 2025
43 of 46 checks passed
@huydhn (Contributor, Author) commented May 1, 2025

> Nice, now we just have these to worry about:
>
> FAILED spec_decode/e2e/test_eagle_correctness.py::test_eagle_e2e_greedy_correctness_with_preemption[1-4-128-test_llm_kwargs0-baseline_llm_kwargs0-per_test_common_llm_kwargs0-common_llm_kwargs0] - ValueError: 0 is not in list
> FAILED spec_decode/e2e/test_medusa_correctness.py::test_medusa_e2e_greedy_correctness_with_preemption[-1-1-4-128-test_llm_kwargs0-baseline_llm_kwargs0-per_test_common_llm_kwargs0-common_llm_kwargs0] - ValueError: 0 is not in list
> FAILED spec_decode/test_memory_usage.py::test_memory_usage_no_spec - TypeError: EngineArgs.__init__() got an unexpected keyword argument 'speculative_model'

Oh darn, I think more failures are sneaking in. They weren't there before I rebased.

radeksm pushed a commit to radeksm/vllm that referenced this pull request May 2, 2025
RichardoMrMu pushed a commit to RichardoMrMu/vllm that referenced this pull request May 12, 2025
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Mu Huai <tianbowen.tbw@antgroup.com>
zzzyq pushed a commit to zzzyq/vllm that referenced this pull request May 24, 2025
Signed-off-by: Huy Do <huydhn@gmail.com>
Signed-off-by: Yuqi Zhang <yuqizhang@google.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), speculative-decoding
