[Spec Decode][V0] Fix spec decode correctness test in V0 eagle/medusa #18175
Conversation
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a small and essential subset of CI tests runs automatically to quickly catch errors. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add 🚀
Thank you!
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Merged from main to resolve the failing V1 spec test. (Should be fixed by #18223)
Tests are still failing, but for a different reason than before.
@DarkLight1337 Yes. It seems to be a deeper bug related to Medusa. I have tried to reproduce the bug locally, but it looks a bit different from https://buildkite.com/vllm/ci/builds/20196#0196d7c4-85a1-404e-8459-e4b430c434fd. May I know how the CI environment is set up? Is running pytest locally equivalent to the CI test in the cloud? Thanks!

Local reproduction command and results:

pytest -v -s tests/spec_decode/e2e/test_medusa_correctness.py::test_medusa_e2e_greedy_correctness_with_preemption

>       run_equality_correctness_test(vllm_runner,
common_llm_kwargs,
per_test_common_llm_kwargs,
baseline_llm_kwargs,
test_llm_kwargs,
batch_size,
max_output_len=output_len,
seed=seed,
temperature=0.0)
tests/spec_decode/e2e/test_medusa_correctness.py:249:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tests/spec_decode/e2e/conftest.py:211: in run_equality_correctness_test
with vllm_runner(**sd_args) as vllm_model:
tests/conftest.py:1037: in __exit__
cleanup_dist_env_and_memory()
vllm/distributed/parallel_state.py:1225: in cleanup_dist_env_and_memory
torch.cuda.empty_cache()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
def empty_cache() -> None:
r"""Release all unoccupied cached memory currently held by the caching
allocator so that those can be used in other GPU application and visible in
`nvidia-smi`.
.. note::
:func:`~torch.cuda.empty_cache` doesn't increase the amount of GPU
memory available for PyTorch. However, it may help reduce fragmentation
of GPU memory in certain cases. See :ref:`cuda-memory-management` for
more details about GPU memory management.
"""
if is_initialized():
> torch._C._cuda_emptyCache()
E RuntimeError: CUDA error: device-side assert triggered
E CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
E For debugging consider passing CUDA_LAUNCH_BLOCKING=1
E Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
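Since the assert only surfaces later at `torch.cuda.empty_cache()`, a minimal sketch of a local re-run with synchronous kernel launches may help localize it (the test node id is the one quoted above; using `CUDA_LAUNCH_BLOCKING=1` is a general PyTorch/CUDA debugging technique, not something prescribed by this PR):

# Illustrative helper, not part of the vLLM test suite: re-run the failing
# test with CUDA_LAUNCH_BLOCKING=1 so kernel launches become synchronous and
# the device-side assert is reported at the offending kernel rather than at a
# later API call such as torch.cuda.empty_cache().
import os
import subprocess

TEST_ID = (
    "tests/spec_decode/e2e/test_medusa_correctness.py"
    "::test_medusa_e2e_greedy_correctness_with_preemption"
)

env = dict(os.environ, CUDA_LAUNCH_BLOCKING="1")
subprocess.run(["pytest", "-v", "-s", TEST_ID], env=env, check=False)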
The environment should be the same as
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Head branch was pushed to by a user without write access
@DarkLight1337 The latest commit passed the medusa check locally. PTAL. Thank you!
It seems all the V0/Spec decode tests have passed. The failing V1 test should be fixed by #18169.
…vllm-project#18175) Signed-off-by: wwl2755 <wangwenlong2755@gmail.com> Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
Fix #18166 to get CI working again.
cc: @WoosukKwon @LiuXiaoxuanPKU @robertgshaw2-redhat
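For context, the equality check exercised by these e2e suites boils down to comparing greedy outputs with and without speculative decoding. A minimal sketch of that idea follows; the function name and runner interface here are illustrative assumptions, not vLLM's actual `run_equality_correctness_test` helper.

from typing import Callable, Sequence

# Illustrative sketch only: assumes each runner maps prompts to lists of
# generated token IDs. With temperature=0.0, speculative decoding (eagle,
# medusa, ...) must be lossless, so outputs have to match token for token.
def assert_greedy_equality(
    run_baseline: Callable[[Sequence[str]], list[list[int]]],
    run_spec_decode: Callable[[Sequence[str]], list[list[int]]],
    prompts: Sequence[str],
) -> None:
    baseline_outputs = run_baseline(prompts)
    spec_outputs = run_spec_decode(prompts)
    assert len(baseline_outputs) == len(spec_outputs)
    for i, (ref, out) in enumerate(zip(baseline_outputs, spec_outputs)):
        assert ref == out, f"prompt {i}: spec decode diverges from the baseline"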