Conversation

@DarkLight1337
Member

@DarkLight1337 DarkLight1337 commented Oct 7, 2025

Purpose

Remove the VLLM_USE_V1 flag from all tests, except for:

  • ~~tests/basic_correctness/test_cumem.py: Should we delete the V0-specific code or just comment it out? cc @youkaichao @ywang96~~
  • tests/v1/test_oracle.py: For obvious reasons

Updates:

  • tests/v1/core/test_scheduler_e2e.py was previously not run. ~~Updated the CPU model runner to avoid the assert num_tokens <= self.scheduler_config.max_num_batched_tokens assertion error in the dummy run. However, the test now causes the process to be terminated, so I'm skipping it for CPU. cc @bigPYJ1151~~ Running this in the GPU CI instead of the CPU CI.
  • tests/v1/sample/test_sampling_params_e2e.py was previously not run. Removed test_priority because priority scheduling is now supported ([Core] feat: Implement Priority Scheduling in V1 Engine #19057).
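
For context, the cleanup applied across the suite generally follows the pattern sketched below. The test name and body are illustrative placeholders, not code from any particular file:

```python
import pytest


def run_example_checks() -> None:
    """Placeholder for the real test body (model load, generation, asserts)."""


# Before: tests opted in to the V1 engine via an environment variable.
def test_example_v1_opt_in(monkeypatch: pytest.MonkeyPatch):
    with monkeypatch.context() as m:
        m.setenv("VLLM_USE_V1", "1")
        run_example_checks()


# After: V1 is the default (and only) engine, so the wrapper and env var are dropped.
def test_example():
    run_example_checks()
```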

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request effectively removes the VLLM_USE_V1 flag from a large number of test files, which is a crucial step in deprecating the V0 engine and solidifying V1 as the default. The changes are consistent and well-executed across the entire test suite, correctly removing environment variable settings, conditional logic, and fixtures related to the V0 engine. This simplifies the tests and aligns them with the V1-only architecture. I've reviewed the changes thoroughly and found no issues of high or critical severity. The cleanup is well done.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines 68 to +86
```diff
 @pytest.mark.parametrize("model", MODEL_NAMES)
-def test_lm_eval_accuracy_v1_engine(model, monkeypatch: pytest.MonkeyPatch):
+def test_lm_eval_accuracy_v1_engine(model):
     """Run with the V1 Engine."""
 
-    with monkeypatch.context() as m:
-        m.setenv("VLLM_USE_V1", "1")
-
-        more_args = None
-        if current_platform.is_tpu():
-            # Limit compilation time for TPU V1
-            more_args = "max_model_len=2048,max_num_seqs=64"
-
-        # Add TP test (if provided)
-        if TPU_TP_TEST_STR:
-            more_args += ",{}".format(TPU_TP_TEST_STR)
-
-        run_test(model, more_args)
+    more_args = None
+    if current_platform.is_tpu():
+        # Limit compilation time for TPU V1
+        more_args = "max_model_len=2048,max_num_seqs=64"
+
+    # Add TP test (if provided)
+    if TPU_TP_TEST_STR:
+        more_args += ",{}".format(TPU_TP_TEST_STR)
+
+    run_test(model, more_args)
 
 
-@pytest.mark.skipif(
-    not current_platform.is_cuda() and not current_platform.is_tpu(),
-    reason="V1 is currently only supported on CUDA and TPU",
-)
 @pytest.mark.parametrize("model", FP8_KV_MODEL_NAMES)
-def test_lm_eval_accuracy_v1_engine_fp8_kv_cache(
-    model, monkeypatch: pytest.MonkeyPatch
-):
+def test_lm_eval_accuracy_v1_engine_fp8_kv_cache(model):
     """Run with the V1 Engine."""
 
-    with monkeypatch.context() as m:
-        m.setenv("VLLM_USE_V1", "1")
-
-        more_args = None
-        if current_platform.is_tpu():
-            # Limit compilation time for TPU V1
-            more_args = "max_model_len=2048,max_num_seqs=128,kv_cache_dtype=fp8"
+    more_args = None
+    if current_platform.is_tpu():
```


P1: Reinstate platform guard for LM-Eval accuracy tests

The two LM-Eval accuracy tests no longer have the @pytest.mark.skipif guard that previously skipped them when CUDA/TPU are unavailable. These tests spin up vLLM and run gsm8k with 500 concurrent requests, which assumes GPU or TPU support; running them on a CPU-only runner will make CI fail rather than skip as before. Please restore the skip condition (or an equivalent platform gate) so CPU environments don’t execute GPU-only workloads.
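
For reference, a guard with the same condition as the decorator removed in the excerpt above would restore the old skip behavior. This is only a sketch on a placeholder test, assuming `current_platform` is imported from `vllm.platforms` as in the rest of the suite:

```python
import pytest

from vllm.platforms import current_platform


# Same condition as the removed decorator; the test body is a placeholder.
@pytest.mark.skipif(
    not current_platform.is_cuda() and not current_platform.is_tpu(),
    reason="V1 is currently only supported on CUDA and TPU",
)
def test_lm_eval_accuracy_guarded():
    ...
```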


Comment on lines +69 to +76
```diff
 def test_lm_eval_accuracy_v1_engine():
     """Run with the V1 Engine."""
 
-    with monkeypatch.context() as m:
-        m.setenv("VLLM_USE_V1", "1")
-        more_args = []
+    more_args = []
 
-        # Limit compilation time for V1
-        if current_platform.is_tpu():
-            more_args = ["--max-num-seqs", "64"]
+    # Limit compilation time for V1
+    if current_platform.is_tpu():
+        more_args = ["--max-num-seqs", "64"]
```


P1: Accuracy correctness test now runs on unsupported platforms

The OpenAI correctness LM-Eval test also had its @pytest.mark.skipif removed and now runs unconditionally. It still launches a RemoteOpenAIServer and evaluates gsm8k with 500 concurrent requests, which requires CUDA/XPU/TPU. Executing this on CPU-only CI will fail rather than skip as intended earlier. A platform guard should be reintroduced to avoid running this GPU-only test on unsupported hardware.
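
One possible gate covering the platforms named above is sketched below. The original decorator is not visible in this excerpt, so the exact condition (and the use of `current_platform.is_xpu()`) is an assumption, not a restore of the removed code:

```python
import pytest

from vllm.platforms import current_platform


# Hypothetical guard for CUDA/XPU/TPU; treat this as an illustrative sketch.
@pytest.mark.skipif(
    not (
        current_platform.is_cuda()
        or current_platform.is_xpu()
        or current_platform.is_tpu()
    ),
    reason="LM-Eval correctness test requires CUDA, XPU, or TPU",
)
def test_lm_eval_accuracy_v1_engine_guarded():
    ...
```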


@DarkLight1337 DarkLight1337 moved this to In Progress in V0 Deprecation Oct 7, 2025
Member

@hmellor hmellor left a comment


LGTM! One nit about a TODO which can be removed now that monkeypatch has been removed.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@youkaichao
Member

> tests/basic_correctness/test_cumem.py: Should we delete the V0-specific code or just comment it out?

We can remove them.

Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) October 7, 2025 13:09
@mergify mergify bot added the ci/build label Oct 7, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
@DarkLight1337 DarkLight1337 merged commit 1e4ecca into vllm-project:main Oct 7, 2025
56 checks passed
@DarkLight1337 DarkLight1337 deleted the remove-use-v1-tests branch October 7, 2025 15:42
@github-project-automation github-project-automation bot moved this from In Progress to Done in V0 Deprecation Oct 7, 2025
mrasquinha-g pushed a commit to mrasquinha-g/vllm that referenced this pull request Oct 9, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>

Labels

ci/build, kv-connector, multi-modality (related to multi-modality, #4194), ready (only add when PR is ready to merge / full CI is needed), speculative-decoding, structured-output, tpu (related to Google TPUs), v1

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

3 participants