Conversation

@Isotr0py Isotr0py (Member) commented Sep 23, 2025

Purpose

Re-enable the previously disabled V1 attention backend selection cases in tests/kernels/attention/test_attention_selector.py and fix the import path that caused them to fail.

Test Plan

pytest -s -v tests/kernels/attention/test_attention_selector.py -k test_env
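
To spot-check a single platform, pytest's -k expression can narrow the run further; the parametrization ids in the results below are prefixed with cuda_, hip_, or cpu_, so for example:

pytest -s -v tests/kernels/attention/test_attention_selector.py -k "test_env and cuda"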

Test Result

All tests should pass now:

tests/kernels/attention/test_attention_selector.py::test_env[cuda_TRITON_MLA_mla_T_blks16] INFO 09-23 19:32:46 [cuda.py:278] Using Triton MLA backend on V1 engine.
W0923 19:32:46.830000 825531 .venv/lib/python3.12/site-packages/torch/utils/cpp_extension.py:2425] TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation. 
W0923 19:32:46.830000 825531 .venv/lib/python3.12/site-packages/torch/utils/cpp_extension.py:2425] If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'] to specific architectures.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_TRITON_MLA_mla_T_blks64] PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASHMLA_mla_T_blks16] SKIPPED (FlashMLA only supports block_size 64)
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASHMLA_mla_T_blks64] SKIPPED (FlashMLA not supported on this platform)
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASHINFER_MLA_mla_T_blks16] SKIPPED (FlashInfer MLA only supports block_size 32 or 64)
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASHINFER_MLA_mla_T_blks64] INFO 09-23 19:32:48 [cuda.py:259] Using FlashInfer MLA backend on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASH_ATTN_MLA_mla_T_blks16] INFO 09-23 19:32:48 [cuda.py:273] Using FlashAttention MLA backend on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASH_ATTN_MLA_mla_T_blks64] PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_CUTLASS_MLA_mla_T_blks16] SKIPPED (CUTLASS_MLA only supports block_size 128)
tests/kernels/attention/test_attention_selector.py::test_env[cuda_CUTLASS_MLA_mla_T_blks64] SKIPPED (CUTLASS_MLA only supports block_size 128)
tests/kernels/attention/test_attention_selector.py::test_env[hip_TRITON_MLA_mla_T_blks16] INFO 09-23 19:32:49 [rocm.py:209] Using Triton MLA backend on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[hip_TRITON_MLA_mla_T_blks1] PASSED
tests/kernels/attention/test_attention_selector.py::test_env[hip_ROCM_AITER_MLA_mla_T_blks16] PASSED
tests/kernels/attention/test_attention_selector.py::test_env[hip_ROCM_AITER_MLA_mla_T_blks1] INFO 09-23 19:32:50 [rocm.py:218] Using AITER MLA backend on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_XFORMERS_mla_F_blks16] INFO 09-23 19:32:50 [cuda.py:347] Using Flash Attention backend on V1 engine.
INFO 09-23 19:32:50 [cuda.py:364] Using FlexAttention backend for head_size=16 on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cuda_FLASHINFER_mla_F_blks16] INFO 09-23 19:32:51 [cuda.py:293] Using FlashInfer backend on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[hip_ROCM_FLASH_mla_F_blks16] INFO 09-23 19:32:51 [rocm.py:245] Using Triton Attention backend on V1 engine.
PASSED
tests/kernels/attention/test_attention_selector.py::test_env[cpu_TORCH_SDPA_mla_F_blks16] INFO 09-23 19:32:52 [cpu.py:101] Using Torch SDPA backend.
PASSED

================================================================================ warnings summary =================================================================================
.venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63
  /home/mozf/develop-projects/vllm/.venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report this to the maintainers of the package that installed pynvml for you.
    import pynvml  # type: ignore[import]

.venv/lib/python3.12/site-packages/schemathesis/generation/coverage.py:305
  /home/mozf/develop-projects/vllm/.venv/lib/python3.12/site-packages/schemathesis/generation/coverage.py:305: DeprecationWarning: jsonschema.exceptions.RefResolutionError is deprecated as of version 4.18.0. If you wish to catch potential reference resolution errors, directly catch referencing.exceptions.Unresolvable.
    ref_error: type[Exception] = jsonschema.RefResolutionError,

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============================================================= 13 passed, 5 skipped, 4 deselected, 2 warnings in 6.33s =============================================================
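
The backend names in these test ids map to values of the VLLM_ATTENTION_BACKEND environment variable that the selector reads at runtime; as a rough illustration (assuming an environment where FlashInfer is available, with <model> standing in for a real model id), the same selection can be forced when serving:

VLLM_ATTENTION_BACKEND=FLASHINFER vllm serve <model>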

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
@Isotr0py Isotr0py added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 23, 2025

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request re-enables a disabled test for the v1 attention backend selection and corrects an import path within it. The changes are straightforward and correct, successfully fixing the test as described. I find no issues with this update.

@DarkLight1337 DarkLight1337 enabled auto-merge (squash) September 23, 2025 12:35
@DarkLight1337 DarkLight1337 merged commit b6a136b into vllm-project:main Sep 23, 2025
21 of 23 checks passed
@Isotr0py Isotr0py deleted the fix-v1-mla-test branch September 23, 2025 13:05
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
charlifu pushed a commit to ROCm/vllm that referenced this pull request Sep 25, 2025
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025