[Attention] Register FLASHMLA_SPARSE #26441
Conversation
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Code Review
This pull request correctly registers the FLASHMLA_SPARSE attention backend. The changes are consistent with the existing structure for registering backends. I've reviewed the modifications in vllm/attention/backends/registry.py and vllm/v1/attention/backends/mla/flashmla_sparse.py and found no issues of high or critical severity. The new enum member and its corresponding entry in BACKEND_MAP are correctly added, and the name returned by get_name is now consistent with its registration key, per in-tree conventions.
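For context, a minimal sketch of what a registration like this typically looks like. The enum name, class name, dictionary shape, and import string below are assumptions for illustration and may not match the exact in-tree diff; only BACKEND_MAP, registry.py, flashmla_sparse.py, and get_name are named in the review above.

```python
# Hypothetical sketch of the registration described above; enum name,
# backend class name, and import string are assumptions, not the diff.
import enum


class _Backend(enum.Enum):
    # ... existing backends ...
    FLASHMLA = enum.auto()
    FLASHMLA_SPARSE = enum.auto()  # new member added by this PR


# Maps each backend enum member to the import path of its backend class.
BACKEND_MAP: dict[_Backend, str] = {
    # ... existing entries ...
    _Backend.FLASHMLA_SPARSE: (
        "vllm.v1.attention.backends.mla.flashmla_sparse.FlashMLASparseBackend"
    ),
}


class FlashMLASparseBackend:
    @staticmethod
    def get_name() -> str:
        # Kept consistent with the registration key above so lookups by
        # name resolve to this backend.
        return "FLASHMLA_SPARSE"
```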
LGTM, thanks for the work!
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
…to loader * 'loader' of https://github.com/dsxsteven/vllm_splitPR: (778 commits)
- [torchao] Add support for ModuleFqnToConfig using regex (vllm-project#26001)
- Add: Support for multiple hidden layers in Eagle3 (vllm-project#26164)
- Enable `RMSNorm` substitution for Transformers backend (vllm-project#26353)
- [Model] Gemma3: Fix GGUF loading and quantization (vllm-project#26189)
- Bump Flashinfer to v0.4.0 (vllm-project#26326)
- Update Dockerfile and install runai-model-streamer[gcs] package (vllm-project#26464)
- [Core] Relax the LoRA max rank (vllm-project#26461)
- [CI/Build] Fix model nightly tests (vllm-project#26466)
- [Hybrid]: Decouple Kernel Block Size from KV Page Size (vllm-project#24486)
- [Core][KVConnector] Propagate all tokens on resumed preemptions (vllm-project#24926)
- [MM][Doc] Add documentation for configurable mm profiling (vllm-project#26200)
- [Hardware][AMD] Enable FlexAttention backend on ROCm (vllm-project#26439)
- [Bugfix] Incorrect another MM data format in vllm bench throughput (vllm-project#26462)
- [Bugfix] Catch and log invalid token ids in detokenizer #2 (vllm-project#26445)
- [Minor] Change warning->warning_once in preprocess (vllm-project#26455)
- [Bugfix] Set the minimum python version for gpt-oss (vllm-project#26392)
- [Misc] Redact ray runtime env before logging (vllm-project#26302)
- Separate MLAAttention class from Attention (vllm-project#25103)
- [Attention] Register FLASHMLA_SPARSE (vllm-project#26441)
- [Kernels] Modular kernel refactor (vllm-project#24812)
- ...
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Purpose
Add FLASHMLA_SPARSE to the backend registry.
Test Plan
CI should suffice
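Beyond CI, a quick manual way to exercise the new registry entry is sketched below. The VLLM_ATTENTION_BACKEND override variable is standard vLLM usage, but the assumption that BACKEND_MAP and the backend enum are importable from registry.py under these names is exactly that: an assumption, not something this PR guarantees.

```python
# Hedged sketch: exercising the new registry entry. Whether the
# FLASHMLA_SPARSE kernels are usable depends on hardware and model,
# so treat this as illustrative rather than a supported test recipe.
import os

# Ask vLLM to use the newly registered backend for a subsequent run.
os.environ["VLLM_ATTENTION_BACKEND"] = "FLASHMLA_SPARSE"

# Confirm the registry maps the new member to a backend class path
# (import names here are assumptions about registry.py's public surface).
from vllm.attention.backends.registry import BACKEND_MAP, _Backend

print(BACKEND_MAP[_Backend.FLASHMLA_SPARSE])
```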
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.