Skip to content

Conversation

@MatthewBonanni
Copy link
Contributor

@MatthewBonanni MatthewBonanni commented Sep 29, 2025

Purpose

This PR continues the attention backend selection refactor #24794, which is now being split up. Here, we allow backends to register themselves in a BACKEND_MAP which will be used later in the refactor. This also enables out-of-tree backends for other hardware to plug in neatly.

Test Plan

CI should be sufficient

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the attention backend selection mechanism by introducing a centralized registry. Backends now register themselves using a decorator, which is a significant improvement in modularity and removes hardcoded mappings. The changes are extensive and touch many files, mostly for updating imports and applying the new registration decorator.

The core logic is sound, but I've identified one high-severity issue related to testing on ROCm. The refactoring removed a platform-specific hack from a test utility without updating the tests that relied on it. This will lead to tests being skipped on ROCm, reducing test coverage for that platform. My review comment provides details on this issue.

@MatthewBonanni MatthewBonanni changed the title Backends register themselves in BACKEND_MAP [Attention] Backends register themselves in BACKEND_MAP Sep 30, 2025
@mergify
Copy link

mergify bot commented Sep 30, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @MatthewBonanni.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@MatthewBonanni MatthewBonanni marked this pull request as ready for review October 3, 2025 13:39
@MatthewBonanni
Copy link
Contributor Author

cc @ILikeIneine this has the registration mechanism you proposed in #24794!

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Copy link
Collaborator

@NickLucche NickLucche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM , thanks @MatthewBonanni !

Left a few minor comments, also please check with AMD folks if the backend renaming is fine (just to be on the safe side).

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@MatthewBonanni
Copy link
Contributor Author

@NickLucche Thanks for your review! I've addressed your comments.

@NickLucche NickLucche added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 8, 2025
@mergify
Copy link

mergify bot commented Oct 8, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @MatthewBonanni.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Oct 8, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify mergify bot removed the needs-rebase label Oct 8, 2025
… to selection. Leave decorator to enable out-of-tree backend plugins

Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
@mergify mergify bot removed the tpu Related to Google TPUs label Oct 8, 2025
@MatthewBonanni MatthewBonanni changed the title [Attention] Backends register themselves in BACKEND_MAP [Attention] Implement universal BACKEND_MAP Oct 8, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
}


def register_attn_backend(backend: _Backend, class_path: Optional[str] = None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this work for OOT backends, wouldn't they have to extend the _Backend enum?

Copy link
Contributor Author

@MatthewBonanni MatthewBonanni Oct 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan is for them to override an existing backend (see #24794 (comment)). We could extend this in the future (and find an alternative to the enum) to make it cleaner though.

@simon-mo simon-mo merged commit 76879cc into vllm-project:main Oct 8, 2025
50 of 53 checks passed
mrasquinha-g pushed a commit to mrasquinha-g/vllm that referenced this pull request Oct 9, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
zhiyuan1i pushed a commit to zhiyuan1i/vllm that referenced this pull request Oct 9, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
Signed-off-by: Matthew Bonanni <mbonanni@redhat.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kv-connector qwen Related to Qwen models ready ONLY add when PR is ready to merge/full CI is needed rocm Related to AMD ROCm speculative-decoding v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants