[UX] Fail if an invalid attention backend is specified #22217
Conversation
Signed-off-by: mgoin <michael@neuralmagic.com>
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs do not trigger a full CI run by default; only a limited subset runs automatically. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
Code Review
This pull request correctly changes the behavior to fail fast when an invalid attention backend is specified via an environment variable, rather than silently falling back to a default. The implementation is straightforward and the tests are updated accordingly to verify the new exception-raising behavior. I've identified one potential issue related to caching that could cause stale environment variable values to be used, which could lead to unexpected behavior.
```python
if selected_backend is None:
    raise ValueError(
        f"Invalid attention backend: '{backend_by_env_var}'. "
        f"Valid backends are: {list(_Backend.__members__.keys())}")
```
The _cached_get_attn_backend function is decorated with @cache, which memoizes its results. However, it reads envs.VLLM_ATTENTION_BACKEND directly. This can lead to incorrect behavior if the environment variable is changed while the process is running, as a cached result based on a stale value of the environment variable might be returned.
A similar issue is already addressed for VLLM_USE_V1 in the get_attn_backend wrapper function, where the environment variable is read and passed as an argument to the cached function. A similar approach should be taken for VLLM_ATTENTION_BACKEND to ensure that changes to the backend configuration are always respected.
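A minimal sketch of that approach, using simplified stand-in functions rather than vLLM's actual selector code: the wrapper reads the environment variables on every call and passes them as arguments, so they become part of the cache key and a stale value cannot be returned.

```python
import os
from functools import cache


@cache
def _cached_get_attn_backend(backend_name, use_v1):
    # Both env-derived values are arguments, hence part of the cache key.
    return f"backend={backend_name}, v1={use_v1}"


def get_attn_backend():
    # Read the environment variables in the uncached wrapper on every call.
    backend_name = os.environ.get("VLLM_ATTENTION_BACKEND")
    use_v1 = os.environ.get("VLLM_USE_V1", "0") == "1"
    return _cached_get_attn_backend(backend_name, use_v1)
```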
DarkLight1337 left a comment:
Makes sense, thanks for improving the UX!
Plugin Tests are failing because of this PR: https://buildkite.com/vllm/ci/builds/26033/steps/canvas?sid=01987a40-050a-40d2-b36c-92e23ebf4d0c
Purpose
As the title states, I think that if an invalid attention backend is manually specified, e.g. VLLM_ATTENTION_BACKEND=INVALID, it should fail rather than fall back to some default.
Rob found that PR #21966 causes a fallback to V0 if that environment variable is set incorrectly.
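A hypothetical pytest-style check of the intended fail-fast behavior, using a stand-in validator rather than vLLM's actual selector or test code:

```python
import pytest

_VALID_BACKENDS = ["FLASH_ATTN", "XFORMERS", "TORCH_SDPA"]  # illustrative subset


def select_backend(name: str) -> str:
    # Stand-in for the validation added in this PR.
    if name not in _VALID_BACKENDS:
        raise ValueError(f"Invalid attention backend: '{name}'. "
                         f"Valid backends are: {_VALID_BACKENDS}")
    return name


def test_invalid_backend_raises():
    # Previously this silently fell back to a default; now it should error out.
    with pytest.raises(ValueError, match="Invalid attention backend"):
        select_backend("INVALID")
```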
Test Plan
CI
Test Result