
Conversation

@yma11 (Contributor) commented Sep 12, 2025

Purpose

Fix the MRoPE dispatch issue on XPU introduced in #24444.

Test Plan

VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn python3 examples/offline_inference/basic/generate.py --model Qwen/Qwen2.5-VL-7B-Instruct --enforce-eager

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing the test command.
  • The test results, such as pasting a before/after comparison or e2e results.
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Yan Ma <yan.ma@intel.com>
@yma11 (author) commented Sep 12, 2025

@jikunshang please take a look.

@gemini-code-assist (bot) left a comment


Code Review

This pull request addresses a dispatch issue for MRoPE (Multimodal Rotary Position Embedding) on XPU devices. The MRotaryEmbedding class was incorrectly inheriting the forward_xpu method from its RotaryEmbedding base class. The base implementation lacks the logic required for multimodal inputs (where positions.ndim == 2), which led to incorrect behavior.

The fix introduces a forward_xpu method in MRotaryEmbedding that dispatches to its own forward_native implementation, which contains the logic needed to handle multimodal inputs. This aligns the XPU path with the existing CPU fallback, providing a correct execution path on XPU devices and resolving the bug.
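The inheritance problem and the fix described above can be sketched as follows. This is a simplified illustration with stub method bodies, not the actual vLLM implementation: the real classes apply rotary embeddings to query/key rather than returning them unchanged, and `ndim` here is a stand-in for `tensor.ndim`.

```python
def ndim(x):
    # Stand-in for tensor.ndim: nesting depth of a plain list.
    return 1 + ndim(x[0]) if isinstance(x, list) else 0


class RotaryEmbedding:
    """Base RoPE: its XPU path only handles 1-D position tensors."""

    def forward_native(self, positions, query, key):
        if ndim(positions) != 1:
            raise ValueError("base RoPE expects 1-D positions")
        return query, key  # stub: the real code rotates query/key

    def forward_xpu(self, positions, query, key):
        # Models the base XPU path, which routes through the 1-D-only
        # base implementation and so rejects 2-D multimodal positions.
        return RotaryEmbedding.forward_native(self, positions, query, key)


class MRotaryEmbedding(RotaryEmbedding):
    """MRoPE: positions may be 2-D (sections x tokens) for multimodal inputs."""

    def forward_native(self, positions, query, key):
        if ndim(positions) not in (1, 2):
            raise ValueError("MRoPE expects 1-D or 2-D positions")
        return query, key  # stub

    def forward_xpu(self, positions, query, key):
        # The fix: dispatch to MRoPE's own native path so 2-D
        # multimodal positions are handled on XPU.
        return self.forward_native(positions, query, key)
```

Without the override, MRotaryEmbedding inherits the base forward_xpu and 2-D multimodal positions hit the 1-D-only path; with it, they go through MRoPE's own forward_native.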

key = torch.cat((key_rot, key_pass), dim=-1).reshape(key_shape)
return query, key

def forward_xpu(
A collaborator commented:

What will XPU call without this?

@yma11 (author) commented Sep 12, 2025


https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/rotary_embedding/base.py#L122, I think this is the expected path, but there is a tensor-mismatch error when calling the kernel, and strangely we did not run in this path previously. I will look into this further, but we need this fix to unblock Qwen2.5 VL.
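For context, the platform-based dispatch the comment above refers to can be sketched roughly like this. The names below are illustrative, not vLLM's actual API: the point is that resolving `forward_<platform>` by attribute lookup walks the class hierarchy, so a subclass that does not override forward_xpu silently inherits the base class's version.

```python
class Op:
    """Toy model of platform-based method dispatch (hypothetical names)."""

    def __init__(self, platform: str):
        # Resolve the platform-specific implementation once. getattr
        # walks the MRO, so an *inherited* forward_xpu is found too,
        # which is how a subclass can pick up a base-class XPU path
        # that does not understand its inputs.
        self._forward = getattr(self, f"forward_{platform}", self.forward_native)

    def forward_native(self, x):
        return ("native", x)

    def __call__(self, x):
        return self._forward(x)


class BaseRoPE(Op):
    def forward_xpu(self, x):
        return ("base-xpu", x)


class MRoPE(BaseRoPE):
    # Before the fix there was no override here, so BaseRoPE.forward_xpu
    # ran. After the fix, XPU routes to this class's native path.
    def forward_xpu(self, x):
        return self.forward_native(x)
```

With this model, `MRoPE("xpu")` now calls its own native path, while `BaseRoPE("xpu")` still uses the base XPU path, and an unknown platform falls back to forward_native.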

@jikunshang jikunshang added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 12, 2025
@DarkLight1337 DarkLight1337 merged commit 4d7c1d5 into vllm-project:main Sep 12, 2025
55 checks passed
skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
MengqingCao pushed a commit to MengqingCao/vllm that referenced this pull request Sep 13, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
dsxsteven pushed a commit to dsxsteven/vllm_splitPR that referenced this pull request Sep 15, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>

Labels

ready ONLY add when PR is ready to merge/full CI is needed


3 participants