
Conversation

@yma11
Contributor

@yma11 yma11 commented Sep 18, 2025

Purpose

Fix more dispatch issues on XPU introduced in #24444

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added deepseek Related to DeepSeek models llama Related to Llama models labels Sep 18, 2025
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly fixes a dispatch issue for RoPE on XPU for the llama4 and deepseek models by providing a forward_xpu method that falls back to the native PyTorch implementation. This prevents the use of an incorrect specialized kernel from the base class. My review includes suggestions to refactor the newly added methods to reduce code duplication and improve maintainability, which will help prevent potential bugs in the future.

Comment on lines 142 to 149
def forward_xpu(
    self,
    positions: torch.Tensor,
    query: torch.Tensor,
    key: Optional[torch.Tensor] = None,
    offsets: Optional[torch.Tensor] = None,
) -> tuple[torch.Tensor, Optional[torch.Tensor]]:
    return self.forward_native(positions, query, key, offsets)
Contributor

high

To improve maintainability and reduce code duplication, you can directly alias forward_native to forward_xpu. The current implementation duplicates the body of forward_cuda, and both just delegate to forward_native. Using a direct assignment makes the intent clearer and ensures that any future changes to the signature of forward_native only need to be made in one place, reducing the risk of future bugs.

    forward_xpu = forward_native
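
For illustration, a minimal sketch of the suggested class-level alias in context (the class name and the forward_native body are placeholders, not the actual vLLM code):

    class SomeRotaryEmbedding(RotaryEmbedding):
        def forward_native(
            self,
            positions: torch.Tensor,
            query: torch.Tensor,
            key: Optional[torch.Tensor] = None,
            offsets: Optional[torch.Tensor] = None,
        ) -> tuple[torch.Tensor, Optional[torch.Tensor]]:
            ...  # pure-PyTorch rotary embedding for this class's layout

        # Class-level alias: forward_xpu is bound to the forward_native
        # defined above when the class is created, so the XPU fallback
        # cannot drift out of sync with the native implementation.
        forward_xpu = forward_native

Note that this pattern only works if forward_native is defined earlier in the same class body, because the name is resolved in the class namespace at class-definition time.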

Collaborator

I think I understand the root cause now.
Before #24444, RotaryEmbedding used forward_xpu(), while its child classes such as Llama4VisionRotaryEmbedding, MRotaryEmbedding, and DeepseekScalingRotaryEmbedding used their own forward methods directly, ignoring the parent class's dispatch.
After #24444, all these child classes inherit the parent RotaryEmbedding's forward_xpu method, which does not match their implementations.
Maybe the best fix is to define a BaseRotaryEmbedding class that does no dispatch at all, and have all RoPE classes extend that base class.
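
To make that proposal concrete, here is a rough sketch of the dispatch-free base-class hierarchy; it is illustrative only (the real classes derive from vLLM's CustomOp, which provides the platform dispatch, and the method bodies are elided):

    from typing import Optional

    import torch
    import torch.nn as nn


    class BaseRotaryEmbedding(nn.Module):
        # Holds the shared RoPE state (cos/sin cache, head size, ...) and the
        # pure-PyTorch path; performs no platform dispatch of its own.
        def forward_native(
            self,
            positions: torch.Tensor,
            query: torch.Tensor,
            key: Optional[torch.Tensor] = None,
            offsets: Optional[torch.Tensor] = None,
        ) -> tuple[torch.Tensor, Optional[torch.Tensor]]:
            ...


    class RotaryEmbedding(BaseRotaryEmbedding):
        # The plain RoPE layout that the custom kernels support; only this
        # class dispatches to ops.rotary_embedding in forward_cuda/forward_xpu.
        ...


    class DeepseekScalingRotaryEmbedding(BaseRotaryEmbedding):
        # Layouts the custom kernel does not support extend the base class
        # directly, so they never inherit an incompatible forward_xpu.
        ...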

Contributor Author

Yes, that's the root cause. There is already a base class, introduced in the RoPE refactor PR #22192. In that class, the forward_xpu dispatch goes to either forward_native or ops.rotary_embedding, and that default behavior makes sense. But for cases like Llama4VisionRotaryEmbedding, MRotaryEmbedding, and DeepseekScalingRotaryEmbedding, our kernel doesn't support them; we would need to fix that at the kernel level to avoid falling back to forward_native in these child classes.
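
For reference, the base-class dispatch described here looks roughly like the following; the gate condition and the exact argument list of ops.rotary_embedding are illustrative, not copied from the vLLM source:

    def forward_xpu(
        self,
        positions: torch.Tensor,
        query: torch.Tensor,
        key: Optional[torch.Tensor] = None,
        offsets: Optional[torch.Tensor] = None,
    ) -> tuple[torch.Tensor, Optional[torch.Tensor]]:
        if offsets is not None:
            # Cases the custom kernel cannot handle fall back to pure PyTorch.
            return self.forward_native(positions, query, key, offsets)
        # Custom XPU kernel path; only valid for the plain RotaryEmbedding
        # layout, which is why subclasses that inherited this method needed
        # their own override in this PR.
        ops.rotary_embedding(positions, query, key, self.head_size,
                             self.cos_sin_cache, self.is_neox_style)
        return query, key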

Collaborator

I like the Gemini suggestion here, could you try it out?

Comment on lines 83 to 88
def forward_xpu(  # type: ignore[override]
    self,
    query: torch.Tensor,
    key: Optional[torch.Tensor] = None,
) -> tuple[torch.Tensor, Optional[torch.Tensor]]:
    return self.forward_native(query, key)
Contributor

high

To avoid code duplication and enhance maintainability, it's better to alias forward_native for forward_xpu, as both this method and forward_cuda simply call forward_native. This approach is cleaner and less prone to errors if the underlying forward_native implementation or its signature changes in the future.

    forward_xpu = forward_native  # type: ignore[override]

@xuechendi
Copy link
Contributor

Could you add a description to explain the current fix?
It looks like this PR currently copies forward_xpu into all derived RoPE classes, right?
And based on the discussion, would the better plan be to provide a non-custom-op base class and derive from that?

@ProExpertProg
Copy link
Collaborator

I think a BaseRoPE class makes sense!

@frost-intel
Copy link
Contributor

@yma11 Any movement on this? Would love to have Llama4 functional here.

@yma11 yma11 force-pushed the rope-fix branch 4 times, most recently from f23ca75 to a010ec7 on October 15, 2025 02:29
@yma11
Copy link
Contributor Author

yma11 commented Oct 15, 2025

> @yma11 Any movement on this? Would love to have Llama4 functional here.

Updated based on comments. Let's wait for the CI results.

@jikunshang
Copy link
Collaborator

Do we need to consider the other RoPE classes in this folder?

@yma11
Copy link
Contributor Author

yma11 commented Oct 15, 2025

> Do we need to consider the other RoPE classes in this folder?

They should all be covered.

@yma11
Copy link
Contributor Author

yma11 commented Oct 16, 2025

> I think a BaseRoPE class makes sense!

@ProExpertProg can you help review this PR again? A base class has been added.

@jikunshang jikunshang enabled auto-merge (squash) October 27, 2025 02:08
@jikunshang jikunshang added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 27, 2025
auto-merge was automatically disabled October 28, 2025 02:18

Head branch was pushed to by a user without write access

@yma11 yma11 force-pushed the rope-fix branch 2 times, most recently from ae13379 to a9d1af1 on October 29, 2025 02:17
yma11 added 2 commits October 29, 2025 08:12
Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: Yan Ma <yan.ma@intel.com>
@jikunshang jikunshang merged commit b798e39 into vllm-project:main Oct 30, 2025
47 checks passed
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Oct 30, 2025
ilmarkov pushed a commit to neuralmagic/vllm that referenced this pull request Nov 7, 2025
ZhengHongming888 pushed a commit to ZhengHongming888/vllm that referenced this pull request Nov 8, 2025
rtourgeman pushed a commit to rtourgeman/vllm that referenced this pull request Nov 10, 2025
eldarkurtic pushed a commit to eldarkurtic/vllm that referenced this pull request Nov 12, 2025
Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: Eldar Kurtic <8884008+eldarkurtic@users.noreply.github.com>