[Bugfix][ROCm][DeepSeek] Fix for forward_hip in rope for DeepSeek #27373
Conversation
Code Review
This pull request addresses a bug in the forward_hip method within the RotaryEmbedding class for ROCm platforms. The previous implementation incorrectly assumed that its fallback to forward_cuda would always involve an in-place modification of the query and key tensors. This assumption fails for certain implementations like DeepSeek's deepseek_scaling_rope, which do not modify tensors in-place. The fix rectifies this by returning the result from forward_cuda directly, making the forward_hip method robust to both in-place and out-of-place forward_cuda implementations. This change is a clear and correct fix that enhances the reliability of the rotary embedding layer.
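To make the failure mode concrete, here is a minimal, hypothetical sketch (the class names BaseRope and OutOfPlaceRope are illustrative, not vLLM's actual classes) of a base class whose forward_hip assumes forward_cuda mutates query/key in place, and a subclass whose forward_cuda is out-of-place, mirroring the deepseek_scaling_rope situation:

```python
import torch


class BaseRope(torch.nn.Module):
    """Hypothetical stand-in for the base rotary embedding class."""

    def forward_cuda(self, positions, query, key):
        # Placeholder in-place rotation: the inputs are mutated, so
        # ignoring the return value still "works" for this implementation.
        query.mul_(2.0)
        key.mul_(2.0)
        return query, key

    def forward_hip(self, positions, query, key):
        # Buggy assumption: forward_cuda always mutates query/key in place,
        # so its return value can be discarded.
        self.forward_cuda(positions, query, key)
        return query, key


class OutOfPlaceRope(BaseRope):
    """Stand-in for an override like deepseek_scaling_rope's forward_cuda."""

    def forward_cuda(self, positions, query, key):
        # Out-of-place: fresh tensors are returned, the inputs stay untouched.
        return query * 2.0, key * 2.0


if __name__ == "__main__":
    q = torch.ones(2, 4)
    k = torch.ones(2, 4)
    rope = OutOfPlaceRope()
    out_q, _ = rope.forward_hip(None, q, k)
    # With the in-place assumption, the rotation is silently dropped.
    print(torch.allclose(out_q, torch.ones(2, 4)))  # True -> bug
```

Running the sketch prints True: the out-of-place result is computed and then silently discarded by the base class's forward_hip, which is the behavior this pull request fixes.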
SageMoore left a comment
It looks like this is an inheritance bug where the subclass overrides forward_cuda with an out-of-place implementation? Nice find.
@gshtras A question: why not choose to override forward_hip at the subclass level, e.g.

def forward_hip(
    self,
    positions: torch.Tensor,
    query: torch.Tensor,
    key: torch.Tensor | None = None,
    offsets: torch.Tensor | None = None,
) -> tuple[torch.Tensor, torch.Tensor | None]:
    return self.forward_native(positions, query, key, offsets)

Is the
vllm/vllm/model_executor/layers/rotary_embedding/deepseek_scaling_rope.py, lines 108 to 147 in 3fa2c12
CC @SageMoore
Starting with #25135, RoPE's forward_hip falls back to forward_cuda while assuming that the function modifies the values in place, which is not the case for DeepSeek's deepseek_scaling_rope implementation.
Instead of relying on an in-place modification, we should return the values from the forward function implementation.
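A minimal sketch of the change described above, assuming the forward_hip signature quoted earlier in the thread; this is not the exact patch, only the shape of the fix (propagating forward_cuda's return value instead of echoing the input tensors):

```python
import torch


class RotaryEmbeddingSketch:
    """Hedged sketch for illustration; not vLLM's actual RotaryEmbedding."""

    def forward_cuda(
        self,
        positions: torch.Tensor,
        query: torch.Tensor,
        key: torch.Tensor | None = None,
        offsets: torch.Tensor | None = None,
    ) -> tuple[torch.Tensor, torch.Tensor | None]:
        # Placeholder: real implementations may be in-place or out-of-place.
        return query, key

    def forward_hip(
        self,
        positions: torch.Tensor,
        query: torch.Tensor,
        key: torch.Tensor | None = None,
        offsets: torch.Tensor | None = None,
    ) -> tuple[torch.Tensor, torch.Tensor | None]:
        # Before (buggy): discard forward_cuda's result and return the inputs,
        # which only works when forward_cuda mutates query/key in place:
        #   self.forward_cuda(positions, query, key, offsets)
        #   return query, key
        #
        # After (fix): return whatever forward_cuda produced, so out-of-place
        # overrides such as deepseek_scaling_rope also work on ROCm.
        return self.forward_cuda(positions, query, key, offsets)
```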