Skip to content

Conversation

@amd-hhashemi
Copy link
Contributor

@amd-hhashemi amd-hhashemi commented Sep 16, 2025

Purpose

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@mergify mergify bot added the rocm Related to AMD ROCm label Sep 16, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request adds support for bias in skinny GEMM kernels for fp16 and bf16 data types. The changes are mostly correct, but there is a critical issue in how the optional bias tensor is handled. The current implementation will fail if None is passed for the bias from Python, and it is also vulnerable to a division-by-zero error if an empty tensor is passed. My review provides a set of fixes to correctly handle the optional bias tensor using c10::optional<at::Tensor>, which involves changes to the op registration, C++ function signatures, and bias handling logic.

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
@mergify
Copy link

mergify bot commented Sep 20, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @amd-hhashemi.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Sep 20, 2025
…ny gemm tests to be zero-centered, to avoid saturation and false passes.

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
@mergify mergify bot removed the needs-rebase label Sep 20, 2025
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
@amd-hhashemi amd-hhashemi changed the title Add skinny gemm bias support for dtypes fp16,bf16 Add skinny gemm bias support for dtypes fp16,bf16,fp8 Sep 20, 2025
@gshtras gshtras added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 22, 2025
@gshtras gshtras enabled auto-merge (squash) September 22, 2025 14:52
amd-hhashemi and others added 2 commits September 22, 2025 09:21
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
auto-merge was automatically disabled September 22, 2025 17:40

Head branch was pushed to by a user without write access

@gshtras gshtras enabled auto-merge (squash) September 22, 2025 19:20
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
auto-merge was automatically disabled September 23, 2025 07:36

Head branch was pushed to by a user without write access

@gshtras gshtras changed the title Add skinny gemm bias support for dtypes fp16,bf16,fp8 [ROCm] Add skinny gemm bias support for dtypes fp16,bf16,fp8 Sep 23, 2025
@gshtras gshtras merged commit a3a7828 into vllm-project:main Sep 23, 2025
78 checks passed
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
…oject#24988)

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
gjc0824 pushed a commit to gjc0824/vllm that referenced this pull request Oct 10, 2025
…oject#24988)

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
Signed-off-by: gaojc <1055866782@qq.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
…oject#24988)

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
choprahetarth pushed a commit to Tandemn-Labs/vllm that referenced this pull request Oct 11, 2025
…oject#24988)

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
…oject#24988)

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…oject#24988)

Signed-off-by: Hashem Hashemi <hashem.hashemi@amd.com>
Signed-off-by: Hashem Hashemi <159079214+amd-hhashemi@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready ONLY add when PR is ready to merge/full CI is needed rocm Related to AMD ROCm

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants