Conversation

@Lucaskabela (Contributor) commented Oct 9, 2025

Purpose

#25696 converted GroupShape to a list where possible for the cutlass custom ops; however, the same change was not made for the aiter or triton implementations, which causes those code paths to fail during dynamo tracing.
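
For readers outside this code path, here is a minimal, self-contained sketch of the pattern the PR applies. It is an illustration under assumptions, not the actual vLLM code: GroupShape is mimicked as a NamedTuple, and hypothetical_fp8_quant stands in for the aiter/triton quant custom op (requires a recent PyTorch with torch.library.custom_op). The point is that the custom-op schema declares a plain list of ints, so the call site converts the group shape with list() before crossing the op boundary.

    # Sketch only: `GroupShape` and `hypothetical_fp8_quant` are stand-ins, not
    # the real vLLM symbols changed by this PR.
    from typing import NamedTuple

    import torch


    class GroupShape(NamedTuple):
        rows: int
        cols: int


    @torch.library.custom_op("demo::hypothetical_fp8_quant", mutates_args=())
    def hypothetical_fp8_quant(x: torch.Tensor, group_shape: list[int]) -> torch.Tensor:
        # Custom-op schemas accept plain int lists, not arbitrary Python objects.
        return x.clone()


    @hypothetical_fp8_quant.register_fake
    def _(x: torch.Tensor, group_shape: list[int]) -> torch.Tensor:
        return torch.empty_like(x)


    def forward(x: torch.Tensor, weight_group_shape: GroupShape) -> torch.Tensor:
        # The fix pattern: hand the op a plain list, mirroring what #25696
        # already did for the cutlass path.
        return hypothetical_fp8_quant(x, list(weight_group_shape))


    out = torch.compile(forward)(torch.randn(4, 4), GroupShape(128, 128))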

Test Plan

Run DeepSeek-R1-0528 on AMD hardware with FP8 kernels:

FLASH_ATTENTION_TRITON_AMD_ENABLE=TRUE VLLM_USE_V1=1 VLLM_MLA_DISABLE=0 VLLM_FP8_PADDING=1 VLLM_USE_TRITON_FLASH_ATTN=1 VLLM_USE_ROCM_FP8_FLASH_ATTN=0 HSA_NO_SCRATCH_RECLAIM=1 VLLM_USE_STANDALONE_COMPILE=1 with-proxy python examples/offline_inference/basic/generate.py --model=deepseek-ai/DeepSeek-R1-0528 --max-model-len=1024 -tp=8


@Lucaskabela marked this pull request as ready for review October 9, 2025 21:14
@bradleyhd (Contributor) commented:
Can confirm this unblocks our internal AMD pipeline, thanks @Lucaskabela!

Diff hunk under review:

    input_scale,
    weight_scale,
-   self.weight_group_shape,
+   list(self.weight_group_shape),
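
One hedged aside on the one-line change above, assuming GroupShape behaves like a two-field NamedTuple (an assumption for illustration, not something this diff shows): list() only changes the container type handed to the op, not the group-shape values.

    # Illustrative stand-in for GroupShape; not the vLLM definition.
    from typing import NamedTuple


    class GroupShape(NamedTuple):
        rows: int
        cols: int


    weight_group_shape = GroupShape(128, 128)
    # Same group-shape values either way; only the container type differs.
    assert list(weight_group_shape) == [128, 128]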
Collaborator (review comment on the diff hunk above):
Is it possible to add a test somehow? I don't know how vLLM CI runs AMD tests.

@zou3519 requested a review from ProExpertProg October 9, 2025 21:49
@zou3519 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Oct 9, 2025
@zou3519 (Collaborator) commented Oct 9, 2025:
If someone has an idea for how to write a test for this, please shout (I'm not very good with how AMD works in vLLM CI); otherwise, this is pretty self-contained, and we verified that it fixes a compile regression.

@yewentao256 (Member) reviewed:
LGTM, thanks for the work!

@ProExpertProg (Collaborator) reviewed:
Sorry we missed this initially, thanks for fixing! AMD testing is in a pretty poor state. cc @Alexei-V-Ivanov-AMD @gshtras

@zou3519 enabled auto-merge (squash) October 10, 2025 11:14
@zou3519 force-pushed the lucaskabela/quant_dynamo_amd_fix branch from c9a82a6 to 7206439 October 10, 2025 11:21
@zou3519 merged commit 213b644 into vllm-project:main Oct 10, 2025
52 checks passed
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

Labels: ready (ONLY add when PR is ready to merge/full CI is needed), rocm (Related to AMD ROCm)

5 participants