
Conversation

@angelayi (Contributor) commented on Oct 11, 2025:

When trying to run the following command, I ran into an assertion error because `self.backend` is equal to `""`.

vllm bench latency \
    --model=RedHatAI/Meta-Llama-3.1-8B-Instruct-FP8 \
    --output-len 1 --input-len 8192 --batch-size 1 \
    --tensor-parallel-size 8 --load-format dummy \
    --num_iters_warmup 5 --num_iters 15 \
    -O '{"level": 3, "pass_config": {"enable_async_tp": true, "enable_sequence_parallelism": true}, "use_inductor_graph_partition": true, "custom_ops":["+quant_fp8"], "cudagraph_mode":"FULL_AND_PIECEWISE"}' \
    --no-enable-prefix-caching
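
For context, a minimal sketch of why the check trips (a simplified stand-in for the relevant config fields, not vLLM's actual class): when no backend is set explicitly, `self.backend` defaults to the empty string, so an equality test against `"inductor"` is false even though inductor is the backend actually in use.

```python
# Simplified stand-in for the relevant CompilationConfig fields
# (names taken from this PR; the real class has many more options).
class CompilationConfig:
    def __init__(self, backend: str = "") -> None:
        self.backend = backend  # "" means no backend was set explicitly

    def is_attention_compiled_piecewise(self) -> bool:
        # Pre-fix check: False for the default "", even when inductor
        # is in use, so the downstream assertion fires.
        return self.backend == "inductor"


print(CompilationConfig().is_attention_compiled_piecewise())  # False
```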

cc @ProExpertProg @zou3519 @BoyuanFeng @baonudesifeizhai

Signed-off-by: angelayi <yiangela7@gmail.com>
@gemini-code-assist (bot) left a comment:


Code Review

This pull request correctly fixes a bug in the inductor partition configuration. The is_attention_compiled_piecewise method was previously checking self.backend == "inductor", which would incorrectly evaluate to false when the default backend (an empty string) is used with inductor. The change to check self.use_inductor is the correct approach, as this flag accurately indicates whether inductor compilation is enabled. This resolves the assertion error described in the pull request.
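
As a rough sketch of the corrected check (again a simplified stand-in; the real method in vLLM's compilation config carries more logic):

```python
class CompilationConfig:
    def __init__(self, use_inductor: bool = True) -> None:
        # Tracks whether inductor compilation is enabled, independent
        # of whether `backend` was set explicitly.
        self.use_inductor = use_inductor

    def is_attention_compiled_piecewise(self) -> bool:
        # Post-fix check: consult the flag that actually reflects
        # whether inductor is in use.
        return self.use_inductor


print(CompilationConfig().is_attention_compiled_piecewise())  # True
```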

@ProExpertProg (Collaborator) left a comment:


Yep, bad merge of #25845 after #26113 was reverted in #26472. @morrison-turnansky can you fix in #26502?

@ProExpertProg enabled auto-merge (squash) on October 11, 2025, 19:12
@github-actions (bot) added the ready label (ONLY add when PR is ready to merge/full CI is needed) on Oct 11, 2025
@ProExpertProg merged commit 01653a9 into vllm-project:main on Oct 11, 2025
48 checks passed
@morrison-turnansky (Contributor) commented:
@ProExpertProg yes, will do

1994 pushed a commit to 1994/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: 1994 <1994@users.noreply.github.com>
Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>
bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: bbartels <benjamin@bartels.dev>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025
Signed-off-by: angelayi <yiangela7@gmail.com>
Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
