Delete useless allgather in qwen2_5_vl vit attention #21493

alwayshope25 · 2025-07-24T03:18:41Z

Essential Elements of an Effective PR Description Checklist

The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
The test plan, such as providing test command.
The test results, such as pasting the results comparison before and after, or e2e results
(Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.

Purpose

The all-gather + split_tensor_along_last_dim operation after the fused QKV computation can be replaced by a direct split along the last dimension. This removes redundant communication and improves inference performance.

Test Plan

vllm serve --model=/home/jenkins/Qwen2.5-VL-3B-Instruct/ --trust_remote_code --tensor_parallel_size=8 --max_model_len=32768 --max-num-seqs 64

Test Result

1080p image, parallel-num=64, output-tokens=256; verify the token throughput.

before optimization

after optimization

(Optional) Documentation Update

gemini-code-assist

Code Review

This pull request addresses a performance issue in the Qwen2_5_VisionAttention module by removing a redundant all_gather operation. The original implementation performed an all-gather on a tensor-parallel sharded QKV tensor, only to immediately split it back, which is an expensive no-op. The change correctly removes this unnecessary communication overhead, leading to better performance in distributed settings. The code is now simpler and more efficient. The change is correct and well-justified.

github-actions · 2025-07-24T03:29:26Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

DarkLight1337

cc @fyabc can you validate this?

ywang96

I vaguely remember this allgather was added on purpose because of an issue we saw while adding Qwen2.5 Omni support, which broke Qwen2.5VL.

Could you verify if both models work under all TP scenarios? Thank you!

DarkLight1337 · 2025-07-24T14:28:41Z

See #16907, #16974

zejunchen-zejun · 2025-07-25T09:17:06Z

This makes sense. Each TP rank already holds its own shard of the WQ, WK, and WV weights, so the activation it produces is already the matching slice of q, k, and v. There is no need to all-gather and interleave-concatenate the activations.
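The argument above can be checked with a small single-process sketch. It is a toy model, not the vLLM code: it assumes each rank's fused output is laid out along the last dimension as [q_r | k_r | v_r], and that the gather regroups the shards as [Q | K | V] before re-splitting. The helper names (chunk, local_qkv, gather_then_split, split_locally) are hypothetical and exist only for illustration. It does not cover the Qwen2.5 Omni case mentioned earlier in the thread.

```python
def chunk(xs, n):
    """Split a list into n equal contiguous parts."""
    assert len(xs) % n == 0
    step = len(xs) // n
    return [xs[i * step:(i + 1) * step] for i in range(n)]


def local_qkv(rank, width):
    """Hypothetical per-rank fused output, assumed layout [q_r | k_r | v_r]."""
    return [f"{name}{rank}.{i}" for name in "qkv" for i in range(width)]


def gather_then_split(shards, rank):
    """Old path: gather all shards, regroup as [Q | K | V], then re-split per rank."""
    tp_size = len(shards)
    per_rank = [chunk(s, 3) for s in shards]  # per_rank[r] = [q_r, k_r, v_r]
    full = [x
            for part in range(3)              # all q first, then all k, then all v
            for r in range(tp_size)
            for x in per_rank[r][part]]
    q, k, v = chunk(full, 3)
    return tuple(chunk(t, tp_size)[rank] for t in (q, k, v))


def split_locally(shards, rank):
    """New path: split the rank's own shard into q, k, v with no communication."""
    return tuple(chunk(shards[rank], 3))


width = 2  # toy per-rank width of each of q, k, v
results_match = True
for tp_size in (1, 2, 4, 8):
    shards = [local_qkv(r, width) for r in range(tp_size)]
    for r in range(tp_size):
        if gather_then_split(shards, r) != split_locally(shards, r):
            results_match = False

print(results_match)
```

Under these layout assumptions the two paths return identical q, k, v slices for every rank, which is why the gather only adds communication cost. If the real per-rank layout differs from [q_r | k_r | v_r], the equivalence would need to be rechecked.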

github-actions · 2025-10-24T02:02:59Z

This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

mergify bot added the qwen Related to Qwen models label Jul 24, 2025

gemini-code-assist bot reviewed Jul 24, 2025

alwayshope25 force-pushed the main branch from 31e101f to f609be1 Compare July 24, 2025 03:43

Delete useless allgather in qwen2_5_vl vit attention

48ccf7e

alwayshope25 force-pushed the main branch from f609be1 to 48ccf7e Compare July 24, 2025 03:47

DarkLight1337 reviewed Jul 24, 2025

ywang96 reviewed Jul 24, 2025

github-actions bot added the stale Over 90 days of inactivity label Oct 24, 2025
