[VLM] Merged multi-modal processor and V1 support for Qwen-VL #12504
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
(Not to be confused with Qwen2-VL, which is already supported, or Qwen2.5-VL, which is not supported yet.)
I have tested V1 locally and the model output is basically the same, but I'm not entirely sure why an extra EOS token is being emitted.
V0 output:
V1 output (used TP=4 to avoid OOM locally):
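A quick way to pin down a discrepancy like this is to diff the generated token IDs from the two engines directly rather than eyeballing decoded text. A minimal sketch (the helper name and the EOS token ID are illustrative, not part of this PR; vLLM's `RequestOutput` exposes the generated IDs via `outputs[0].token_ids`):

```python
def diff_outputs(v0_ids, v1_ids, eos_token_id):
    """Describe how two generated token-ID sequences differ.

    Distinguishes the benign case seen in this PR -- V1 appending a
    single extra EOS token -- from a real divergence in the outputs.
    """
    if v0_ids == v1_ids:
        return "identical"
    # V1 matches V0 except for one trailing EOS token.
    if v1_ids and v1_ids[:-1] == v0_ids and v1_ids[-1] == eos_token_id:
        return "extra trailing EOS"
    return "diverged"


# Example with made-up token IDs (9 stands in for the model's EOS ID):
print(diff_outputs([1, 2, 3], [1, 2, 3, 9], 9))  # extra trailing EOS
print(diff_outputs([1, 2, 3], [1, 2, 3], 9))     # identical
```

If the result is "extra trailing EOS", the decoded strings should match once special tokens are skipped during detokenization.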
I'll work on updating the model development guide with an example of a custom HF processor in another PR.