[VLM] Merged multi-modal processor and V1 support for Qwen-VL #12504
Conversation
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.
(Not to be confused with Qwen2-VL, which is already supported, or Qwen2.5-VL, which is not supported yet.)
I have tested V1 locally and the model output is basically the same, but I'm not entirely sure why an extra EOS token is being emitted.
V0 output:
V1 output (used TP=4 to avoid OOM locally):
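A quick way to pin down a discrepancy like this is to diff the generated token IDs from the two engines directly rather than eyeballing decoded text. A minimal sketch (the helper name and the EOS token ID are illustrative, not part of this PR; vLLM's `RequestOutput` exposes the generated IDs via `outputs[0].token_ids`):

```python
def diff_outputs(v0_ids, v1_ids, eos_token_id):
    """Describe how two generated token-ID sequences differ.

    Distinguishes the benign case seen in this PR -- V1 appending a
    single extra EOS token -- from a real divergence in the outputs.
    """
    if v0_ids == v1_ids:
        return "identical"
    # V1 matches V0 except for one trailing EOS token.
    if v1_ids and v1_ids[:-1] == v0_ids and v1_ids[-1] == eos_token_id:
        return "extra trailing EOS"
    return "diverged"


# Example with made-up token IDs (9 stands in for the model's EOS ID):
print(diff_outputs([1, 2, 3], [1, 2, 3, 9], 9))  # extra trailing EOS
print(diff_outputs([1, 2, 3], [1, 2, 3], 9))     # identical
```

If the result is "extra trailing EOS", the decoded strings should match once special tokens are skipped during detokenization.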
I'll work on updating the model development guide with an example of a custom HF processor in another PR.