[V1] Support Pixtral-HF on V1 #11409
cc @mgoin Currently I'm having difficulty working out how to patch the resulting image embedding tensors with the
and 2752 + 42 + 1 = 2795. For mistral-format Pixtral, this wasn't an issue because vllm/vllm/model_executor/models/pixtral.py Lines 227 to 256 in 72d9c31
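To make the arithmetic above concrete, here is a minimal sketch of how the placeholder count could be broken down. It assumes the image is split into a 43 × 64 patch grid (an inference from 2752 = 43 × 64, not stated in this thread), with one image break token after every row except the last and a single image end token closing the image. The function name `pixtral_hf_placeholder_count` is hypothetical and is not part of vLLM.

```python
def pixtral_hf_placeholder_count(rows: int, cols: int) -> dict[str, int]:
    """Count placeholder tokens for one image (hypothetical helper).

    Assumed layout: `cols` patch tokens per row, an image break token
    after every row except the last, and one image end token at the end.
    """
    if rows <= 0 or cols <= 0:
        raise ValueError("rows and cols must be positive")
    patches = rows * cols  # tokens backed by encoder output rows
    breaks = rows - 1      # image break tokens between rows
    ends = 1               # single image end token
    return {
        "patches": patches,
        "breaks": breaks,
        "ends": ends,
        "total": patches + breaks + ends,
    }


# Assumed 43 x 64 grid, chosen because 43 * 64 = 2752.
counts = pixtral_hf_placeholder_count(rows=43, cols=64)
print(counts)
```

Under these assumptions only the 2752 patch positions correspond to rows of the vision encoder output; the remaining 43 positions are special tokens that have to be accounted for separately.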
Prepare for vllm-project#11409. For the Pixtral model, we need to insert placeholders in the middle of the encoder output so that it fits into the whole soft embedding. This makes the slicing operation tricky. This PR raises an assertion if something's off. Signed-off-by: Linkun Chen <github@lkchen.net>
Per my test, there are several cases:
I think the bug is in GpuModelRunner._gather_encoder_outputs, where the offset calculation doesn't take the image break and image end tokens into consideration. This should be fixed by #13080.
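As an illustration of why the offset matters, the sketch below maps a slice of placeholder positions to the matching slice of encoder output rows. It is not the code in `_gather_encoder_outputs` and not the fix in #13080; `patch_mask` and `encoder_slice` are hypothetical helpers, and the layout (one special token after each row of patches) is an assumption.

```python
def patch_mask(rows: int, cols: int) -> list[bool]:
    """True where a placeholder position is backed by an encoder output row.

    Assumed layout: each row of `cols` patch tokens is followed by one
    special token (image break, or image end after the last row).
    """
    mask: list[bool] = []
    for _ in range(rows):
        mask.extend([True] * cols)  # patch tokens for this row
        mask.append(False)          # break/end token, no encoder output
    return mask


def encoder_slice(mask: list[bool], start: int, end: int) -> tuple[int, int]:
    """Map placeholder positions [start, end) to encoder output rows."""
    before = sum(mask[:start])     # patch tokens already consumed
    inside = sum(mask[start:end])  # patch tokens inside this slice
    return before, before + inside


# Tiny 2 x 3 grid: positions are P P P B P P P E (8 placeholders, 6 patches).
mask = patch_mask(rows=2, cols=3)
naive = (2, 6)                       # reuses placeholder offsets directly
adjusted = encoder_slice(mask, 2, 6)
print(naive, adjusted)
```

Reusing the placeholder offsets directly would select four encoder rows for a slice that contains only three patch tokens, which is the kind of mismatch an assertion on the slice length would catch.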
Support Transformers compatible Pixtral checkpoints on V1