
[Model] VLM2Vec, the first multimodal embedding model in vLLM #9303

Merged 13 commits into main on Oct 16, 2024

Conversation

@DarkLight1337 (Member) commented on Oct 12, 2024

Support VLM2Vec embedding model by TIGER-Lab.

This is low-hanging fruit, as the model architecture is exactly the same as Phi3V.

Future work, in order of priority:

  1. Add a CLI option to specify whether a model should be used for generation or embedding, so we don't have to hardcode the model name when the same model architecture can be used for both.
  2. Add a multimodal embedding API to the OpenAI-compatible server.
  3. Support more multimodal embedding models, e.g. E5-V, which can be similarly supported with our existing LLaVA-NeXT implementation.

@jeejeelee are you available to help provide LoRA support for this model after this PR?
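
For context, offline use of the new embedding path might look roughly like the sketch below. This is a minimal illustration only, assuming the LLM.encode API and the Phi3V-style <|image_1|> placeholder; the exact arguments (and how the model is selected for embedding vs. generation, see future work item 1) may differ by version, and the image path is a placeholder.

from PIL import Image
from vllm import LLM

# Minimal sketch (not part of this PR): run VLM2Vec as an offline embedding model.
llm = LLM(
    model="TIGER-Lab/VLM2Vec-Full",
    trust_remote_code=True,
    max_model_len=4096,
)

# Phi3V-style prompt with an image placeholder.
prompt = ("<|image_1|> Represent the given image with the following question: "
          "What is in the image")
image = Image.open("example.jpg")  # placeholder path

# encode() returns pooled embeddings instead of generated text.
outputs = llm.encode({"prompt": prompt, "multi_modal_data": {"image": image}})
print(len(outputs[0].outputs.embedding))  # embedding dimension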


👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, covering a small and essential subset of tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build on the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add the ready label to the PR
  • Enable auto-merge.

🚀

@jeejeelee (Contributor)

Okay, I'd be happy to try supporting LoRA.

@DarkLight1337 added the ready label on Oct 12, 2024
@@ -461,3 +463,50 @@ def load_weights(self, weights: Iterable[Tuple[str, torch.Tensor]]):
if self.config.tie_word_embeddings else None),
)
loader.load_weights(weights)


class Gemma2EmbeddingModel(nn.Module, SupportsPP):
@DarkLight1337 (Member, Author) commented on this hunk:

I'm preemptively moving them into the same file to be consistent with the upcoming BERT PR. (#9056)

@wenhuchen

Nice work! There were some typos in our paper: we actually used the last-token representation instead of the EOS-token representation. I saw that you already use the last token as the representation, which is the correct implementation.
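
To make the pooling distinction concrete, here is a generic, framework-agnostic sketch of last-token pooling. It is illustrative only (the function name and tensor shapes are made up), not vLLM's internal pooler code.

import torch

def last_token_pool(hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    """Return the hidden state of the last non-padding token per sequence.

    hidden_states:  (batch, seq_len, hidden_dim)
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding.
    """
    # Index of the last real token in each sequence.
    last_idx = attention_mask.sum(dim=1) - 1
    batch_idx = torch.arange(hidden_states.size(0), device=hidden_states.device)
    return hidden_states[batch_idx, last_idx]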

@Isotr0py (Collaborator) left a review:

LGTM. Especially since the implementation's correctness has been checked by the model vendors :)

@DarkLight1337 merged commit 7abba39 into main on Oct 16, 2024
56 checks passed
@DarkLight1337 deleted the vlm2vec branch on October 16, 2024 at 06:31
@Isotr0py (Collaborator)

Oops, it seems the newly added vision embedding test isn't included in test-pipeline:

- label: Other Models Test # 6min
  #mirror_hardwares: [amd]
  source_file_dependencies:
  - vllm/
  - tests/models/embedding/language
  - tests/models/encoder_decoder/language
  - tests/models/encoder_decoder/vision_language
  commands:
  - pytest -v -s models/embedding/language
  - pytest -v -s models/encoder_decoder/language
  - pytest -v -s models/encoder_decoder/vision_language

We might need to open another PR to include it.

@DarkLight1337 (Member, Author)

> Oops, it seems the newly added vision embedding test isn't included in test-pipeline. We might need to open another PR to include it.

Nice catch, I have opened #9406

charlifu pushed a commit to charlifu/vllm that referenced this pull request Oct 23, 2024
vrdn-23 pushed a commit to vrdn-23/vllm that referenced this pull request Oct 23, 2024
@jvlinsta

What will it take / when would it be expected for online engine support for this? ^^

@DarkLight1337 (Member, Author)

> What will it take / when would it be expected for online engine support for this? ^^

Maybe sometime in the next two weeks.

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024
garg-amit pushed a commit to garg-amit/vllm that referenced this pull request Oct 28, 2024
FerdinandZhong pushed a commit to FerdinandZhong/vllm that referenced this pull request Oct 29, 2024
@DarkLight1337 (Member, Author)

> What will it take / when would it be expected for online engine support for this? ^^
>
> Maybe sometime in the next two weeks.

Quick heads-up that it's done now!
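
For anyone landing here later, querying the online server for a multimodal embedding might look roughly like the sketch below. The serve command and the chat-style "messages" payload are assumptions on my part; the exact request schema differs between vLLM versions, so check the current docs.

import requests

# Assumed server launch, e.g.:  vllm serve TIGER-Lab/VLM2Vec-Full --trust-remote-code
# The request body below is an assumption; consult the vLLM docs for your version.
response = requests.post(
    "http://localhost:8000/v1/embeddings",
    json={
        "model": "TIGER-Lab/VLM2Vec-Full",
        "messages": [{
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/cat.jpg"}},  # placeholder URL
                {"type": "text", "text": "Represent the given image."},
            ],
        }],
        "encoding_format": "float",
    },
)
embedding = response.json()["data"][0]["embedding"]
print(len(embedding))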

@wenhuchen

When will the LoRA version be online? Actually, the LoRA version works better.

@DarkLight1337 (Member, Author) commented Nov 1, 2024

> When will the LoRA version be online? Actually, the LoRA version works better.

Currently, vLLM only supports LoRA for the language backbone of VLMs - some re-arch work is necessary to extend this to the vision encoder. @jeejeelee do you have a timeframe regarding this?

@wenhuchen

> When will the LoRA version be online? Actually, the LoRA version works better.
>
> Currently, vLLM only supports LoRA for the language backbone of VLMs - some re-arch work is necessary to extend this to the vision encoder. @jeejeelee do you have a timeframe regarding this?

I guess you meant "vLLM doesn't support LoRA" instead of "vLLM only supports LoRA"?

@DarkLight1337 (Member, Author)

> I guess you meant "vLLM doesn't support LoRA" instead of "vLLM only supports LoRA"?

I mean that vLLM supports LoRA on the language backbone, but not LoRA on the vision encoder of VLMs.
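
For reference, the existing language-backbone LoRA path in vLLM looks roughly like the sketch below for a generation model; the model name and adapter path are placeholders, not VLM2Vec artifacts. Wiring VLM2Vec's LoRA weights, which also touch the vision encoder, into the embedding path is exactly the re-arch work mentioned above.

from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Generic sketch of vLLM's existing LoRA support (language backbone only).
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct", enable_lora=True)  # placeholder model

outputs = llm.generate(
    ["Hello, my name is"],
    SamplingParams(max_tokens=32),
    # Adapter identified by (name, integer id, local path); the path is a placeholder.
    lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)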

@wenhuchen

Got it. Thanks for the explanation!
