[Feature]: Add multi-image input support for LLaVA offline inference (similar to #7230) #8236

yinsong1986 · 2024-09-06T11:51:06Z

🚀 The feature, motivation and pitch

I recently released a model, https://huggingface.co/aws-prototyping/long-llava-qwen2-7b, that uses the LLaVA model architecture and supports multiple images (or video) as input, so it would be reasonable to enable multiple images as input for LLaVA in vLLM as well. Thank you!
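
For illustration, here is a minimal sketch of the kind of offline call this request has in mind. It assumes a vLLM build in which LLaVA accepts a list of images under `multi_modal_data` and in which `limit_mm_per_prompt` raises the per-prompt image cap. The helper names `build_multi_image_request` and `run_offline` are hypothetical, and the `USER: ... ASSISTANT:` template with one `<image>` placeholder per image is an assumption borrowed from LLaVA-1.5; the correct template for a given checkpoint should be taken from its model card.

```python
from typing import Any, Dict, List


def build_multi_image_request(question: str, images: List[Any]) -> Dict[str, Any]:
    """Build one prompt dict that carries several images.

    Assumption: the model expects one "<image>" placeholder per image and a
    LLaVA-1.5 style "USER: ... ASSISTANT:" template.
    """
    if not images:
        raise ValueError("at least one image is required")
    # One placeholder per image, each on its own line, ahead of the question.
    placeholders = "\n".join("<image>" for _ in images)
    prompt = f"USER: {placeholders}\n{question} ASSISTANT:"
    return {"prompt": prompt, "multi_modal_data": {"image": list(images)}}


def run_offline(model_name: str, question: str, images: List[Any]) -> str:
    """Hypothetical wrapper: run a single multi-image prompt through vLLM."""
    # Imported lazily so the helper above can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    # Allow as many images per prompt as we are about to send.
    llm = LLM(model=model_name, limit_mm_per_prompt={"image": len(images)})
    outputs = llm.generate(
        build_multi_image_request(question, images),
        sampling_params=SamplingParams(max_tokens=128),
    )
    return outputs[0].outputs[0].text
```

In practice `images` would be a list of `PIL.Image` objects, for example `run_offline(model_name, "What differs between these images?", [img_a, img_b])`.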

Alternatives

No response

Additional context

No response

Before submitting a new issue...

Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

yinsong1986 · 2024-09-07T11:30:35Z

Thanks @DarkLight1337, awesome work!

yinsong1986 added the feature request label Sep 6, 2024

DarkLight1337 mentioned this issue Sep 6, 2024

[Model] Multi-input support for LLaVA and fix embedding inputs for multi-image models #8238

Merged

DarkLight1337 closed this as completed in #8238 Sep 7, 2024
