Add conversion for interleave llava #31858

zucchini-nlp · 2024-07-09T10:05:38Z

What does this PR do?

Adds a conversion script for the models proposed in this blog post.

Briefly: Llava-Next_Interleave is a series of models trained with interleaved visual inputs, including video and 3D images. Even though it's a llava-next series, the authors have hardcoded the inference script not to use the anyres technique, so I added it in LLaVa. We will not support video natively as in LLaVaNeXTVideo, but users can pass each frame separately, as if it were an interleaved image input.

The main idea was to support the 0.5B checkpoint, because it has the same performance as llava-next-7b in the blog post. Imo it can be a cool addition.
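To illustrate the "pass each frame separately" idea above, here is a minimal sketch in plain Python. It only shows how frames might be subsampled and how a prompt could carry one image placeholder per frame. The helper names `sample_frame_indices` and `build_interleaved_prompt` are hypothetical, and the `<image>` placeholder string is an assumption; the actual token and chat template depend on the converted checkpoint's processor.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick evenly spaced frame indices from a video (hypothetical helper)."""
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    step = total_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def build_interleaved_prompt(
    num_images: int, question: str, image_token: str = "<image>"
) -> str:
    """Build a text prompt with one placeholder per frame (hypothetical helper).

    `image_token` is an assumed placeholder; check the processor of the
    checkpoint you converted for the real one.
    """
    placeholders = "\n".join([image_token] * num_images)
    return f"{placeholders}\n{question}"


# Treat a 100-frame clip as 4 separate images.
indices = sample_frame_indices(total_frames=100, num_samples=4)
prompt = build_interleaved_prompt(len(indices), "What happens in this clip?")
```

The selected frames would then be passed to the processor as a list of images alongside `prompt`, in the same order as the placeholders.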

HuggingFaceDocBuilderDev · 2024-07-09T10:39:03Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

amyeroberts

Thanks for adding!

src/transformers/models/llava/convert_llava_weights_to_hf.py

NielsRogge

Thanks for adding! Pretty cool that the conversion works out-of-the-box with SigLIP instead of CLIP as vision encoder.

zucchini-nlp added 2 commits July 9, 2024 11:31

add conversion for interleave llava

72df931

remove debug lines

fa39698

zucchini-nlp requested review from amyeroberts and NielsRogge July 9, 2024 10:05

remove unused imports

96f70c9

amyeroberts approved these changes Jul 9, 2024

src/transformers/models/llava/convert_llava_weights_to_hf.py (outdated)

zucchini-nlp and others added 2 commits July 10, 2024 10:21

Update src/transformers/models/llava/convert_llava_weights_to_hf.py

6be154b

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

small changes + docs

01f4246

NielsRogge approved these changes Jul 10, 2024

zucchini-nlp merged commit 97aa3e2 into huggingface:main Jul 10, 2024
8 checks passed
