Standardize image-text-to-text-models outputs #32471
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@yonigozlan Chameleon can also do image-text-to-text
Thanks! Will add it to the list
Thanks for working on this! left a few comments, moving on to 32472 now
Very nice! Looking forward to having all of the processing behaviour more standardized ❤️
Main comment is on the handling of the legacy behaviour
add post_process_image_text_to_text to chameleon and cleanup
Fix legacy kwarg behavior and deprecation warning
add post_process_image_text_to_text to qwen2_vl and llava_onevision
Add post_process_image_text_to_text to idefics3, mllama, pixtral processor
@LysandreJik This should be ready for a final review, and should significantly reduce the loc count and number of files changed for the image-text-to-text pipeline PR :).
cc @molbap can you do an initial review please?
Maybe not initial, but pre-final? 😁
The changes from this PR were merged in #34170
What does this PR do?
Standardize outputs for existing image-text-to-text models by adding a post_process_image_text_to_text function to their processor.
Blocking PR for the image-text-to-text pipeline.
The following models' processors need to be modified:
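For illustration, the kind of method this PR adds can be sketched with toy classes. This is a minimal sketch, not the code from the PR: `ToyTokenizer` and `ToyProcessor` are hypothetical stand-ins, and the assumption that post-processing amounts to decoding generated token ids while skipping special tokens is ours. Only the method name `post_process_image_text_to_text` comes from the description above.

```python
from typing import Dict, List, Sequence


class ToyTokenizer:
    """Hypothetical stand-in for a real tokenizer; maps token ids to strings."""

    def __init__(self, vocab: Dict[int, str], special_ids: Sequence[int]):
        self.vocab = vocab
        self.special_ids = set(special_ids)

    def batch_decode(
        self, sequences: Sequence[Sequence[int]], skip_special_tokens: bool = False
    ) -> List[str]:
        texts = []
        for ids in sequences:
            # Optionally drop special tokens such as <bos>, <image>, <eos>.
            kept = [
                i for i in ids
                if not (skip_special_tokens and i in self.special_ids)
            ]
            texts.append(" ".join(self.vocab[i] for i in kept))
        return texts


class ToyProcessor:
    """Hypothetical processor exposing one shared post-processing entry point."""

    def __init__(self, tokenizer: ToyTokenizer):
        self.tokenizer = tokenizer

    def post_process_image_text_to_text(
        self, generated_outputs: Sequence[Sequence[int]]
    ) -> List[str]:
        # Assumed behaviour: decode generated ids into clean text.
        return self.tokenizer.batch_decode(
            generated_outputs, skip_special_tokens=True
        )


vocab = {0: "<bos>", 1: "<image>", 2: "a", 3: "cat", 4: "<eos>"}
processor = ToyProcessor(ToyTokenizer(vocab, special_ids=[0, 1, 4]))
decoded = processor.post_process_image_text_to_text([[0, 1, 2, 3, 4], [0, 2, 4]])
print(decoded)  # ['a cat', 'a']
```

If every processor exposes the same method name, a pipeline can call it without model-specific branching, which is presumably why this PR blocks the pipeline PR.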
Who can review?
@molbap @amyeroberts