Conversation

@yonigozlan
Member

What does this PR do?

Following the trial run with the Qwen_VL image processors, this PR extends to all models the behavior of defaulting to fast image processors, even for checkpoints saved with a slow one.

Also made sure that all processors now use AutoImageProcessor to instantiate their image_processor_class.
On that point, defining a default subclass in processors feels a bit redundant, as we basically already have that information in the auto classes. It would be nice to get rid of it for v5, wdyt @molbap @zucchini-nlp @ArthurZucker ?
I'll open a PR for that too.
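As a rough illustration of the defaulting behavior described above (illustrative names only, not the actual transformers internals): when a checkpoint's config names a slow image processor class, the fast counterpart is preferred if one exists. The mapping entry below is a made-up example.

```python
# Illustrative sketch, NOT transformers internals: prefer the fast variant
# of an image processor class when the checkpoint was saved with a slow one.
FAST_VARIANTS = {
    "CLIPImageProcessor": "CLIPImageProcessorFast",  # example entry
}

def resolve_image_processor_class(saved_class: str, use_fast: bool = True) -> str:
    """Return the class name to instantiate for a checkpoint's saved class."""
    if use_fast and not saved_class.endswith("Fast"):
        # Default to the fast (PyTorch-native) processor when one exists.
        return FAST_VARIANTS.get(saved_class, saved_class)
    return saved_class
```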

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@molbap left a comment


Sounds good for v5! Let's see if we can even simplify further in this iteration

if common_kwargs:
for kwarg in output_kwargs.values():
kwarg.update(common_kwargs)
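For context, the snippet above merges the shared kwargs into every modality's kwargs dict. A minimal self-contained sketch of that behavior (the kwarg names here are made up for illustration):

```python
# Self-contained sketch of the merge above: every modality's kwargs dict
# receives the shared (common) kwargs.
output_kwargs = {
    "text_kwargs": {"padding": "max_length"},
    "images_kwargs": {"do_resize": True},
}
common_kwargs = {"return_tensors": "pt"}

if common_kwargs:
    for kwarg in output_kwargs.values():
        kwarg.update(common_kwargs)
```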

Contributor


I'm sure there's a good reason but I'm missing it, why is this moved up?

Member


yep, it is a fix from #41381 :)

Contributor


aah, which fixes #40931, got it

Member Author


Yes, my bad, I switched to a new branch without checking out main first 🥴

Comment on lines 42 to +49
def __init__(self, **kwargs):
super().__init__(**kwargs)
if not self.is_fast:
logger.warning_once(
f"Using a slow image processor (`{self.__class__.__name__}`). "
"As we are transitioning to fast (PyTorch-native) processors, consider using `AutoImageProcessor` or the model-specific fast image processor class "
"to instantiate a fast image processor."
)
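A minimal sketch of the warn-once pattern used in the snippet above, with a stand-in helper (`warn_once` here is illustrative, not the real `logger.warning_once` from transformers' logging utilities):

```python
import functools
import warnings

# Illustrative warn-once helper; this just demonstrates the deduplication,
# the real code goes through transformers' logging utilities.
@functools.lru_cache(maxsize=None)
def warn_once(message: str) -> None:
    warnings.warn(message)

class SlowImageProcessor:
    is_fast = False

    def __init__(self):
        if not self.is_fast:
            # Emitted at most once per unique message, however many
            # instances are created.
            warn_once(
                f"Using a slow image processor (`{type(self).__name__}`). "
                "Consider a fast (PyTorch-native) image processor instead."
            )
```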
Contributor


SGTM!

Related: since we're touching on the topic of "loading old models from the hub with new utils", this ties into the "from_pretrained conversion" @Cyrilvallez is working on. If we have modifications to apply to some old image processors, they should live in from_pretrained as well, so the processor gets "converted" in the same sense.

Member

@zucchini-nlp left a comment


LGTM. Just wondering about some models where we had no Lanczos resampling. Do we get the closest resampling in those cases, and are the diffs small enough?

class VideoLlavaProcessor(ProcessorMixin):
r"""
Constructs a VideoLlava processor which wraps a VideoLlava image processor and a Llava tokenizer into a single processor.
Constructs a VideoLlava processor which wraps a AutoImageProcessor and a Llava tokenizer into a single processor.
Member


nit: imo we need not change the name when it is not referenced. Instead, we should only change the `[VideoLlavaImageProcessor]` reference one line below

Member Author


Yes you're right, not very useful to have AutoImageProcessor in the docstring. I'll change these back. I'm also working on getting auto_docstring to work on processors, which should do all that automatically (check which subprocessors are in auto for this model) ;)

Member


I'm also working on getting auto_docstring to work on processors which should do all that automatically

nice, very needed

@yonigozlan
Member Author

LGTM. Just wondering about some models where we had no Lanczos resampling. Do we get the closest resampling in those cases, and are the diffs small enough?

Good point on the Lanczos resampling, I might add an exception for these models, as the diffs are not close enough imo
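On the resampling point: `torch.nn.functional.interpolate` has no Lanczos mode, so the discussion above is about either falling back to the closest supported filter or keeping such models on an exception list. A hedged sketch of that kind of logic (illustrative names, not transformers code):

```python
# Illustrative only: torch.nn.functional.interpolate supports no Lanczos
# mode, so a fast path either falls back to the closest filter or the
# model stays on an exception list (strict=True stands in for that here).
TORCH_INTERPOLATE_MODES = {"nearest", "bilinear", "bicubic"}

def pick_resample(requested: str, strict: bool = False) -> str:
    if requested in TORCH_INTERPOLATE_MODES:
        return requested
    if strict:
        raise ValueError(f"no exact fast equivalent for {requested!r}")
    return "bicubic"  # closest smooth filter available
```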

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: align, auto, blip, blip_2, bridgetower, chameleon, chinese_clip, clip, clipseg, emu3, flava, fuyu, grounding_dino, idefics, idefics2, idefics3
