Conversation

@yonigozlan
Member

What does this PR do?

Following the trial run with the Qwen_VL image processors, this PR extends to all models the behavior of defaulting to fast image processors, even for checkpoints saved with a slow one.

Also made sure that all processors now use AutoImageProcessor to instantiate their image_processor_class.
On that point, defining a default subclass in processors feels a bit redundant, as we basically already have that information in the auto classes. It would be nice to get rid of it for v5, wdyt @molbap @zucchini-nlp @ArthurZucker ?
I'll open a PR for that too.
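As a rough illustration of the defaulting behavior described above (illustrative names only, not the actual transformers internals): when a checkpoint's config names a slow image processor class, the fast counterpart is preferred if one exists. The mapping entry below is a made-up example.

```python
# Illustrative sketch, NOT transformers internals: prefer the fast variant
# of an image processor class when the checkpoint was saved with a slow one.
FAST_VARIANTS = {
    "CLIPImageProcessor": "CLIPImageProcessorFast",  # example entry
}

def resolve_image_processor_class(saved_class: str, use_fast: bool = True) -> str:
    """Return the class name to instantiate for a checkpoint's saved class."""
    if use_fast and not saved_class.endswith("Fast"):
        # Default to the fast (PyTorch-native) processor when one exists.
        return FAST_VARIANTS.get(saved_class, saved_class)
    return saved_class
```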

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Contributor

@molbap left a comment


Sounds good for v5! Let's see if we can even simplify further in this iteration

if common_kwargs:
for kwarg in output_kwargs.values():
kwarg.update(common_kwargs)
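For context, the snippet above merges the shared kwargs into every modality's kwargs dict. A minimal self-contained sketch of that behavior (the kwarg names here are made up for illustration):

```python
# Self-contained sketch of the merge above: every modality's kwargs dict
# receives the shared (common) kwargs.
output_kwargs = {
    "text_kwargs": {"padding": "max_length"},
    "images_kwargs": {"do_resize": True},
}
common_kwargs = {"return_tensors": "pt"}

if common_kwargs:
    for kwarg in output_kwargs.values():
        kwarg.update(common_kwargs)
```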

Contributor


I'm sure there's a good reason but I'm missing it, why is this moved up?

Member


yep, it is a fix from #41381 :)

Contributor


aah, which fixes #40931, got it

Member Author


Yes, my bad, I switched to a new branch without checking out main first 🥴

Comment on lines 42 to +49
def __init__(self, **kwargs):
super().__init__(**kwargs)
if not self.is_fast:
logger.warning_once(
f"Using a slow image processor (`{self.__class__.__name__}`). "
"As we are transitioning to fast (PyTorch-native) processors, consider using `AutoImageProcessor` or the model-specific fast image processor class "
"to instantiate a fast image processor."
)
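A minimal sketch of the warn-once pattern used in the snippet above, with a stand-in helper (`warn_once` here is illustrative, not the real `logger.warning_once` from transformers' logging utilities):

```python
import functools
import warnings

# Illustrative warn-once helper; this just demonstrates the deduplication,
# the real code goes through transformers' logging utilities.
@functools.lru_cache(maxsize=None)
def warn_once(message: str) -> None:
    warnings.warn(message)

class SlowImageProcessor:
    is_fast = False

    def __init__(self):
        if not self.is_fast:
            # Emitted at most once per unique message, however many
            # instances are created.
            warn_once(
                f"Using a slow image processor (`{type(self).__name__}`). "
                "Consider a fast (PyTorch-native) image processor instead."
            )
```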
Contributor


SGTM!

Related: since we're touching on the topic of "loading old models from the hub with new utils", this ties into the "from_pretrained conversion" @Cyrilvallez is working on. If we have modifications to apply to some old image processors, they should live in from_pretrained as well, so the processor gets "converted" in the same sense.

Member

@zucchini-nlp left a comment


LGTM. Just wondering about some models where we had no Lanczos resampling. Do we get the closest resampling in those cases, and are the diffs small enough?

class VideoLlavaProcessor(ProcessorMixin):
r"""
Constructs a VideoLlava processor which wraps a VideoLlava image processor and a Llava tokenizer into a single processor.
Constructs a VideoLlava processor which wraps a AutoImageProcessor and a Llava tokenizer into a single processor.
Member


nit: imo we need not change the name when it is not referenced. Instead, we should only change the `[VideoLlavaImageProcessor]` reference one line below

Member Author


Yes you're right, not very useful to have AutoImageProcessor in the docstring. I'll change these back. I'm also working on getting auto_docstring to work on processors, which should do all that automatically (check which subprocessors are in auto for this model) ;)

Member


I'm also working on getting auto_docstring to work on processors which should do all that automatically

nice, very needed

@yonigozlan
Member Author

LGTM. Just wondering about some models where we had no Lanczos resampling. Do we get the closest resampling in those cases, and are the diffs small enough?

Good point on the Lanczos resampling, I might add an exception for these models, as the diffs are not close enough imo
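On the resampling point: `torch.nn.functional.interpolate` has no Lanczos mode, so the discussion above is about either falling back to the closest supported filter or keeping such models on an exception list. A hedged sketch of that kind of logic (illustrative names, not transformers code):

```python
# Illustrative only: torch.nn.functional.interpolate supports no Lanczos
# mode, so a fast path either falls back to the closest filter or the
# model stays on an exception list (strict=True stands in for that here).
TORCH_INTERPOLATE_MODES = {"nearest", "bilinear", "bicubic"}

def pick_resample(requested: str, strict: bool = False) -> str:
    if requested in TORCH_INTERPOLATE_MODES:
        return requested
    if strict:
        raise ValueError(f"no exact fast equivalent for {requested!r}")
    return "bicubic"  # closest smooth filter available
```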

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: align, auto, blip, blip_2, bridgetower, chameleon, chinese_clip, clip, clipseg, emu3, flava, fuyu, grounding_dino, idefics, idefics2, idefics3
