[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor #10594
Conversation
class BatchFeature(UserDict):
    r"""
-        Holds the output of the :meth:`~transformers.PreTrainedFeatureExtractor.pad` and feature extractor specific
+        Holds the output of the :meth:`~transformers.PreTrainedSequenceFeatureExtractor.pad` and feature extractor specific
@NielsRogge, here you can just overwrite the docstring to reference the :meth: .... or the :meth:`~transformers.PreTrainedImageFeatureExtractor.pad`.
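A minimal sketch of the docstring overwrite being suggested here, broadening the shared `BatchFeature` docstring so it covers the image-side ``pad`` as well (the exact wording is an assumption, not a line from this diff):

```python
from collections import UserDict


class BatchFeature(UserDict):
    r"""
    Holds the output of the :meth:`~transformers.PreTrainedSequenceFeatureExtractor.pad` or
    :meth:`~transformers.PreTrainedImageFeatureExtractor.pad` and feature extractor specific
    ``__call__`` methods.
    """
```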
        Examples::

-            # We can't instantiate directly the base class `PreTrainedFeatureExtractor` so let's show the examples on a
+            # We can't instantiate directly the base class `PreTrainedSequenceFeatureExtractor` so let's show the examples on a
Examples for ImageFeatureExtractor saving/loading can be appended to the Examples here
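A sketch of the kind of image saving/loading example that could be appended there, using the ViT feature extractor anticipated elsewhere in this PR (the class name and checkpoint are assumptions, not code from this diff):

```python
from transformers import ViTFeatureExtractor

# Load a feature extractor from the Hub, save it locally, and reload it from disk.
feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224")
feature_extractor.save_pretrained("./my_feature_extractor")
feature_extractor = ViTFeatureExtractor.from_pretrained("./my_feature_extractor")
```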
Looks good! I like the API. Something I would change to streamline it with our tokenizers is to have the `pad` and `_pad` methods defined in the superclass, but raise a `NotImplementedError` if they're not implemented, similarly to `tokenize()` in the tokenization utils base.

Also, here you imported `PaddingStrategy`, which is great so as to not duplicate existing objects. How would we manage this for vision models? Since the padding/cropping strategies would probably be different (i.e., `largest` instead of `longest`, as @NielsRogge was mentioning).
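A minimal sketch of that pattern, assuming a shared superclass; the class name and signatures below are hypothetical, not the PR's actual API:

```python
class FeatureExtractorBaseSketch:
    """Hypothetical shared superclass for sequence and image feature extractors."""

    def pad(self, processed_features, padding=True, **kwargs):
        # Shared entry point defined once on the superclass; delegates to the
        # subclass-specific _pad, mirroring how tokenize() is handled for tokenizers.
        return self._pad(processed_features, padding=padding, **kwargs)

    def _pad(self, processed_features, **kwargs):
        # Sequence extractors would pad to the longest sequence; image extractors
        # might pad/crop to the largest image instead.
        raise NotImplementedError("Subclasses must implement _pad.")
```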
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, thanks for working on this!
As I said offline, I don't like the very long names for the new classes and the module names, so we should strive to find something easier.
[FeatureExtractorSavingUtils] Refactor PretrainedFeatureExtractor (huggingface#10594)
* save first version
* finish refactor
* finish refactor
* correct naming
* correct naming
* shorter names
* Update src/transformers/feature_extraction_common_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* change name
* finish
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
What does this PR do?

This PR refactors the class `PreTrainedFeatureExtractor`. The following changes are done to move functionality that is shared between sequence and image feature extractors into a separate file. This should unblock the PRs of DETR, VIT, and CLIP.

- `PreTrainedFeatureExtractor` is renamed to `PreTrainedSequenceFeatureExtractor` because it implicitly assumed that it will treat only sequential inputs (a.k.a. a sequence of float values or a sequence of float vectors); the name `PreTrainedFeatureExtractor` was too general.
- The saving/loading functionality that is shared between sequence and image feature extractors is moved into `FeatureExtractorSavingUtilsMixin`.
- `BatchFeature` is moved from `feature_extraction_sequence_utils.py` to `feature_extraction_common_utils.py` to be used by the `PreTrainedImageFeatureExtractor` class as well (see the sketch after this list).
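Read together, the bullets above imply a split roughly like the following sketch; everything beyond the class and file names stated in the bullets (inheritance, docstrings) is an assumption:

```python
# feature_extraction_common_utils.py -- pieces shared by sequence and image extractors
from collections import UserDict


class FeatureExtractorSavingUtilsMixin:
    """Shared saving/loading helpers (sketch only, no real implementation here)."""


class BatchFeature(UserDict):
    """Dict-like container returned by feature extractors, moved here from the sequence file."""


# feature_extraction_sequence_utils.py -- sequence-specific pieces
class PreTrainedSequenceFeatureExtractor(FeatureExtractorSavingUtilsMixin):
    """Handles sequential inputs only, e.g. padding sequences of float values or float vectors."""
```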
The following things were assumed before applying the changes.
- Feature extractors of image models will follow the same naming, e.g. `VITFeatureExtractor`, .... IMO, feature extractor is also a fitting name for image recognition (see: https://en.wikipedia.org/wiki/Feature_extraction), so it is assumed that for image-text or image-only models there will be a `PreTrainedImageFeatureExtractor`, a `VITFeatureExtractor`, (and maybe a `VITTokenizer` & `VITProcessor` as well, but not necessarily). For vision-text models that do require both a tokenizer and a feature extractor, such as CLIP, it is assumed that the classes `CLIPFeatureExtractor` and `CLIPTokenizer` are wrapped into a `CLIPProcessor` class similar to `Wav2Vec2Processor` (see the sketch after this list). I think this is the most important assumption taken here, so we should make sure we are on the same page here @LysandreJik @sgugger @patil-suraj @NielsRogge.
- Image feature extractors don't need a separate `BatchImageFeature` or `BatchImage`, but can just use `BatchFeature`. From looking at the code in @NielsRogge's PR (Add Vision Transformer + ViTFeatureExtractor #10513) this seems to be the case.
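A minimal sketch of the processor-wrapping assumption, modeled on `Wav2Vec2Processor`; the method bodies and signatures below are illustrative guesses, not code from this PR:

```python
class CLIPProcessorSketch:
    """Wraps an image feature extractor and a tokenizer behind one save/load interface."""

    def __init__(self, feature_extractor, tokenizer):
        self.feature_extractor = feature_extractor  # handles images
        self.tokenizer = tokenizer                  # handles text

    def save_pretrained(self, save_directory):
        # Both components write their files into the same directory.
        self.feature_extractor.save_pretrained(save_directory)
        self.tokenizer.save_pretrained(save_directory)

    @classmethod
    def from_pretrained(cls, path, feature_extractor_cls, tokenizer_cls):
        # Reload both components from one directory; the component classes are passed
        # in explicitly to keep this sketch independent of the final class names.
        return cls(
            feature_extractor=feature_extractor_cls.from_pretrained(path),
            tokenizer=tokenizer_cls.from_pretrained(path),
        )
```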
Backwards compatibility:
The class `PreTrainedFeatureExtractor` was accessible via a direct import from `transformers`, but is now replaced by `PreTrainedSequenceFeatureExtractor`. However, since `PreTrainedFeatureExtractor` was so far only available on master, this change is OK IMO.
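A sketch of how downstream code could absorb the rename; whether either name is exposed at the top level of `transformers` is an assumption based on the description, not something shown in this PR:

```python
# Prefer the new name introduced by this PR, fall back to the old master-only name.
try:
    from transformers import PreTrainedSequenceFeatureExtractor as FeatureExtractorBase
except ImportError:
    from transformers import PreTrainedFeatureExtractor as FeatureExtractorBase
```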