
Add ImageGPT to mappings #16869

Closed
ydshieh wants to merge 1 commit.

Conversation

ydshieh (Collaborator) commented Apr 21, 2022

What does this PR do?

Similar to #16857, but only for ImageGPT, together with the necessary changes to make the (pipeline) tests pass.

@ydshieh ydshieh changed the title add imagegpt add ImageGPT to mappings Apr 21, 2022
@ydshieh ydshieh changed the title add ImageGPT to mappings Add ImageGPT to mappings Apr 21, 2022
Comment on lines +83 to +89
# No need to perform padding if all lengths are equal
all_lengths = [item[key].shape[1] for item in items]
if len(set(all_lengths)) == 1:
tensor = torch.stack([item[key][0] for item in items], dim=0)
return tensor

max_length = max(all_lengths)
ydshieh (Collaborator, Author):
Necessary to make image classification pipeline test (with ImageGPTForImageClassification) pass.

More precisely, the following line would fail when padding_value is None:

torch.zeros((batch_size, max_length), dtype=dtype) + padding_value
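A minimal sketch of the fast path being discussed (plain Python with a hypothetical `collate` helper, not the actual pipeline code): when every item already has the same length, stacking alone suffices and `padding_value` is never consulted; when lengths differ, a `None` padding value should fail loudly instead of propagating into the arithmetic above:

```python
def collate(items, padding_value=None):
    # Fast path: all items share the same length, so no padding is needed
    # and padding_value is never touched.
    lengths = [len(item) for item in items]
    if len(set(lengths)) == 1:
        return [list(item) for item in items]
    # Slow path: lengths differ, so a real padding value is required.
    if padding_value is None:
        raise ValueError("padding_value must be set when lengths differ")
    max_length = max(lengths)
    return [list(item) + [padding_value] * (max_length - len(item)) for item in items]
```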

Narsil (Contributor):

I don't think we should make that change.

We can on the other hand add a meaningful padding value for that key (here 0 I guess).

The fact that None is passed is the bug here I think.

The problem with this change is that you have almost no information about what kind of tensors are sent here, so shape[1] is likely to trigger a bug somewhere (maybe shape[1] doesn't exist, or shape[2] exists and prevents stacking, for instance).

ydshieh (Collaborator, Author):

Yeah, that's true. I forgot to add some checks like those done in the blocks below.
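A sketch of the kind of check being asked for (hypothetical helper operating on shape tuples for illustration): before taking the equal-length fast path, verify that every tensor has a compatible full shape, not just a matching size along dimension 1:

```python
def can_fast_stack(shapes):
    # shapes: list of tensor shape tuples, e.g. [(1, 1024), (1, 1024)].
    # shape[1] must exist (ndim >= 2), and all shapes must match exactly;
    # otherwise torch.stack would raise or silently do the wrong thing.
    if any(len(shape) < 2 for shape in shapes):
        return False
    return len(set(shapes)) == 1
```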

Comment on lines +93 to +97
if type(self.model).__name__ == "ImageGPTForImageClassification":
# Temporary workaround
# Check that the model can still do a forward pass successfully (every parameter should be resized)
# Input ids should be clamped to the maximum size of the vocabulary
model_inputs["pixel_values"].clamp_(max=self.model.config.vocab_size - 15 - 1)
ydshieh (Collaborator, Author):

ImageGPTModelTester has vocab_size=99, but ImageGPTFeatureExtractor doesn't use vocab_size, and will produce values between 0 and 512.

This operation is copied from ImageGPTModelTest.
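For illustration, the clamping above can be sketched in plain Python (hypothetical helper; the `- 15 - 1` cap is copied verbatim from the workaround in ImageGPTModelTest, not derived independently):

```python
def clamp_input_ids(ids, vocab_size):
    # Clip every id to the cap used by the test workaround, so the
    # embedding lookup never indexes past the (resized) vocabulary.
    cap = vocab_size - 15 - 1
    return [min(i, cap) for i in ids]
```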

Narsil (Contributor):

Again, this is very much something we should avoid as much as possible.

Why doesn't the feature_extractor produce valid pixel_values (clamping them itself, if possible)?

ydshieh (Collaborator, Author):

I didn't look at the details of ImageGPT. It seems special to me: unlike other vision models' feature extractors, which output pixel values, ImageGPTFeatureExtractor outputs something called color clusters, which should be treated like tokens, as in NLP models.

In fact, the following warning is in ImageGPTModel:

        if "pixel_values" in kwargs:
            warnings.warn(
                "The `pixel_values` argument is deprecated and will be removed in a future version, use `input_ids` instead.",
                FutureWarning,
            )

ydshieh (Collaborator, Author) commented Apr 21, 2022:

Probably we can move this logic (the clamp_) into ImageGPTFeatureExtractor itself. But this has not been done for any vision model so far, and @NielsRogge's opinion is important for this change.

Narsil (Contributor):

I see.

@@ -70,7 +70,7 @@ def __init__(
         hidden_act="gelu",
         hidden_dropout_prob=0.1,
         attention_probs_dropout_prob=0.1,
-        max_position_embeddings=512,
+        max_position_embeddings=1024,
ydshieh (Collaborator, Author):

ImageGPTFeatureExtractor will create sequences of length 32 * 32 = 1024.
Using max_position_embeddings=512 gives an index-out-of-range error in the position embedding lookup.
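The constraint is simple arithmetic and can be sketched as (hypothetical helper, for illustration only):

```python
def position_ids_in_range(seq_len, max_position_embeddings):
    # ImageGPTFeatureExtractor emits one token per pixel of a 32x32 grid,
    # so position ids run from 0 to seq_len - 1 and every one of them must
    # index into the position-embedding table.
    return seq_len <= max_position_embeddings
```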

Narsil (Contributor):

You don't need to modify the config in such a way (if you don't want to).

You can use the get_pipeline_config(self) method, which enables the pipeline tests to use a different config than the one used for model testing; it's useful specifically for changing things like max_position_embeddings and vocab_size.
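A hedged sketch of that pattern (configs shown as plain dicts for illustration; the real tester returns a config object): only the pipeline tests call get_pipeline_config, so only they see the enlarged values while the regular model tests keep the small config:

```python
class ImageGPTModelTester:
    # Values used by the regular model tests (small on purpose, for speed).
    def get_config(self):
        return {"vocab_size": 99, "max_position_embeddings": 512}

    # Override used only by the pipeline tests: same config, but with
    # max_position_embeddings large enough for 32 * 32 = 1024 positions.
    def get_pipeline_config(self):
        config = self.get_config()
        config["max_position_embeddings"] = 1024
        return config
```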

ydshieh (Collaborator, Author):

Great to know about this 💯 Thank you!

HuggingFaceDocBuilderDev commented Apr 21, 2022

The documentation is not available anymore as the PR was closed or merged.

@ydshieh ydshieh removed the request for review from Narsil April 21, 2022 08:15
@Narsil Narsil left a comment


Thanks for this PR !

I think all your changes make sense but are not necessarily the best possible ones.
I'll try to come up with different (hopefully better) changes to support this.

ydshieh (Collaborator, Author) commented Apr 21, 2022:

Closed - in favor of #16871

@ydshieh ydshieh deleted the add_imagegpt_to_mappings branch May 5, 2022 10:32