[Pix2struct] Simplify generation #22527

NielsRogge · 2023-04-03T09:40:36Z

What does this PR do?

This PR aims to fix the warning that is currently printed out when generating text with Pix2Struct:

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.

All Pix2Struct models currently have config.is_encoder_decoder=False. Since Pix2Struct is an encoder-decoder model, it would be more logical to set this argument to True and to override prepare_inputs_for_generation instead, which gives a cleaner way of generating text. This also gets rid of the warning.
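For context, the warning is triggered by a heuristic that only runs when the model is treated as decoder-only: roughly, it looks at whether any sequence in the batch ends with the pad token. Below is a minimal sketch of that kind of check, using plain Python lists in place of tensors so it runs without PyTorch. The function name `has_right_padding` and the token ids are hypothetical, and this is an illustration rather than the library's actual code.

```python
def has_right_padding(batch, pad_token_id):
    """Return True if any non-empty sequence in the batch ends with the pad token."""
    return any(seq[-1] == pad_token_id for seq in batch if seq)


PAD = 0  # hypothetical pad token id, chosen for illustration only

# The second sequence is shorter and padded on the right.
right_padded = [[5, 6, 7], [8, 9, PAD]]
# The same batch padded on the left instead.
left_padded = [[5, 6, 7], [PAD, 8, 9]]

print(has_right_padding(right_padded, PAD))
print(has_right_padding(left_padded, PAD))
```

With an encoder-decoder configuration this kind of check is skipped, which is why setting the flag correctly makes the warning disappear.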

To do:

For the moment, one integration test is still failing (test_batched_inference_image_captioning_conditioned):

AssertionError: 'An photography of the Temple Bar and a collection of other items.' != 'An photography of the Temple Bar and a few other places.'
E       - An photography of the Temple Bar and a collection of other items.
E       ?                                        ^^^^ ^^^^^^^^       ^^ -
E       + An photography of the Temple Bar and a few other places.

HuggingFaceDocBuilderDev · 2023-04-03T09:57:59Z

The documentation is not available anymore as the PR was closed or merged.

tests/models/pix2struct/test_modeling_pix2struct.py

gante

In general, LGTM 👍

Added a few comments regarding potential fixes/simplifications, conditional on our ability to retroactively fix them without impacting users.

tests/models/pix2struct/test_modeling_pix2struct.py

gante · 2023-04-04T10:03:50Z

src/transformers/models/pix2struct/modeling_pix2struct.py

-                decoder_input_ids = torch.cat(
+        if isinstance(input_ids, torch.Tensor):
+            # check if the first element of `input_ids` is equal to `input_ids`:
+            if (input_ids[:, 0] != self.config.decoder_start_token_id).all().item():


This if wouldn't be needed if the tokenizer had the BOS token defined 😢 It's too late to change that, correct (users may have cloned the model)?

If it's too late to retroactively fix this, let's add a comment about why we need this if :)

cc @younesbelkada since the model isn't available in a new PyPi release yet, we can probably update this.

There is a short window where we can do breaking changes yes, until next week.

This part still needs updating, no?

Pinging @younesbelkada here
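The `if` under discussion guards against prompts that do not already begin with the decoder start token. A minimal sketch of that logic follows, with plain Python lists standing in for tensors so it runs without PyTorch. The helper name `ensure_start_token` and the token ids are hypothetical, and every row is assumed to be non-empty. Like the `.all()` in the diff, it only prepends when no row in the batch starts with the token.

```python
def ensure_start_token(batch, decoder_start_token_id):
    """Prepend the start token to every row if no row already begins with it.

    Assumes every row is non-empty. If at least one row already starts with
    the token, the batch is returned unchanged, mirroring the `.all()` check.
    """
    if all(row[0] != decoder_start_token_id for row in batch):
        return [[decoder_start_token_id] + row for row in batch]
    return batch


START = 0  # hypothetical decoder start token id

without_start = [[5, 6], [7, 8]]
with_start = [[START, 5], [START, 7]]

print(ensure_start_token(without_start, START))
print(ensure_start_token(with_start, START))
```

If the tokenizer added the start token itself, as suggested in the thread, this branch would never fire and could be dropped.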

src/transformers/models/pix2struct/modeling_pix2struct.py

NielsRogge · 2023-04-07T17:55:39Z

PR is ready for review; however, checkpoints on the hub will need to be updated (is_encoder_decoder = True) before this PR can be merged.

sgugger

Changes LGTM and we can make breaking changes to the model before the release (probably next week).

sgugger · 2023-04-07T18:12:06Z

docs/source/en/model_doc/pix2struct.mdx

 - [Fine-tuning Notebook](https://github.com/huggingface/notebooks/blob/main/examples/image_captioning_pix2struct.ipynb)
+- [Fine-tuning Notebook on key-value pair dataset](https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Pix2Struct/Fine_tune_Pix2Struct_on_key_value_pair_dataset_(PyTorch_Lightning).ipynb)


Please remove this line or showcase a resource using our ecosystem.

sgugger · 2023-04-07T18:14:06Z

There is a short window where we can do breaking changes yes, until next week.

NielsRogge · 2023-04-10T14:31:32Z

PR is ready. Models on the hub don't need to be updated, since they don't have is_encoder_decoder set at the model config level (i.e. Pix2StructConfig); they set it only in Pix2StructTextConfig. cc @younesbelkada

younesbelkada

Thanks a lot for working on this Niels!

* Add model to doc tests
* Remove generate and replace by prepare_inputs_for_generation
* More fixes
* Remove print statements
* Update integration tests
* Fix generate
* Remove model from auto mapping
* Use auto processor
* Fix integration tests
* Fix test
* Add inference code snippet
* Remove is_encoder_decoder
* Update docs
* Remove notebook link


NielsRogge added 5 commits April 1, 2023 16:25

Add model to doc tests

f730caf

Remove generate and replace by prepare_inputs_for_generation

295bb54

More fixes

d435996

Remove print statements

3618ca6

Update integration tests

6913b2b

NielsRogge added 2 commits April 3, 2023 13:33

Fix generate

5c4016b

Remove model from auto mapping

556c7ff

NielsRogge commented Apr 3, 2023


Use auto processor

d953347

gante approved these changes Apr 4, 2023


NielsRogge added 2 commits April 5, 2023 16:23

Fix integration tests

a31848e

Fix test

27f47d0

NielsRogge marked this pull request as ready for review April 7, 2023 17:55

NielsRogge requested review from gante and sgugger and removed request for gante April 7, 2023 17:55

sgugger reviewed Apr 7, 2023


NielsRogge added 3 commits April 7, 2023 21:12

Add inference code snippet

22768af

Remove is_encoder_decoder

33daba7

Update docs

2d050b0

younesbelkada approved these changes Apr 10, 2023


Remove notebook link

4055a27

NielsRogge marked this pull request as draft April 12, 2023 10:05

NielsRogge marked this pull request as ready for review April 13, 2023 06:58

sgugger merged commit 8eb38f6 into huggingface:main Apr 13, 2023
