Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update test_batched_inference_image_captioning_conditioned #23391

Merged
merged 3 commits into from
May 16, 2023
Merged

Conversation

ydshieh
Copy link
Collaborator

@ydshieh ydshieh commented May 16, 2023

What does this PR do?

The test tests/models/pix2struct/test_modeling_pix2struct.py::Pix2StructIntegrationTest::test_batched_inference_image_captioning_conditioned starts to fail on CI run of April 27 which includes the merged PR #23023.

@younesbelkada Could you double check if the changes in this PR are reasonable? Thank you.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented May 16, 2023

The documentation is not available anymore as the PR was closed or merged.

Copy link
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the PR;
I think that we should educate users that for text-conditioned generation we should never add special tokens to the tokenizer - as introduced in #23004
I think here the model sees An photography of<|endoftext|>' which leads it to generate weird output right after.
As users might check these tests as a reference, I think we should encode in the test the input without the special tokens (i.e. force add_special_tokens=True) and document that somewhere (I can take care of that in a follow up PR)
What do you think?

Copy link
Contributor

@younesbelkada younesbelkada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But in principle, that solution LGTM as well! You can merge that and I can address what I have said later

@ydshieh
Copy link
Collaborator Author

ydshieh commented May 16, 2023

@younesbelkada If you can take over this PR to avoid generate weird output right after, it would be really nice :-)

@younesbelkada
Copy link
Contributor

The test should be now fixed, the generated text produces different output than before, probably due to #23051 that now made the model using a causal attention mask on the text decoder (which was not the case before)

@ydshieh
Copy link
Collaborator Author

ydshieh commented May 16, 2023

Thanks a lot!

@amyeroberts
Copy link
Collaborator

I think that we should educate users that for text-conditioned generation we should never add special tokens to the tokenizer - as introduced in #23004

@younesbelkada The best place to do this I think is in the example docstring for the model, as this is what a lot of users will reference, and it currently doesn't do that. Could you open a PR to update this?

Copy link
Collaborator

@amyeroberts amyeroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating!

Changes look fine - my only concern is that the generations appear to have become worse. @younesbelkada @ydshieh do we have any other generation samples to make sure the model is behaving as expected?

@younesbelkada
Copy link
Contributor

@younesbelkada The best place to do this I think is in the example docstring for the model, as this is what a lot of users will reference, and it currently doesn't do that. Could you open a PR to update this?

Sure yes will do!

Changes look fine - my only concern is that the generations appear to have become worse. @younesbelkada @ydshieh do we have any other generation samples to make sure the model is behaving as expected?

Yes! I was relieved since we do have the tests test_batched_inference_image_captioning & test_inference_image_captioning that still pass --> meaning that the un-conditional text generation seem to be unaffected!


predictions = model.generate(**inputs)

self.assertEqual(
processor.decode(predictions[0], skip_special_tokens=True), "A picture of a stop sign that says yes."
processor.decode(predictions[0], skip_special_tokens=True),
"A picture of a stop sign with a red stop sign on it.",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, seeing this image, I would say this new prediction is better than the previous one.

)

self.assertEqual(
processor.decode(predictions[1], skip_special_tokens=True),
"An photography of the Temple Bar and a few other places.",
"An photography of the Temple Bar and the Temple Bar.",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not better, but I would say the previous about a few other places is not super good neither.

@ydshieh
Copy link
Collaborator Author

ydshieh commented May 16, 2023

@younesbelkada The best place to do this I think is in the example docstring for the model, as this is what a lot of users will reference, and it currently doesn't do that. Could you open a PR to update this?

Sure yes will do!

I am going to merge this PR and leave @amyeroberts 's suggestion for @younesbelkada in a separate PR. Thank you for the review and the refine of this PR.

@ydshieh ydshieh merged commit 21741e8 into main May 16, 2023
@ydshieh ydshieh deleted the fix_pix2stru branch May 16, 2023 12:49
gojiteji pushed a commit to gojiteji/transformers that referenced this pull request Jun 5, 2023
…face#23391)

* fix

* fix

* fix test + add more docs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
…face#23391)

* fix

* fix

* fix test + add more docs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants