Encoder-decoder models: move embedding scale to nn.Module #30410
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
LGTM, thank you for fixing 🙌
There are more models where this inconsistency happens (e.g. MVP, NllbMoe, ...), would you be able to propagate the pattern?
@zucchini-nlp also, can you add a test covering this? We'll tag the core maintainer after we ensure there's a test!
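For context, a minimal sketch of what such an equivalence check could look like; `check_inputs_embeds_matches_input_ids` and `ToyTextModel` are hypothetical names, and real encoder-decoder models would also need decoder inputs and return model-output objects rather than plain tensors:

```python
import torch
import torch.nn as nn


def check_inputs_embeds_matches_input_ids(model: nn.Module, input_ids: torch.Tensor):
    """Forward the same tokens once as `input_ids` and once as pre-computed
    `inputs_embeds`, then check that the outputs match."""
    model.eval()
    with torch.no_grad():
        out_ids = model(input_ids=input_ids)
        inputs_embeds = model.get_input_embeddings()(input_ids)
        out_embeds = model(inputs_embeds=inputs_embeds)
    torch.testing.assert_close(out_ids, out_embeds)


class ToyTextModel(nn.Module):
    """Toy model whose forward accepts either `input_ids` or `inputs_embeds`."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        self.proj = nn.Linear(hidden_size, hidden_size)

    def get_input_embeddings(self):
        return self.embed_tokens

    def forward(self, input_ids=None, inputs_embeds=None):
        if inputs_embeds is None:
            inputs_embeds = self.embed_tokens(input_ids)
        return self.proj(inputs_embeds)


check_inputs_embeds_matches_input_ids(ToyTextModel(), torch.tensor([[1, 2, 3]]))
```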
Hmm, okay, I thought the test for generation with `inputs_embeds` would be enough. There is no such test, yes. If that's needed I can add it, and that may trigger handling other models/cases if there are peculiarities :(
Sorry for giving you tons of extra work here 😅 The result will be very cool in the long run, though 💪
Seems like it's ready for the core maintainer's review. I reverted the previous commit with skips and verified that checking the signature works for most cases. Where it does not, model-specific test skipping is kept.
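(For illustration, a sketch of the signature-check idea; `accepts_inputs_embeds` and `ToyModel` are hypothetical names, not the code used in the common test.)

```python
import inspect

import torch.nn as nn


def accepts_inputs_embeds(model: nn.Module) -> bool:
    """Return True if `inputs_embeds` appears in the model's forward signature."""
    return "inputs_embeds" in inspect.signature(model.forward).parameters


class ToyModel(nn.Module):
    """Stand-in for a real model; only the forward signature matters here."""

    def forward(self, input_ids=None, inputs_embeds=None):
        return input_ids if inputs_embeds is None else inputs_embeds


print(accepts_inputs_embeds(ToyModel()))  # True
```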
@unittest.skip(reason="""Bridge Tower does not have input/output embeddings. Thus this test is not applicable.""")
def test_inputs_embeds_matches_input_ids(self):
    pass
Shouldn't this (and most other skips) be caught in the skip conditions of the main test? What's missing?
Some models have all the pieces implemented but do not use `inputs_embeds` anywhere, even though they accept it in `forward`. I did not remove unused args like this for backward compatibility, but they can be cleaned up in another PR if needed.
IMO -- instead of adding custom test code to handle a silent failure, let's properly handle the failure (if the user passes `inputs_embeds` to `forward`, raise a `NotImplementedError`, which can be caught in the test).
You've opened a Pandora's box of work in this PR, but it will be great in the long run! 🤗
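A minimal sketch of that suggestion, assuming a toy model (`ToyEncoder` is a hypothetical class, not code from this PR): the model keeps `inputs_embeds` in its signature but raises instead of silently ignoring it.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Toy stand-in for a model that accepts `inputs_embeds` for API consistency
    but has no code path that actually uses it."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 16):
        super().__init__()
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids=None, inputs_embeds=None):
        if inputs_embeds is not None:
            # Fail loudly instead of silently ignoring the argument; the test
            # suite (or the user) can then handle this explicitly.
            raise NotImplementedError(
                f"`inputs_embeds` is not supported for {self.__class__.__name__}."
            )
        return self.embed_tokens(input_ids)


model = ToyEncoder()
hidden = model(input_ids=torch.tensor([[1, 2, 3]]))  # works
# model(inputs_embeds=hidden)                        # would raise NotImplementedError
```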
Hmm, so in cases where the model accepts `inputs_embeds` but does not use it, should we remove the `get_input_embeddings()` method? Or raise an error inside `forward` when `inputs_embeds` is passed?
I would like to entirely remove `inputs_embeds` from the `forward` argument list, but that leads to my question: should we have the same list of arguments in all `forward` methods in all models, regardless of usage? And would it break anything?
should we have the same list of arguments in all forwards in all models
Not necessarily, e.g. vision models don't accept `input_ids`. All models which are grouped together should have a common subset of inputs, which enables a full forward pass of the model. Most important is that the inputs have standardised names, e.g. we don't have `token_ids` for one model and `input_ids` for another, and that their behaviour is consistent, e.g. `inputs_embeds` "means" the same thing across models.
In the case of text models, I'd say yes: if they accept `input_ids`, then they should also accept `inputs_embeds`, and throw an error within the forward pass after `inputs_embeds` is passed.
Okay, thanks for clarifying. I added `NotImplementedError` where possible, leaving skips on some of the tests with explanations.
Btw, `test_inputs_embeds` can now also be cleaned up and all model-specific skips removed. I can do it later in another PR :)
The failing tests pass for me locally and are not related to these changes.
Thanks for fixing this - definitely a behaviour we want.
V. nicely handled and tested - just a few small things to address before merge
tests/test_modeling_common.py (outdated)
inputs = copy.deepcopy(self._prepare_for_class(inputs_dict, model_class))
pad_token_id = config.pad_token_id if config.pad_token_id is not None else 1
print(inputs.keys())
print(inputs.keys())
We shouldn't have any print statements in tests
tests/test_modeling_common.py (outdated)
pad_token_id = config.pad_token_id if config.pad_token_id is not None else 1
print(inputs.keys())
try:
We shouldn't have try/except patterns in tests: either we're deliberately triggering an error or we're not. Raising exceptions is for within code, allowing us to elegantly handle errors at runtime. Instead, models which don't use `inputs_embeds` should explicitly skip this test, and the try/except block should be removed.
Hmm, I thought we were adding all the `NotImplementedError`s to get rid of many copies of skipped tests, as discussed with @gante above:
IMO -- instead of adding custom test code to handle a silent failure, let's properly handle the failure (if the user passes inputs_embeds to forward, raise a NotImplementedError, which can be caught in the test)
I can bring back all the skips if that's needed for test consistency, but if we are specifically catching only `NotImplementedError`, isn't it okay?
Ah, OK, sorry, I missed the bit about handling in the tests.
We should raise `NotImplementedError` on the model side, but let users handle that however they want, and then explicitly skip in the tests using `unittest.skip` for the specific models. This avoids accidentally skipping because a different `NotImplementedError` is raised. It's true we end up with more code, but it's better for tests to be DAMP, i.e. very clear and explicit.
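A sketch of that pattern, with a hypothetical model and an illustrative skip reason (in the real suite the test class would also inherit the shared model tester mixin):

```python
import unittest


class ToyModelTest(unittest.TestCase):
    # Explicit, model-specific skip of the common test, rather than a
    # try/except inside the common test itself.
    @unittest.skip(reason="ToyModel accepts `inputs_embeds` but its forward raises NotImplementedError.")
    def test_inputs_embeds_matches_input_ids(self):
        pass


if __name__ == "__main__":
    unittest.main()
```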
hehe the opposite of what I've been telling @zucchini-nlp in this PR 🙈 My bad :D
Not at all - the test suite is one of the most inconsistent places in our codebase 😬
Ah, okay, TIL about the DAMP principle
For the musicgen tests, a fix was pushed to main; rebasing should resolve them.
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
@amyeroberts done, all the models raising `NotImplementedError` are skipped in their modeling test files and the CI is green.
Beautiful - thanks for fixing this tricky issue and making sure it's well tested!
What does this PR do?
This PR moves the embedding scale into an `nn.Module` in encoder-decoder models, so that users who pass `inputs_embeds` to `forward` get the same results as if they had passed `input_ids`, where `inputs_embeds = model.get_input_embeddings()(input_ids)`.
Generation from embeddings is not supported for these models, which is why the inconsistency went unnoticed. I don't think we need specific tests here; I can add support for generation from embeddings in another PR if needed.
All the tests (+slow) for the changed models pass on my end.
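A minimal sketch of the pattern, assuming a hypothetical `ScaledWordEmbedding` module (the classes actually touched by the PR may differ): with the scale applied inside the embedding module, `get_input_embeddings()(input_ids)` already returns scaled embeddings, so the `inputs_embeds` path matches the `input_ids` path.

```python
import math

import torch
import torch.nn as nn


class ScaledWordEmbedding(nn.Embedding):
    """Embedding layer that applies `embed_scale` internally, so the parent
    model no longer scales the embeddings after the lookup."""

    def __init__(self, num_embeddings: int, embedding_dim: int, embed_scale: float = 1.0):
        super().__init__(num_embeddings, embedding_dim)
        self.embed_scale = embed_scale

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return super().forward(input_ids) * self.embed_scale


d_model = 16
embed_tokens = ScaledWordEmbedding(100, d_model, embed_scale=math.sqrt(d_model))
input_ids = torch.tensor([[4, 8, 15]])

# Because the scale lives inside the module, embeddings produced via the
# embedding module are already scaled, so passing them as `inputs_embeds`
# matches the `input_ids` path.
scaled = embed_tokens(input_ids)
assert torch.allclose(scaled, embed_tokens.weight[input_ids] * embed_tokens.embed_scale)
```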