
from_pretrained: check that the pretrained model is for the right model architecture #10586

Merged: 7 commits into huggingface:master on Mar 18, 2021

Conversation

@vimarshc (Contributor) commented Mar 8, 2021

What does this PR do?

This PR adds a check to the from_pretrained workflow to verify that the pretrained model name passed in belongs to the model architecture being instantiated.
The same check still needs to be added for the Tokenizer.

Fixes #10293

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@LysandreJik (Member):

Hi @vimarshc, thank you for opening this PR! Could you:

  • rebase your PR on the most recent master so that the failing tests don't fail anymore
  • run make fixup at the root of your repository to fix the code quality issue (more information on this in step 5 of this document)

@stas00 (Contributor) commented Mar 8, 2021

Awesome!

Would you like to attempt to add a test for this check?

We need to use tiny models so it's fast; I made suggestions here:
#10293 (comment)

If you're not sure how to do it please let me know and I will add a test.

@vimarshc (Contributor, Author) commented Mar 9, 2021

Hi @stas00,
I'd like to add the tests myself if that's OK. I also need to add the same check to the Tokenizer's from_pretrained, but that isn't as straightforward: the Tokenizer's from_pretrained is written with some assumptions in mind, and I'm not entirely sure where to add the check. Here's the from_pretrained method for Tokenizers.

Regardless, I'll try to add the test for the assertion I've already added, along with the changes mentioned by @LysandreJik, in the next 24 hours.

@stas00 (Contributor) commented Mar 9, 2021

OK, so your change works for the model and the config:

PYTHONPATH=src python -c 'from transformers import PegasusForConditionalGeneration; PegasusForConditionalGeneration.from_pretrained("patrickvonplaten/t5-tiny-random")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/modeling_utils.py", line 975, in from_pretrained
    config, model_kwargs = cls.config_class.from_pretrained(
  File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/configuration_utils.py", line 387, in from_pretrained
    assert (
AssertionError: You tried to initiate a model of type 'pegasus' with a pretrained model of type 't5'

same for:

PYTHONPATH=src python -c 'from transformers import PegasusConfig; PegasusConfig.from_pretrained("patrickvonplaten/t5-tiny-random")'

As you discovered (and I didn't know), the tokenizer doesn't seem to need the config file, so it doesn't look like there is a way to check that the tokenizer being downloaded is of the right kind. I will ask.

And yes, it's great if you can add the test - thank you.

I restyled your PR to fit our style guide. We don't use that format; you need to run the code through make fixup or make style (slower) before committing, otherwise CI may fail. This is what @LysandreJik was requesting.
https://github.com/huggingface/transformers/blob/master/CONTRIBUTING.md#start-contributing-pull-requests

So please git pull your branch to get my updates.

@stas00 changed the title from "Issue 10293: Checks for from_pretrained" to "from_pretrained: check that the pretrained model is for the right model architecture" on Mar 9, 2021
@vimarshc (Contributor, Author) commented Mar 9, 2021

Hi @stas00,
Thanks for the update.
I'll pull the changes, add the test, and go through the checklist before pushing. I'll try to push in a few hours.

@stas00 (Contributor) commented Mar 9, 2021

I'm puzzled: why did you undo my fix? If you want to restore it, it was:

--- a/src/transformers/configuration_utils.py
+++ b/src/transformers/configuration_utils.py
@@ -384,6 +384,9 @@ class PretrainedConfig(object):

         """
         config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
+        assert (
+            config_dict["model_type"] == cls.model_type
+        ), f"You tried to initiate a model of type '{cls.model_type}' with a pretrained model of type '{config_dict['model_type']}'"
         return cls.from_dict(config_dict, **kwargs)

     @classmethod

@vimarshc (Contributor, Author) commented Mar 9, 2021

Hi,
Apologies.
I rebased my branch and assumed I had to force push, which deleted your changes.

@vimarshc (Contributor, Author) commented Mar 9, 2021

Hi,
I have added the tests.
Everything seems to be working fine.

However, I pushed after pulling from master, and yet it's showing a merge conflict. Not sure how that got there.

@stas00 (Contributor) commented Mar 9, 2021

You messed up your PR branch, so this PR now contains dozens of unrelated changes.

You can do a soft reset to the last good sha, e.g.:

git reset --soft d70a770
git commit
git push -f

Just save your newly added test code somewhere first.

@stas00 (Contributor) commented Mar 9, 2021

I think you picked the wrong sha and ended up with an even worse situation. Try d70a770 as I suggested.

model = BertModel.from_pretrained(TINY_BERT)
self.assertIsNotNone(model)

self.assertRaises(AssertionError, BertModel.from_pretrained, TINY_T5)
Inline review comment (Contributor) on the test lines above:

Suggested change:

-    self.assertRaises(AssertionError, BertModel.from_pretrained, TINY_T5)
+    with self.assertRaises(Exception) as context:
+        BertModel.from_pretrained(TINY_T5)
+    self.assertTrue("You tried to initiate a model of type" in str(context.exception))

Let's check the actual assert message here, just in case it asserts on something else, in which case this test would be misleading.

Just please test that it works. Thank you.
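For reference, here is a self-contained sketch of what the suggested test could look like. The tiny checkpoint identifiers are taken from elsewhere in this thread; whether the merged test uses exactly these names and this structure is an assumption.

import unittest

from transformers import BertModel

# Tiny checkpoints used purely for speed; these identifiers appear earlier in
# this thread and may differ from what the final test actually uses.
TINY_BERT = "prajjwal1/bert-tiny"
TINY_T5 = "patrickvonplaten/t5-tiny-random"


class FromPretrainedArchCheckTest(unittest.TestCase):
    def test_model_type_mismatch(self):
        # A matching checkpoint should load fine.
        model = BertModel.from_pretrained(TINY_BERT)
        self.assertIsNotNone(model)

        # A t5 checkpoint loaded into a BERT class should raise, and the
        # message should come from the new sanity check, not something else.
        with self.assertRaises(Exception) as context:
            BertModel.from_pretrained(TINY_T5)
        self.assertTrue("You tried to initiate a model of type" in str(context.exception))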

Commit pushed: "Modified assert in from_pretrained with f-strings. Modified test to ensure desired assert message is being generated"
@stas00 (Contributor) left a review:

Looks very good. Thank you for bearing with my requests.

It looks like this check found some bugs in our code, so we will need to resolve those before merging. I will update you when this is done.

@stas00 requested a review from @LysandreJik on March 9, 2021, 21:34
@stas00 (Contributor) commented Mar 9, 2021

OK, so looking at the errors, we need to solve two issues:

Issue 1.

        assert (
>           config_dict["model_type"] == cls.model_type
        ), f"You tried to initiate a model of type '{cls.model_type}' with a pretrained model of type '{config_dict['model_type']}'"
E       KeyError: 'model_type'

So some models don't have the model_type key.

@vimarshc, I suppose you need to edit the code to skip this assert if we don't have the data.

You can verify that your change works with this test:

pytest -sv tests/test_trainer.py::TrainerIntegrationTest -k test_early_stopping_callback

I looked at the config.json generated by this test and it's:

{
  "a": 0,
  "architectures": [
    "RegressionPreTrainedModel"
  ],
  "b": 0,
  "double_output": false,
  "transformers_version": "4.4.0.dev0"
}

So it's far from being complete.
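A minimal sketch of how the assert could be guarded, assuming it stays in PretrainedConfig.from_pretrained as in the diff above (the merged code may differ):

config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)

# Only enforce the check when both the class and the loaded config declare a
# model type; configs like the test-only one above have no "model_type" key.
if cls.model_type and config_dict.get("model_type"):
    assert config_dict["model_type"] == cls.model_type, (
        f"You tried to initiate a model of type '{cls.model_type}' "
        f"with a pretrained model of type '{config_dict['model_type']}'"
    )

return cls.from_dict(config_dict, **kwargs)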

Issue 2

This one looks trickier:

E       AssertionError: You tried to initiate a model of type 'blenderbot-small' with a pretrained model of type 'blenderbot'

We will ask for help with this one.

@stas00 (Contributor) commented Mar 9, 2021

@patrickvonplaten, @patil-suraj - your help is needed here.

BlenderbotSmall has an inconsistency. It declares its model type as "blenderbot-small":

src/transformers/models/auto/configuration_auto.py:        ("blenderbot-small", BlenderbotSmallConfig),
src/transformers/models/auto/configuration_auto.py:        ("blenderbot-small", "BlenderbotSmall"),
src/transformers/models/blenderbot_small/configuration_blenderbot_small.py:    model_type = "blenderbot-small"

but the pretrained models all use model_type: blenderbot: https://huggingface.co/facebook/blenderbot-90M/blob/main/config.json

So the new sanity check that this PR is trying to add fails.

        config_dict, kwargs = cls.get_config_dict(pretrained_model_name_or_path, **kwargs)
>       assert (
            config_dict["model_type"] == cls.model_type
        ), f"You tried to initiate a model of type '{cls.model_type}' with a pretrained model of type '{config_dict['model_type']}'"
E       AssertionError: You tried to initiate a model of type 'blenderbot-small' with a pretrained model of type 'blenderbot'

What shall we do?

It's possible that this part of the config object needs to be redesigned, so that there is a top-level architecture/type and then perhaps sub-types?
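Purely as an illustration of that idea (these field names are hypothetical and do not exist in the library), a config could declare both a family and a variant:

# Hypothetical sketch only -- neither this field layout nor this check exists in transformers.
config_dict = {
    "model_family": "blenderbot",       # top-level architecture/type
    "model_type": "blenderbot-small",   # concrete sub-type
}

# A sanity check could then match on the family first and fall back to the
# exact model_type only when no family is declared.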

@vimarshc (Contributor, Author):
Hi @stas00
Will add the check you mentioned today.

@stas00 (Contributor) commented Mar 10, 2021

Looks good, @vimarshc

So we are down to one failing test:

tests/test_modeling_blenderbot_small.py::Blenderbot90MIntegrationTests::test_90_generation_from_short_input

@stas00 (Contributor) commented Mar 11, 2021

I wonder if we could sort of cheat and do:

if not cls.model_type in config_dict["model_type"]: assert ...

This will check whether the main type matches as a substring of the sub-type. It's not a precise solution, but it will probably catch the majority of mismatches.

Actually, for t5/mt5 it's reversed: the model_type values are t5 and mt5, but both may have T5ForConditionalGeneration as the architecture
(https://huggingface.co/google/mt5-base/blob/main/config.json#L16), since MT5ForConditionalGeneration is a copy of T5ForConditionalGeneration whose only difference is having model_type = "mt5".

So I think this check could fail in some situations. In that case we could perhaps check whether one is a substring of the other, in either direction:

if not (cls.model_type in config_dict["model_type"] or config_dict["model_type"] in cls.model_type): assert ...

So this proposes a sort of fuzzy match.
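Spelled out, that fuzzy match might look roughly like this inside the same from_pretrained check (a sketch of the idea only; as noted just below, it has counter-examples):

loaded_type = config_dict["model_type"]

# Fuzzy match: accept if either type is a substring of the other,
# e.g. a "t5" checkpoint loaded by an "mt5" class or the other way around.
if not (cls.model_type in loaded_type or loaded_type in cls.model_type):
    raise AssertionError(
        f"You tried to initiate a model of type '{cls.model_type}' "
        f"with a pretrained model of type '{loaded_type}'"
    )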

@patil-suraj (Contributor) commented Mar 12, 2021

> BlenderbotSmall has an inconsistency. It declares its model type as "blenderbot-small":

@stas00 You are right. Before the BART refactor all blenderbot models shared the same model class, but the config was not updated after the refactor. The model_type on the hub should be blenderbot-small. I will fix that.

@patil-suraj (Contributor):
I updated the config https://huggingface.co/facebook/blenderbot-90M/blob/main/config.json.

And actually, there's a new version of blenderbot-90M: https://huggingface.co/facebook/blenderbot_small-90M

It's actually the same model, but with the proper name. The BlenderbotSmall test uses blenderbot-90M, which should be changed to use this new model.

@vimarshc (Contributor, Author):
Hi @stas00,
The fuzzy match approach will not work for the case 'distilbert' vs 'bert'.

@stas00 (Contributor) commented Mar 12, 2021

> Hi @stas00,
> The fuzzy match approach will not work for the case 'distilbert' vs 'bert'.

That's an excellent counter-example! I did propose it as something that would only mostly work ;)

But it looks like your original solution will now work after @patil-suraj's fix.

Some unrelated test is failing; I rebased this branch, so let's see if it will be green now.

@stas00 (Contributor) commented Mar 12, 2021

> I updated the config https://huggingface.co/facebook/blenderbot-90M/blob/main/config.json.
>
> And actually, there's a new version of blenderbot-90M: https://huggingface.co/facebook/blenderbot_small-90M
>
> It's actually the same model, but with the proper name. The BlenderbotSmall test uses blenderbot-90M, which should be changed to use this new model.

Thank you, Suraj!

Since it's sort of related to this PR, do you want to push the change in here, or do it in another PR?

@stas00 (Contributor) commented Mar 12, 2021

Oh bummer, we have 2 more in TF land:

FAILED tests/test_modeling_tf_flaubert.py::TFFlaubertModelTest::test_compile_tf_model
FAILED tests/test_modeling_tf_flaubert.py::TFFlaubertModelTest::test_save_load

same issue for both tests:

E           AssertionError: You tried to initiate a model of type 'xlm' with a pretrained model of type 'flaubert'

@LysandreJik, who can help resolve this one? Thank you!

@LysandreJik (Member):
Yes, I'll take a look as soon as possible!

@LysandreJik (Member):
I fixed the tests related to FlauBERT. The Flax test is a flaky one that @patrickvonplaten is working on, and it should not block this PR.

@LysandreJik (Member) left a review:

Overall LGTM. I would like to merge this after v4.4.0, which comes out tomorrow, so that we have time to test it on the master branch before putting it in a release.

@stas00 (Contributor) commented Mar 15, 2021

Thank you for taking care of this, @LysandreJik

I suppose we will handle the same validation for the Tokenizer in another PR.

@LysandreJik (Member):
With the tokenizer it'll likely be a bit more complex, as it is perfectly possible to have decoupled models/tokenizers, e.g., a BERT model with a different tokenizer, as is the case with BERTweet (config.json).

@stas00 (Contributor) commented Mar 15, 2021

Indeed, I think this will require a change where a required tokenizer_config.json identifies which architecture it belongs to. It should still be possible to mix a model and a tokenizer from different architectures, but it shouldn't fail with random misleading errors like:

python -c 'from transformers import BartTokenizer; BartTokenizer.from_pretrained("prajjwal1/bert-tiny")'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/mnt/nvme1/code/huggingface/transformers-master/src/transformers/tokenization_utils_base.py", line 1693, in from_pretrained
    raise EnvironmentError(msg)
OSError: Can't load tokenizer for 'prajjwal1/bert-tiny'. Make sure that:

- 'prajjwal1/bert-tiny' is a correct model identifier listed on 'https://huggingface.co/models'

- or 'prajjwal1/bert-tiny' is the correct path to a directory containing relevant tokenizer files

Instead, it should indicate to the user that they got either the wrong tokenizer class or the wrong tokenizer identifier, since the above error is misleading: the identifier is correct.

As can be seen from:

python -c 'from transformers import BertTokenizer; BertTokenizer.from_pretrained("prajjwal1/bert-tiny")'

which works.

(It also erroneously says "model identifier" when there is no model involved here, but that's an unrelated minor issue.)

And of course there are many other ways I have seen this mismatch fail, usually much noisier when some file is missing.
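One possible shape for such a check, purely as a sketch: it assumes a tokenizer_config.json that always records the tokenizer class that saved it, which is not something the library requires today, and the helper name is made up for illustration.

import json

def check_tokenizer_class(cls, tokenizer_config_path):
    # Hypothetical helper: compare the class trying to load the files against
    # the class recorded at save time, and fail with a clear message.
    with open(tokenizer_config_path) as f:
        tokenizer_config = json.load(f)
    saved_class = tokenizer_config.get("tokenizer_class")
    if saved_class is not None and saved_class != cls.__name__:
        raise ValueError(
            f"You tried to load a '{cls.__name__}' tokenizer from files saved "
            f"by '{saved_class}'. Use the matching tokenizer class or AutoTokenizer."
        )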

@stas00 (Contributor) commented Mar 18, 2021

@LysandreJik, I rebased this PR and it looks good. v4.4.0 is out so we can probably merge this one now.

Thank you.

@LysandreJik (Member):
Indeed, this is great! Thanks a lot @vimarshc and @stas00 for working on this.

@LysandreJik merged commit 094afa5 into huggingface:master on Mar 18, 2021
@stas00 (Contributor) commented Mar 18, 2021

So should I create a new issue for doing the same for the Tokenizers? I think it'd be much more complicated, since at the moment we don't save any tokenizer data that puts the tokenizer in any category/architecture.

@vimarshc (Contributor, Author):
Hi,
Thanks, @stas00, for providing the guidance to close this issue. This is my first contribution to transformers, so you can imagine my excitement. :D
I understand that a similar change for the Tokenizer will be a bit more complicated. I'd love to take a shot at fixing that as well. :)

@stas00 (Contributor) commented Mar 19, 2021

I'm glad to hear it was a good experience for you, @vimarshc.

I'm not quite sure yet how to tackle the same thing for tokenizers. I will try to remember to tag you if we come up with an idea on how to approach this task.

Iwontbecreative pushed a commit to Iwontbecreative/transformers that referenced this pull request on Jul 15, 2021: from_pretrained: check that the pretrained model is for the right model architecture (huggingface#10586)

* Added check to ensure model name passed to from_pretrained and model are the same

* Added test to check from_pretrained throws assert error when passed an incompatible model name

* Modified assert in from_pretrained with f-strings. Modified test to ensure desired assert message is being generated

* Added check to ensure config and model has model_type

* Fix FlauBERT heads

Co-authored-by: vimarsh chaturvedi <vimarsh chaturvedi>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Linked issue: [pretrained] model classes aren't checking the arch of the pretrained model it loads