Adds PreTrainedModel.framework attribute #13817

Merged
merged 20 commits into huggingface:master on Oct 8, 2021

Conversation

StellaAthena
Contributor

What does this PR do?

This PR introduces an attribute called framework in PreTrainedModel, FlaxPreTrainedModel, and TFPreTrainedModel. The purpose of this attribute is to allow a user to know what framework a provided model is in, as that information is not currently very accessible.
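
A rough sketch of how such an attribute could be used from calling code, assuming it returns the short framework identifiers "pt", "tf", and "flax"; the count_parameters helper here is hypothetical and not part of transformers:

from transformers import AutoModel

def count_parameters(model):
    # Hypothetical helper: dispatch on model.framework instead of inspecting the class name.
    if model.framework == "pt":
        return sum(p.numel() for p in model.parameters())
    if model.framework == "tf":
        return model.count_params()
    if model.framework == "flax":
        import jax
        return sum(x.size for x in jax.tree_util.tree_leaves(model.params))
    raise ValueError(f"Unknown framework: {model.framework}")

model = AutoModel.from_pretrained("bert-base-uncased")
print(model.framework)           # expected "pt" for a PyTorch model
print(count_parameters(model))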

I'm a little confused as to whether this is correctly implemented. I based it on the implementation of base_model_prefix, which doesn't have a getattr in FlaxPreTrainedModel and TFPreTrainedModel despite those not (AFAICT) inheriting from PreTrainedModel.

Who can review?

@patil-suraj @LysandreJik

@LysandreJik
Member

Great, thank you @StellaAthena! I'm not sure I see the full picture of adding that argument - but I'm definitely not opposed to it if it's helpful for your use case. It's more robust than relying on the class name.

I believe the fact that the property is implemented in PyTorch (and not in Flax and TensorFlow) isn't deliberate: the former was implemented early (two years ago), and the latter was overlooked.

For this property in particular (framework), I believe having it as a simple attribute should be enough for all three frameworks.
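
For illustration, a plain class attribute on each of the three base classes might look like the following; these are stand-in class definitions, not the real transformers code:

# Stand-ins for the real base classes, shown only to illustrate the idea.
class PreTrainedModel:
    framework = "pt"      # PyTorch models

class TFPreTrainedModel:
    framework = "tf"      # TensorFlow models

class FlaxPreTrainedModel:
    framework = "flax"    # Flax/JAX models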

Thanks for offering a PR!

@patil-suraj
Contributor

Thanks for the PR!

The base_model_prefix serves a different purpose: it indicates the attribute name used for the base module in a model with a specific head on top. For example, the base_model_prefix for BERT is "bert", which the head models use as the attribute name for the base model:

class BertForSequenceClassification(BertPreTrainedModel):
    def __init__(self, config):
        super().__init__(config)
        self.num_labels = config.num_labels
        self.config = config
        self.bert = BertModel(config)  # attribute name matches base_model_prefix = "bert"

This attribute is useful when loading base model weights into a model with a head. The reason the base_model property is only added to the PyTorch PreTrainedModel and not to FlaxPreTrainedModel is that in PyTorch it's possible to return a submodule, through which the user can access the base model if needed (for example, to freeze it).
This is not possible in Flax, because Flax modules are stateless: returning base_model would return a reference to the module without weights. Hope this makes it clear.
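
To illustrate that PyTorch-only base_model property, here is a small self-contained sketch; the toy HeadModel and the nn.Linear standing in for BertModel are illustrative, not the actual library code:

import torch.nn as nn

class HeadModel(nn.Module):
    # Toy stand-in for a model with a head, e.g. BertForSequenceClassification.
    base_model_prefix = "bert"

    def __init__(self):
        super().__init__()
        self.bert = nn.Linear(4, 2)  # stand-in for BertModel(config)

    @property
    def base_model(self) -> nn.Module:
        # Look up the submodule named by base_model_prefix; fall back to self for base models.
        return getattr(self, self.base_model_prefix, self)

model = HeadModel()
for param in model.base_model.parameters():  # e.g. freeze the base model
    param.requires_grad = False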

As for the framework property, IMO we could simply add it as a getter property that returns the framework string. Adding it just as a getter will also prevent users from accidentally setting it.
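
A minimal sketch of that getter-only version, again with a stand-in class and assuming "pt" as the returned string for PyTorch:

class PreTrainedModel:  # stand-in for the real base class
    @property
    def framework(self) -> str:
        # Getter only: there is no setter, so assignment raises AttributeError.
        return "pt"

model = PreTrainedModel()
print(model.framework)    # "pt"
# model.framework = "tf"  # AttributeError: can't set attribute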

@StellaAthena
Contributor Author

Thank you both for the explanation; it helps me understand why the transformers code is the way it is.

Great, thank you @StellaAthena! I'm not sure I see the full picture of adding that argument - but I'm definitely not opposed to it if it's helpful for your use case. It's more robust than relying on the class name.

When writing code that takes a user-defined transformers model as an input, there are a lot of weird gotchas. The impetus for this PR was my attempt to generalize Google's BIG Bench to work with arbitrary transformer models, but I suspect it'll also be useful to EleutherAI's LM Eval Harness and other similar projects. Unfortunately, there are important properties of models that are impossible to derive from the config file. Another example of this is the fact that some tokenizers automatically append tokens to the end of generations while others do not.

As for the framework property, IMO we could simply add it as a getter property that returns the framework string. Adding it just as a getter will also prevent users from accidentally setting it.

That's an interesting idea. My thought was that this approach would cause the value to be encoded in config files, which seems like a good practice to follow.

@StellaAthena
Contributor Author

StellaAthena commented Oct 4, 2021

@patil-suraj I have updated the code to follow your suggestion. The failing tests seem to have to do with an indentation error that I cannot work out. I even copied an existing function rather than write my own, in case there was something funky about how my keyboard was registering!

Edit: it looks like I was being fooled by a misleading error message! Changing string to str solved the problem.

StellaAthena marked this pull request as ready for review October 4, 2021 03:46
@LysandreJik left a comment
Member

Thanks for the explanation and for working on this @StellaAthena, this looks good to me! There's an issue with code quality; do you mind running the code quality script? You can do so like this, from the root of your transformers clone:

pip install -e .[quality]
make fixup

@patil-suraj left a comment
Contributor

LGTM!

@StellaAthena
Contributor Author

@LysandreJik @patil-suraj I don't think I can do any more. I'm having trouble installing Jax, which may be the blocker? IDK. The below image shows me running make fixup and then the verification test that the readout asks me to run.

[Screenshot: Screen Shot 2021-10-07 at 3 21 29 PM, showing the output of make fixup and the follow-up verification command]

@patil-suraj
Contributor

Hi @StellaAthena, no problem. I could take care of this; would it be okay if I push to your branch?

@StellaAthena
Contributor Author

Hi @StellaAthena, no problem. I could take care of this; would it be okay if I push to your branch?

Absolutely! Thanks

patil-suraj merged commit de34481 into huggingface:master on Oct 8, 2021
@LysandreJik
Member

Thanks for working on the PR @StellaAthena!
