Add T5 Encoder for Feature Extraction #8717
Conversation
I like it!
I like this addition a lot!
We should also include this new model in the documentation here:
https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/t5.rst#t5model
and add it to the tests here:
transformers/tests/test_modeling_t5.py, line 474 (at 2c83b3c):
all_model_classes = (T5Model, T5ForConditionalGeneration) if is_torch_available() else ()
=> There might be some issues with the tests after adding the model... I'm happy to go into your PR and fix them accordingly :-)
And we can then also add the model to mT5 :-)
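For illustration, here is a sketch of how that tuple might look with the new class added (T5EncoderModel is the name the PR ends up merging under; earlier commits call it T5ModelEncoder):

```python
# Sketch of the updated fragment of tests/test_modeling_t5.py;
# T5EncoderModel is the encoder-only class this PR introduces.
all_model_classes = (
    (T5Model, T5ForConditionalGeneration, T5EncoderModel)
    if is_torch_available()
    else ()
)
```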
Thank you very much for this addition!!
@patrickvonplaten is the T5 master here, so I'll let him review from the modeling point of view. For TF, I think we should wait for the PR about the new inputs to be merged. Also, a tiny comment.
self,
inputs,
attention_mask=None,
encoder_outputs=None,
past_key_values=None,
head_mask=None,
inputs_embeds=None,
use_cache=None,
output_attentions=None,
output_hidden_states=None,
return_dict=None,
training=False,
**kwargs,
I think we can remove the encoder_outputs, past_key_values and use_cache parameters from the list to avoid confusion. Wdyt @patrickvonplaten?
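For context, a sketch of the trimmed TF call signature once those three decoder-only arguments are dropped (body elided):

```python
# Encoder-only call signature after removing encoder_outputs,
# past_key_values, and use_cache (sketch; body elided).
def call(
    self,
    inputs,
    attention_mask=None,
    head_mask=None,
    inputs_embeds=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
    training=False,
    **kwargs,
):
    ...
```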
change output from Seq2SeqModelOutput to BaseModelOutput
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
1. remove encoder_outputs from the function call. 2. remove the encoder_outputs if statement. 3. remove isinstance from return_dict.
remove past_key_values and use_cache
remove use_cache from the forward method
add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
Great, I am glad that you liked it. Thanks @patrickvonplaten and @jplu for your feedback. Is there anything else needed from my side to merge the pull request?
I think that's great! I'll fix the tests and merge :-)
@@ -554,7 +554,7 @@ class RagDPRT5Test(RagTestMixin, unittest.TestCase):
    def config_and_inputs(self):
        question_encoder_tester = DPRModelTester(self)
        dpr_config_and_inputs = question_encoder_tester.prepare_config_and_inputs()
        generator_tester = T5ModelTester(self, vocab_size=1100, n_positions=30)
n_positions does not exist anymore => it was useless, so remove it from the tests as well.
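The generator line above would then become, roughly:

```python
# n_positions dropped, as noted above
generator_tester = T5ModelTester(self, vocab_size=1100)
```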
Update: PR is ready for review IMO. Would be great if @LysandreJik, @jplu and @sgugger could take a look :-)
Thanks a lot @patrickvonplaten ^_^
Very clean implementation, thanks a lot @agemagician!
LGTM! Great work!!
config_and_inputs = self.model_tester.prepare_config_and_inputs()
self.model_tester.create_and_check_model(*config_and_inputs)

# is not able to be part of a pipeline
What does it mean, "not able to be part of a pipeline"? Is it because the test fails?
Since T5Encoder just outputs the encoded hidden states, it cannot really be used in combination with pipeline().
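To illustrate: the encoder-only model returns raw token-level hidden states, so any post-processing a pipeline would normally do has to be done by hand. A minimal sketch, assuming the merged class name T5EncoderModel and using masked mean-pooling as one arbitrary choice:

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer(["an example sentence to embed"], return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, d_model)

# Pool over non-padding tokens to get one vector per input sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```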
You are welcome.
* Add T5 Encoder class for feature extraction
* fix T5 encoder add_start_docstrings indent
* update init with T5 encoder
* update init with TFT5ModelEncoder
* remove TFT5ModelEncoder
* change T5ModelEncoder order in init
* add T5ModelEncoder to transformers init
* clean T5ModelEncoder
* update init with TFT5ModelEncoder
* add TFModelEncoder for Tensorflow
* update init with TFT5ModelEncoder
* Update src/transformers/models/t5/modeling_t5.py: change output from Seq2SeqModelOutput to BaseModelOutput (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* remove encoder_outputs: 1. remove encoder_outputs from the function call. 2. remove the encoder_outputs if statement. 3. remove isinstance from return_dict.
* Authorize missing decoder keys
* remove unnecessary input parameters: remove past_key_values and use_cache
* remove use_cache from the forward method
* add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
* change return_dict to dot access
* add T5_ENCODER_INPUTS_DOCSTRING for TF T5
* change TFT5Encoder output type to BaseModelOutput
* remove unnecessary parameters for TFT5Encoder
* remove unnecessary if statement
* add import BaseModelOutput
* fix BaseModelOutput typo to TFBaseModelOutput
* update T5 doc with T5ModelEncoder
* add T5ModelEncoder to tests
* finish pytorch
* finish docs and mt5
* add mt5 to init
* fix init
* remove n_positions
* finish PR
* Update src/transformers/models/mt5/modeling_mt5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/t5/modeling_t5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/t5/modeling_tf_t5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/mt5/modeling_tf_mt5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
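Reading the commit list together, the resulting module is essentially a thin wrapper around the existing encoder stack. A simplified sketch of the idea (not the exact merged code; T5PreTrainedModel and T5Stack are the pre-existing internals it reuses):

```python
import copy
from torch import nn

class T5EncoderModel(T5PreTrainedModel):  # sketch; see modeling_t5.py for the real class
    def __init__(self, config):
        super().__init__(config)
        self.shared = nn.Embedding(config.vocab_size, config.d_model)

        encoder_config = copy.deepcopy(config)
        encoder_config.use_cache = False          # no decoder, so no cache
        encoder_config.is_encoder_decoder = False
        self.encoder = T5Stack(encoder_config, self.shared)

        self.init_weights()

    def forward(self, input_ids=None, attention_mask=None, inputs_embeds=None,
                head_mask=None, output_attentions=None, output_hidden_states=None,
                return_dict=None):
        # Returns a BaseModelOutput (per the review above), not a Seq2SeqModelOutput.
        return self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            inputs_embeds=inputs_embeds,
            head_mask=head_mask,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )
```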
What does this PR do?
While using T5 for feature extraction, I found out that the T5 encoder provides better features than the T5 decoder. Hence, it makes sense to have an encoder-only T5 model, which should cut memory use and inference time roughly in half when feature extraction is needed rather than conditional generation.
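A minimal usage sketch of what this enables (t5-small stands in for any T5 checkpoint; T5EncoderModel is the class name the PR is merged under):

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
# Keeps the encoder weights only, roughly halving memory vs. the full T5Model.
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

features = outputs.last_hidden_state  # (batch, seq_len, d_model) token features
```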
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
T5: @patrickvonplaten
TensorFlow: @jplu