Add T5 Encoder for Feature Extraction #8717
Conversation
I like it!
I like this addition a lot!
We should also include this new model in the documentation here:
https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/t5.rst#t5model
and add it to the tests here:
transformers/tests/test_modeling_t5.py, line 474 (at 2c83b3c):
all_model_classes = (T5Model, T5ForConditionalGeneration) if is_torch_available() else ()
=> There might be some issues with the tests after adding the model... I'm happy to go into your PR and fix them accordingly :-)
And we can then also add the model to mT5 :-)
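For illustration, here is a sketch of how that tuple might look with the new class added (T5EncoderModel is the name the PR ends up merging under; earlier commits call it T5ModelEncoder):

```python
# Sketch of the updated fragment of tests/test_modeling_t5.py;
# T5EncoderModel is the encoder-only class this PR introduces.
all_model_classes = (
    (T5Model, T5ForConditionalGeneration, T5EncoderModel)
    if is_torch_available()
    else ()
)
```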
Thank you very much for this addition!!
@patrickvonplaten is the T5 master here, so I'll let him review from the modeling point of view. For TF, I think we should wait for the PR about the new inputs to be merged. Also, a tiny comment.
self,
inputs,
attention_mask=None,
encoder_outputs=None,
past_key_values=None,
head_mask=None,
inputs_embeds=None,
use_cache=None,
output_attentions=None,
output_hidden_states=None,
return_dict=None,
training=False,
**kwargs,
I think we can remove the encoder_outputs, past_key_values and use_cache parameters from the list to avoid confusion. Wdyt @patrickvonplaten?
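For context, a sketch of the trimmed TF call signature once those three decoder-only arguments are dropped (body elided):

```python
# Encoder-only call signature after removing encoder_outputs,
# past_key_values, and use_cache (sketch; body elided).
def call(
    self,
    inputs,
    attention_mask=None,
    head_mask=None,
    inputs_embeds=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
    training=False,
    **kwargs,
):
    ...
```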
change output from Seq2SeqModelOutput to BaseModelOutput
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
1. remove encoder_outputs from the function call. 2. remove the encoder_outputs if statement. 3. remove isinstance from return_dict.
remove past_key_values and use_cache
remove use_cache from the forward method
add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
Great, I am glad that you liked it. Thanks @patrickvonplaten and @jplu for your feedback. Is there anything else needed from my side to merge the pull request?
I think that's great! I'll fix the tests and merge :-)
@@ -554,7 +554,7 @@ class RagDPRT5Test(RagTestMixin, unittest.TestCase):
    def config_and_inputs(self):
        question_encoder_tester = DPRModelTester(self)
        dpr_config_and_inputs = question_encoder_tester.prepare_config_and_inputs()
        generator_tester = T5ModelTester(self, vocab_size=1100, n_positions=30)
n_positions does not exist anymore => it was useless, so remove it from the tests as well.
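The generator line above would then become, roughly:

```python
# n_positions dropped, as noted above
generator_tester = T5ModelTester(self, vocab_size=1100)
```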
Update: PR is ready for review IMO. Would be great if @LysandreJik, @jplu and @sgugger could take a look :-)
Thanks a lot @patrickvonplaten ^_^
Very clean implementation, thanks a lot @agemagician!
LGTM! Great work!!
config_and_inputs = self.model_tester.prepare_config_and_inputs()
self.model_tester.create_and_check_model(*config_and_inputs)

# is not able to be part of a pipeline
What does it mean, "not able to be part of a pipeline"? Is it because the test fails?
Since T5Encoder just outputs the encoded hidden states, it cannot really be used in combination with pipeline().
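To illustrate: the encoder-only model returns raw token-level hidden states, so any post-processing a pipeline would normally do has to be done by hand. A minimal sketch, assuming the merged class name T5EncoderModel and using masked mean-pooling as one arbitrary choice:

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer(["an example sentence to embed"], return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, d_model)

# Pool over non-padding tokens to get one vector per input sentence.
mask = inputs["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```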
You are welcome.
* Add T5 Encoder class for feature extraction
* fix T5 encoder add_start_docstrings indent
* update init with T5 encoder
* update init with TFT5ModelEncoder
* remove TFT5ModelEncoder
* change T5ModelEncoder order in init
* add T5ModelEncoder to transformers init
* clean T5ModelEncoder
* update init with TFT5ModelEncoder
* add TFModelEncoder for Tensorflow
* update init with TFT5ModelEncoder
* Update src/transformers/models/t5/modeling_t5.py: change output from Seq2SeqModelOutput to BaseModelOutput (Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>)
* remove encoder_outputs: 1. remove encoder_outputs from the function call. 2. remove the encoder_outputs if statement. 3. remove isinstance from return_dict.
* Authorize missing decoder keys
* remove unnecessary input parameters: remove past_key_values and use_cache
* remove use_cache from the forward method
* add docstring for T5 encoder with T5_ENCODER_INPUTS_DOCSTRING
* change return_dict to dot access
* add T5_ENCODER_INPUTS_DOCSTRING for TF T5
* change TFT5Encoder output type to BaseModelOutput
* remove unnecessary parameters for TFT5Encoder
* remove unnecessary if statement
* add import BaseModelOutput
* fix BaseModelOutput typo to TFBaseModelOutput
* update T5 doc with T5ModelEncoder
* add T5ModelEncoder to tests
* finish pytorch
* finish docs and mt5
* add mt5 to init
* fix init
* remove n_positions
* finish PR
* Update src/transformers/models/mt5/modeling_mt5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/t5/modeling_t5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/t5/modeling_tf_t5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* Update src/transformers/models/mt5/modeling_tf_mt5.py (Co-authored-by: Lysandre Debut <lysandre@huggingface.co>)
* make style

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
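Reading the commit list together, the resulting module is essentially a thin wrapper around the existing encoder stack. A simplified sketch of the idea (not the exact merged code; T5PreTrainedModel and T5Stack are the pre-existing internals it reuses):

```python
import copy
from torch import nn

class T5EncoderModel(T5PreTrainedModel):  # sketch; see modeling_t5.py for the real class
    def __init__(self, config):
        super().__init__(config)
        self.shared = nn.Embedding(config.vocab_size, config.d_model)

        encoder_config = copy.deepcopy(config)
        encoder_config.use_cache = False          # no decoder, so no cache
        encoder_config.is_encoder_decoder = False
        self.encoder = T5Stack(encoder_config, self.shared)

        self.init_weights()

    def forward(self, input_ids=None, attention_mask=None, inputs_embeds=None,
                head_mask=None, output_attentions=None, output_hidden_states=None,
                return_dict=None):
        # Returns a BaseModelOutput (per the review above), not a Seq2SeqModelOutput.
        return self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            inputs_embeds=inputs_embeds,
            head_mask=head_mask,
            output_attentions=output_attentions,
            output_hidden_states=output_hidden_states,
            return_dict=return_dict,
        )
```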
What does this PR do?
While using T5 for feature extraction, I found out that the T5 encoder provides better features than the T5 decoder. Hence, it makes sense to have an encoder-only T5 model, which should cut memory use and inference time roughly in half when feature extraction is needed rather than conditional generation.
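A minimal usage sketch of what this enables (t5-small stands in for any T5 checkpoint; T5EncoderModel is the class name the PR is merged under):

```python
import torch
from transformers import T5EncoderModel, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
# Keeps the encoder weights only, roughly halving memory vs. the full T5Model.
model = T5EncoderModel.from_pretrained("t5-small")

inputs = tokenizer("Studies have shown that owning a dog is good for you.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

features = outputs.last_hidden_state  # (batch, seq_len, d_model) token features
```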
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
T5: @patrickvonplaten
TensorFlow: @jplu