sync #11

spatil6 · 2020-12-16T07:43:52Z

What does this PR do?

Fixes # (issue)

Before submitting

* This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
* Did you read the contributor guideline, Pull Request section?
* Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
* Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
* Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

@lhoestq

…ok (#8616)

* Add pip install update to resolve import error

  Add pip install upgrade tensorflow-gpu to remove error below:

```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-094fadb93f3f> in <module>()
      1 import torch
----> 2 from transformers import AutoModel, AutoTokenizer, BertTokenizer
      3
      4 torch.set_grad_enabled(False)

4 frames
/usr/local/lib/python3.6/dist-packages/transformers/__init__.py in <module>()
    133
    134 # Pipelines
--> 135 from .pipelines import (
    136     Conversation,
    137     ConversationalPipeline,

/usr/local/lib/python3.6/dist-packages/transformers/pipelines.py in <module>()
     46     import tensorflow as tf
     47
---> 48     from .modeling_tf_auto import (
     49         TF_MODEL_FOR_QUESTION_ANSWERING_MAPPING,
     50         TF_MODEL_FOR_SEQ_TO_SEQ_CAUSAL_LM_MAPPING,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_auto.py in <module>()
     49 from .configuration_utils import PretrainedConfig
     50 from .file_utils import add_start_docstrings
---> 51 from .modeling_tf_albert import (
     52     TFAlbertForMaskedLM,
     53     TFAlbertForMultipleChoice,

/usr/local/lib/python3.6/dist-packages/transformers/modeling_tf_albert.py in <module>()
     22 import tensorflow as tf
     23
---> 24 from .activations_tf import get_tf_activation
     25 from .configuration_albert import AlbertConfig
     26 from .file_utils import (

/usr/local/lib/python3.6/dist-packages/transformers/activations_tf.py in <module>()
     52     "gelu": tf.keras.layers.Activation(gelu),
     53     "relu": tf.keras.activations.relu,
---> 54     "swish": tf.keras.activations.swish,
     55     "silu": tf.keras.activations.swish,
     56     "gelu_new": tf.keras.layers.Activation(gelu_new),

AttributeError: module 'tensorflow_core.python.keras.api._v2.keras.activations' has no attribute 'swish'
```

  I have tried running the colab after this change and it seems to work fine (all the cells run with no errors).

* Update notebooks/02-transformers.ipynb

  only need to upgrade tensorflow, not tensorflow-gpu.

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

@sgugger

* Make ci fail
* Try to make tests actually run?
* CI finally failing?
* Fix CI
* Revert "Fix CI"
  This reverts commit ca7923b.
* Ooops wrong one
* one more try
* Ok ok let's move this elsewhere
* Alternative to globals() (#8667)
* Alternative to globals()
* Error is raised later so return None
* Sentencepiece not installed make some tokenizers None
* Apply Lysandre wisdom
* Slightly clearer comment?

cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

@sgugger

* gpt2 and t5 parallel modeling * model_parallel utils update * adding missing model_parallel_utils Adds missing model_parallel_utils and reverses the changes to code in modeling_gpt2 and modeling_t5 * training_args reformat Reformatted training_args * style formatting Style formatting doc string length on training_args and model_parallel_utils * style changes make style && make quality for training_args and model_parallel_utils. * adding tests * minor change in trainer reverts loss calculation * Update training_args.py * Update training_args.py added back docstring language for adam_beta1 and adam_beta2 * Update trainer.py * Update src/transformers/trainer.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix style & rebase Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>

* Add early stopping patience and a minimum threshold the metric must improve by (to prevent early stopping) to the PyTorch trainer
* Add early stopping test
* Set patience counter to 0 if best metric not defined yet
* Make early stopping a callback. Add callback event for updating the best metric for early stopping callback to trigger on.
* Run make style
* make function name sensible
* Improve new argument docstring wording and hope that flakey CI test passes.
* Use on_evaluation callback instead of custom. Remove some debug printing
* Move early stopping arguments and state into early stopping callback
* Run make style
* Remove old code
* Fix docs formatting. make style went rogue on me.
* Remove copied attributes and fix variable
* Add assertions on training arguments instead of mutating them. Move comment out of public docs.
* Make separate test for early stopping callback. Add test of invalid arguments.
* Run make style... I remembered before CI this time!
* appease flake8
* Add EarlyStoppingCallback to callback docs
* Make docstring EarlyStoppingCallback match other callbacks.
* Fix typo in docs
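The patience/threshold behaviour these commits describe can be illustrated with a small standalone sketch. This is not the code from the PR: `PatienceTracker` and its parameter names are hypothetical, and it only assumes what the commit messages state (a patience count, a minimum improvement threshold, and a counter that stays at 0 while no best metric exists yet).

```python
from typing import Optional


class PatienceTracker:
    """Hypothetical helper: counts evaluations without sufficient improvement."""

    def __init__(self, patience: int = 1, threshold: float = 0.0, greater_is_better: bool = True):
        self.patience = patience
        self.threshold = threshold
        self.greater_is_better = greater_is_better
        self.best: Optional[float] = None
        self.counter = 0

    def update(self, metric: float) -> bool:
        """Record one evaluation result; return True when training should stop."""
        if self.best is None:
            # No best metric defined yet: record it and keep the counter at 0.
            self.best = metric
            self.counter = 0
            return False
        gain = metric - self.best if self.greater_is_better else self.best - metric
        if gain > self.threshold:
            # Improved by more than the threshold: new best, reset the counter.
            self.best = metric
            self.counter = 0
        else:
            self.counter += 1
        return self.counter >= self.patience


tracker = PatienceTracker(patience=2, threshold=0.01)
history = [0.70, 0.75, 0.755, 0.752, 0.80]
stops = [tracker.update(m) for m in history]
print(stops)
```

With `patience=2`, the stop signal first appears on the second consecutive evaluation whose gain over the best value is at or below the threshold. A real training loop would break at that point rather than keep evaluating.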

* Support BERT relative position embeddings * Fix typo in README.md * Address review comment * Fix failing tests * [tiny] Fix style_doc.py check by adding an empty line to configuration_bert.py * make fix copies * fix configs of electra and albert and fix longformer * remove copy statement from longformer * fix albert * fix electra * Add bert variants forward tests for various position embeddings * [tiny] Fix style for test_modeling_bert.py * improve docstring * [tiny] improve docstring and remove unnecessary dependency * [tiny] Remove unused import * re-add to ALBERT * make embeddings work for ALBERT * add test for albert Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

) * implement support for run-time dependency version checking * try not escaping ! * use findall that works on py36 * small tweaks * autoformatter worship * simplify * shorter names * add support for non-versioned checks * add deps * revert * tokenizers not required, check version only if installed * make a proper distutils cmd and add make target * tqdm must be checked before tokenizers * workaround the DistributionNotFound peculiar setup * handle the rest of packages in setup.py * fully sync setup.py's install_requires - to check them all * nit * make install_requires more readable * typo * Update setup.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * restyle * add types * simplify * simplify2 Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
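As a rough illustration of what run-time dependency version checking involves, here is a minimal standalone sketch. It is not the implementation from this commit: `check_requirement`, `parse_version`, and the injected `get_version` lookup are hypothetical names, the version parser is deliberately naive (leading dotted integers only), and it relies on `importlib.metadata`, which needs Python 3.8 or later.

```python
import operator
import re
from importlib.metadata import PackageNotFoundError, version

# Comparison operators a requirement string may use.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    "!=": operator.ne,
    ">=": operator.ge,
    ">": operator.gt,
}


def parse_version(text):
    """Naive parser: keep the leading dotted integers ('1.7.1' -> (1, 7, 1))."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"cannot parse version {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def check_requirement(requirement, get_version=version):
    """Raise ImportError if `requirement` (e.g. 'tqdm>=4.27') is not satisfied."""
    found = re.findall(r"^([^!=<>\s]+)(?:\s*([!=<>]{1,2})\s*(\S+))?$", requirement.strip())
    if not found:
        raise ValueError(f"cannot parse requirement {requirement!r}")
    name, op, wanted = found[0]
    try:
        installed = get_version(name)
    except PackageNotFoundError:
        raise ImportError(f"{name} is required but not installed") from None
    if not op:
        return  # non-versioned check: being installed is enough
    if op not in OPS:
        raise ValueError(f"unsupported operator {op!r}")
    have, want = parse_version(installed), parse_version(wanted)
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))  # pad so that 2.4 compares equal to 2.4.0
    want += (0,) * (width - len(want))
    if not OPS[op](have, want):
        raise ImportError(f"{requirement} is required, but found {name}=={installed}")


# Demo with a stand-in lookup so the result does not depend on this machine.
FAKE_INSTALLED = {"tqdm": "4.50.2", "tensorflow": "2.4.0"}


def fake_version(name):
    if name not in FAKE_INSTALLED:
        raise PackageNotFoundError(name)
    return FAKE_INSTALLED[name]


check_requirement("tqdm>=4.27", fake_version)  # passes silently
check_requirement("tqdm", fake_version)  # non-versioned check passes
try:
    check_requirement("tensorflow<2.4", fake_version)
except ImportError as err:
    print(err)
```

Passing the lookup function as a parameter keeps the check testable without touching the real environment. A production check would more likely delegate version comparison to a dedicated parser, since pre-release and local version tags are ignored here.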

* Apply on BERT and ALBERT
* Update TF Bart
* Add input processing to TF BART
* Add input processing for TF CTRL
* Add input processing to TF Distilbert
* Add input processing to TF DPR
* Add input processing to TF Electra
* Add input processing for TF Flaubert
* Add deprecated arguments
* Add input processing to TF XLM
* remove unused imports
* Add input processing to TF Funnel
* Add input processing to TF GPT2
* Add input processing to TF Longformer
* Add input processing to TF Lxmert
* Apply style
* Add input processing to TF Mobilebert
* Add input processing to TF GPT
* Add input processing to TF Roberta
* Add input processing to TF T5
* Add input processing to TF TransfoXL
* Apply style
* Rebase on master
* Bug fix
* Retry to bugfix
* Retry bug fix
* Fix wrong model name
* Try another fix
* Fix BART
* Fix input processing
* Apply style
* Put the deprecated warnings in the input processing function
* Remove the unused imports
* Raise an error when len(kwargs)>0
* test ModelOutput instead of TFBaseModelOutput
* Bug fix
* Address Patrick's comments
* Address Patrick's comments
* Address Sylvain's comments
* Add the new inputs in new Longformer models
* Update the template with the new input processing
* Remove useless assert
* Apply style
* Trigger CI

* Resize the biases in same time than the embeddings * Trigger CI * Biases are not reset anymore * Remove get_output_embeddings + better LM model detection in generation utils * Apply style * First test on BERT * Update docstring + new name * Apply the new resizing logic to all the models * fix tests * Apply style * Update the template * Fix naming * Fix naming * Apply style * Apply style * Remove unused import * Revert get_output_embeddings * Trigger CI * Update num parameters * Restore get_output_embeddings in TFPretrainedModel and add comments * Style * Add decoder resizing * Style * Fix tests * Separate bias and decoder resize * Fix tests * Fix tests * Apply style * Add bias resizing in MPNet * Trigger CI * Apply style

* add model parallelism to T5EncoderModel
  add model parallelism to T5EncoderModel
* remove decoder from T5EncoderModel parallelize
* update T5EncoderModel docs
* Extend T5ModelTest for T5EncoderModel
* fix T5Stack using range for get_device_map
* fix style

Co-authored-by: Ahmed Elnaggar <elnaggar@rostlab.informatik.tu-muenchen.de>

* trainer and finetune_trainer enhancements and fixes * add fallback default * move the fixing of incorrect keys back into finetune trainer * s/eval/val/ to match the split * trainer can now use a different prefix than eval_ for metrics * document new arg * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * use 'eval' as the default for metric_key_prefix * complete adjust var names + disambiguate * fix logger * add clarifying comment * add clarifying comment * style * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * Update src/transformers/trainer.py Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * complete removal of optional for metric_key_prefix * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
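The `metric_key_prefix` idea mentioned above (a prefix other than `eval_` for metric names, with `'eval'` as the default) can be sketched in a few lines. The helper below is hypothetical and only illustrates the renaming; it is not taken from `trainer.py`.

```python
def prefix_metrics(metrics, metric_key_prefix="eval"):
    """Return a copy of `metrics` whose keys all start with '<prefix>_'."""
    prefixed = {}
    for key, value in metrics.items():
        # Leave keys that already carry the prefix untouched.
        if not key.startswith(f"{metric_key_prefix}_"):
            key = f"{metric_key_prefix}_{key}"
        prefixed[key] = value
    return prefixed


default_names = prefix_metrics({"loss": 0.42, "bleu": 27.1})
val_names = prefix_metrics({"loss": 0.42, "val_bleu": 27.1}, metric_key_prefix="val")
print(default_names)
print(val_names)
```

Using a per-call prefix lets the same evaluation routine report, for example, validation and test metrics under distinct names.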

…9076) * Clarify impact of disable_tqdm on Jupyter Notebooks * Add weblink to argparse * Replace "dev set" with more common "validation set" in do_eval * Tweak prediction_loss_only * Tweak description of Adam hyperparameters * Add weblink to TensorBoard * Capitalise apex * Tweak local_rank description * Add weblink for wandb * Replace nlp with datasets * Tweak grammar in model_parallel * Capitalise apex * Update TensorFlow training args to match PyTorch ones * Fix style * Fix underscore in weblink Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix underscore in weblink Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix underscore in weblink Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix underscore in weblink Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Add obj to datasets.Dataset Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fix tests for TF 2.4
* Remove <2.4 limitation
* Add version condition
* Update tests/test_optimization_tf.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_optimization_tf.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update tests/test_optimization_tf.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* reorder file
* delete unnecessary function
* make style
* save intermediate
* fix attention masks
* correct tf bart past key values
* solve merge conflict bug
* correct tensor dims
* save intermediate tf
* change attn layer
* fix typo re-order past
* inputs_embeds
* make fix copies
* finish tests
* fix graph mode
* apply Lysandre's suggestions

* Add possibility to switch between APEX and AMP in Trainer
* Update src/transformers/training_args.py
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Address review comments
* Update src/transformers/training_args.py
  Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

@sgugger

* First commit: adding all files from tapas_v3 * Fix multiple bugs including soft dependency and new structure of the library * Improve testing by adding torch_device to inputs and adding dependency on scatter * Use Python 3 inheritance rather than Python 2 * First draft model cards of base sized models * Remove model cards as they are already on the hub * Fix multiple bugs with integration tests * All model integration tests pass * Remove print statement * Add test for convert_logits_to_predictions method of TapasTokenizer * Incorporate suggestions by Google authors * Fix remaining tests * Change position embeddings sizes to 512 instead of 1024 * Comment out positional embedding sizes * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES * Added more model names * Fix truncation when no max length is specified * Disable torchscript test * Make style & make quality * Quality * Address CI needs * Test the Masked LM model * Fix the masked LM model * Truncate when overflowing * More much needed docs improvements * Fix some URLs * Some more docs improvements * Test PyTorch scatter * Set to slow + minify * Calm flake8 down * First commit: adding all files from tapas_v3 * Fix multiple bugs including soft dependency and new structure of the library * Improve testing by adding torch_device to inputs and adding dependency on scatter * Use Python 3 inheritance rather than Python 2 * First draft model cards of base sized models * Remove model cards as they are already on the hub * Fix multiple bugs with integration tests * All model integration tests pass * Remove print statement * Add test for convert_logits_to_predictions method of TapasTokenizer * Incorporate suggestions by Google authors * Fix remaining tests * Change position embeddings sizes to 512 instead of 1024 * Comment out positional embedding sizes * Update PRETRAINED_VOCAB_FILES_MAP and PRETRAINED_POSITIONAL_EMBEDDINGS_SIZES * Added more model names * Fix truncation when no max length is 
specified * Disable torchscript test * Make style & make quality * Quality * Address CI needs * Test the Masked LM model * Fix the masked LM model * Truncate when overflowing * More much needed docs improvements * Fix some URLs * Some more docs improvements * Add add_pooling_layer argument to TapasModel Fix comments by @sgugger and @patrickvonplaten * Fix issue in docs + fix style and quality * Clean up conversion script and add task parameter to TapasConfig * Revert the task parameter of TapasConfig Some minor fixes * Improve conversion script and add test for absolute position embeddings * Improve conversion script and add test for absolute position embeddings * Fix bug with reset_position_index_per_cell arg of the conversion cli * Add notebooks to the examples directory and fix style and quality * Apply suggestions from code review * Move from `nielsr/` to `google/` namespace * Apply Sylvain's comments Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Rogge Niels <niels.rogge@howest.be> Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: sgugger <sylvain.gugger@gmail.com>

bryant1410 and others added 30 commits November 21, 2020 22:58

Fix many typos (#8708)

e1f3156

Create README.md (#8630)

b6d864e

* Create README.md
* correct metrics id

cc @lhoestq

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

added bangla-bert-sentiment model card (#8687)

b5187e3

create README.md (#8682)

52585e4

* create README.md
* Apply suggestions from code review

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

[model_cards] Add card for gpt2-rnm (#8673)

48cc224

Fix bug in x-attentions output for roberta and harden test to catch it (#8660)

18c8cf0

[model_cards]: control input examples of Geotrend models (#8727)

eec7661

* [model_cards]: control arabic model examples
* [model_cards]: control input examples of Geotrend models
* [model_cards]: add link to generation script

Change default cache path (#8734)

9000242

* Change default cache path
* Document changes
* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

[trainer] make generate work with multigpu (#8716)

1e45bef

* make generate work with multigpu
* better fix - thanks @sgugger

Document new training argument

49759c0

consistent ignore keys + make private (#8737)

e84786a

* consistent ignore keys + make private
* style
* - authorized_missing_keys => _keys_to_ignore_on_load_missing
  - authorized_unexpected_keys => _keys_to_ignore_on_load_unexpected
* move public doc of private attributes to private comment

Fix max length in run_plm script (#8738)

367f497

Update TF BERT test

e1b7e10

TF BERT test update

7f2c009

Model parallel documentation (#8741)

02f48b9

* Add parallelize methods to the .rst files
* Correct format

[EsperBERTo] Fix URLs to assets

9e71aa2

Fix slow tests v2 (#8746)

6fdd0bb

* Fix BART test
* Fix MBART tests
* Remove erroneous line from yaml
* Update tests/test_modeling_bart.py
* Quality

MT5 should have an autotokenizer (#8743)

e09e54f

* MT5 should have an autotokenizer
* Different configurations should be able to point to same tokenizers

added instructions for syncing upstream master with forked master via PR (#8745)

8d4ed7e

* added instructions for syncing upstream master with forked master via PR
* expand to add a note to why this is requested

Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>

fix rag index names in eval_rag.py example (#8730)

a7d73cf

Create README.md (#8761)

90d5ab3

Big model table (#8774)

4821ea5

* First draft
* Styling
* With all changes staged
* Update docs/source/index.rst
  Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Styling

Co-authored-by: Julien Chaumond <chaumond@gmail.com>

Fix QA argument handler (#8765)

138f45c

* Fix QA argument handler
* Attempt to get a better fix for QA (#8768)

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>

jplu and others added 29 commits December 13, 2020 23:05

Patch *ForCausalLM model (#9092)

6587cf9

[RAG, Bart] Align RAG, Bart cache with T5 and other models of transformers (#9098)

fa1ddce

* fix rag
* fix slow test
* fix past in bart

correct var name in TrainingArguments docstring (#9096)

d6af344

Fixed a broken link in documentation (#9101)

74daf1f

Testing Experimental CI Features (#9070)

b00eb4f

Fix T5 and BART for TF (#9063)

df3f4d2

* Fix T5 for graph compilation+execution
* Fix BART
* Fix import
* Fix naming
* fix attribute name
* Oops
* fix import
* fix tests
* fix tests
* Update test
* Add missing import
* Address Patrick's comments
* Style
* Address Patrick's comment

Pin TF to < 2.4

e4ef57a

Also pin TF CPU

251eb70

fix a bug in eval_batch_retrieval (#9089)

44c340f

native amp leak fix landed in 1.7.1 (#9115)

14c79c3

update README with good news that the leak fix has been applied to pytorch-1.7.1.

Fix stack overflow (#9114)

59da3f2

Fix T5 model parallel tes (#9107)

6ccea04

k

Added TF OpenAi GPT1 Sequence Classification (#9105)

389aba3

* TF OpenAI GPT Sequence Classification
* Update src/transformers/models/openai/modeling_tf_openai.py
  Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

Fix typo in trainer_tf.py (#9132)

3caba8d

fix bart loss masking (#9131)

80bdb9c

correct mistake in order (#9134)

d018622

Fix Bart Shift (#9135)

18ecd36

* correct mistake in order
* fix tensor copy
* clone tensor correctly

Fix add order (#9129)

e771749

[Examples] Add automatic dataset splitting in language-modeling examples (#9133)

2a7e8e1

* replaced jnp.split + removing textual model inputs + ensuring warmup_steps > 0
* Add automatic dataset splitting in language-modeling examples

Add large model config (#9140)

0b2f46f

Fix fp16_backend field

51adb97

spatil6 merged commit 8ece073 into spatil6:tf_ctrl_seq_classification Dec 16, 2020

