update #1

mymusise · 2020-11-12T15:39:26Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

* better reports
* a whole bunch of reports in their own files
* clean up
* improvements
* github artifacts experiment
* style
* complete the report generator with multiple improvements/fixes
* fix
* save all reports under one dir to easy upload
* can remove temp failing tests
* doc fix
* some cleanup

* first attempt to add AzureML callbacks
* func arg fix
* var name fix, but still won't fix error...
* fixing as in https://discuss.huggingface.co/t/how-to-integrate-an-azuremlcallback-for-logging-in-azure/1713/2
* Avoid lint check of azureml import
* black compliance
* Make isort happy
* Fix point typo in docs
* Add AzureML to Callbacks docs
* Attempt to make sphinx happy
* Format callback docs
* Make documentation style happy
* Make docs compliant to style

Co-authored-by: Davide Fiocco <davide.fiocco@frontiersin.net>

* New run_clm script
* Formatting
* More comments
* Remove unused imports
* Apply suggestions from code review
  Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Address review comments
* Change link to the hub

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

…asePlus enhancements (#8107)

* move the helper code into testing_utils
* port test_trainer_distributed to work with pytest
* improve docs
* simplify notes
* doc
* doc
* style
* doc
* further improvements
* torch might not be available
* real fix
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [testing utils] get_auto_remove_tmp_dir default change

  Now that I have been using `get_auto_remove_tmp_dir` for a while, I realized that the defaults aren't optimal. 99% of the time we want the tmp dir to be empty at the beginning of the test, so this changes the default to `before=True`. This shouldn't impact any tests, since this feature is used only during debug.
* simplify things
* update docs
* fix doc layout
* style
* Update src/transformers/testing_utils.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* better 3-state doc
* style
* Apply suggestions from code review
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* s/tmp/temporary/ + style
* correct the statement

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
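
The effect of a `before=True` default can be pictured with a small standard-library sketch. The helper name `prepare_tmp_dir` and its signature are hypothetical and exist only for this illustration; this is not the code in `src/transformers/testing_utils.py`.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_tmp_dir(path, before=True):
    """Hypothetical helper: return `path` as a directory, emptied first if `before` is True.

    Illustration only; not the implementation in transformers' testing_utils.
    """
    path = Path(path)
    if before and path.exists():
        # Start from a clean slate so leftovers from an earlier run cannot leak in.
        shutil.rmtree(path)
    path.mkdir(parents=True, exist_ok=True)
    return path


# Simulate a leftover file from a previous debug run.
base = Path(tempfile.mkdtemp())
work = base / "debug_dir"
work.mkdir()
(work / "stale.txt").write_text("old output")

# before=False keeps whatever was already there; before=True wipes it.
kept = sorted(p.name for p in prepare_tmp_dir(work, before=False).iterdir())
emptied = sorted(p.name for p in prepare_tmp_dir(work, before=True).iterdir())

shutil.rmtree(base)  # clean up the scratch area used by this sketch
```

With the old behaviour a test could silently pick up `stale.txt`; with the directory emptied first, each run starts from the same state.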

* Create modeling_tf_dpr.py
* Add TFDPR
* Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot

  The last commit accidentally deleted these 4 lines, so I recovered them.
* Add TFDPR
* Add TFDPR
* clean up some comments, add TF input-style doc string
* Add TFDPR
* Make return_dict=False as default
* Fix return_dict bug (in .from_pretrained)
* Add get_input_embeddings()
* Create test_modeling_tf_dpr.py

  The current version already passes all 27 tests! Please see the test run at: https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing
* fix quality
* delete init weights
* run fix copies
* fix repo consis
* del config_class, load_tf_weights

  They should be 'pytorch only'
* add config_class back

  After removing it, the test failed, so in total only "use_tf_weights = None" is removed, on Lysandre's suggestion
* newline after .. note::
* import tf, np (Necessary for ModelIntegrationTest)
* slow_test from_pretrained with from_pt=True

  At the moment we don't have TF weights (since we don't have an official TF model). Previously, I did not run the slow test, so I missed this bug
* Add simple TFDPRModelIntegrationTest

  Note that this is just a test that TF and PyTorch give approximately the same output. However, I could not test with the official DPR repo's output yet
* upload correct tf model
* remove position_ids as missing keys

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: patrickvonplaten <patrick@huggingface.co>

@sgugger

* First addition of Flax/Jax documentation
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* make style
* Ensure input order matches between Bert & Roberta
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Install dependencies "all" when building doc
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* wraps build_doc deps with ""
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Addressing @sgugger comments.
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Use list to highlight JAX features.
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Make style.
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Let's not look too much into the future for now.
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Style

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

@Pierrci

mymusise and others added 30 commits October 27, 2020 07:29

fix doc bug (#8082)

985bba9

Signed-off-by: mymusise <mymusise1@gmail.com>

Fix comet_ml import and add ensure availability (#7933)

1496931

* Fix comet_ml import and add ensure availability
* Make isort happy
* Make flake8 happy
* Don't show comet_ml warn if COMET_MODE=DISABLED
* Make isort happy
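
As a rough sketch of the two ideas in this commit (checking that `comet_ml` is importable, and staying quiet when `COMET_MODE=DISABLED`), something like the following would do. Both function names are hypothetical, and the case-insensitive comparison is an assumption; the commit's actual diff is not shown on this page.

```python
import importlib.util
import os


def comet_installed():
    """Hypothetical availability check: True if the comet_ml package can be found."""
    # find_spec returns None when a top-level package is not installed.
    return importlib.util.find_spec("comet_ml") is not None


def comet_warning_needed(env=None):
    """Hypothetical check: should a 'comet_ml is not configured' warning be shown?"""
    env = os.environ if env is None else env
    # Respect an explicit opt-out; comparing case-insensitively is an assumption.
    return env.get("COMET_MODE", "").upper() != "DISABLED"


# A mapping stands in for os.environ so the sketch has no side effects.
quiet = comet_warning_needed({"COMET_MODE": "DISABLED"})
noisy = comet_warning_needed({})
```

Passing the environment in as a mapping keeps the check easy to exercise without mutating the real process environment.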

Doc styling fixes (#8074)

c42596b

* Fix a few docstrings * More fixes * Styling

Fix DeBERTa docs (#8092)

33f6ef7

* Fix DeBERTa docs * Tokenizer and config

Move style_doc to extra_quality_checks (#8081)

d93acd6

Fix IterableDataset with __len__ in Trainer (#8095)

286dc19

Styling fix

3220f21

Fix assertion error message for MLflowCallback (#8091)

8e28c32

Fix a bug for CallbackHandler.callback_list (#8052)

7bff0af

* Fix callback_list
* Add test
  Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
* Fix test
  Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>

[model_cards] Switch to a more explicit domain for the media bucket

55bc0c5

update/add setup targets (#8076)

edd3721

DEP: pinned sentencepiece to 0.1.91 in setup.py (#8069)

9fefdb0

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

infer entailment label id on zero shot pipeline (#8059)

3e58b6b

* add entailment dim argument
* rename dim -> id
* fix last name change, style
* rm arg, auto-infer only
* typo
* rm superfluous import
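
One plausible way to auto-infer the entailment label id, sketched here over a plain `label2id` dictionary. The function name, the prefix match on "entail", and the `-1` fallback are assumptions made for illustration, not a quotation of the pipeline's code.

```python
def infer_entailment_id(label2id):
    """Return the id of the first label whose name starts with 'entail', else -1.

    Hypothetical sketch; the -1 sentinel for "not found" is an assumption.
    """
    for label, idx in label2id.items():
        if label.lower().startswith("entail"):
            return idx
    return -1


# An NLI-style mapping where the entailment label can be recognised by name.
mnli_style = {"contradiction": 0, "neutral": 1, "ENTAILMENT": 2}
# A generic mapping where nothing identifies the entailment class.
generic = {"LABEL_0": 0, "LABEL_1": 1}

found = infer_entailment_id(mnli_style)
missing = infer_entailment_id(generic)
```

Inferring the id from the label names removes the need for a user-supplied argument, at the cost of depending on the model's labels being named meaningfully.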

Fully remove codecov (#8093)

a090606

Adjust setup so that all extras run on Windows (#8102)

c5f3149

rm multiclass option from model card

556709a

Move installation instructions to the top (#8106)

41cc5f3

Fix typo

b715e40

Remove header

1e01db3

[gh actions] run artifacts job always (#8110)

8065fea

fix(trainer_callback]: typo (#8121)

b4cacb7

[DOC] Improve pipeline() docstrings for config and tokenizer (#8123)

5193172

* Improve pipeline() docstrings * make style * Update wording for config

Document the various LM Auto models (#8118)

6241c87

Rename add_start_docstrings_to_callable (#8120)

378142a

Update CI cache (#8126)

1b6c8d4

Upgrade PyTorch Lightning to 1.0.2 (#7852)

5e24982

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

ShichaoSun and others added 28 commits November 10, 2020 09:32

[s2s/distill] hparams.tokenizer_name = hparams.teacher (#8382)

ae1cb4e

[examples] better PL version check (#8429)

5d4972e

Question template (#8440)

3213d3b

* Remove SO from question template * Styling

[docs] improve bart/marian/mBART/pegasus docs (#8421)

c314b1f

Add auto next sentence prediction (#8432)

8551a99

* Add auto next sentence prediction * Fix style * Add mobilebert next sentence prediction

Windows dev section in the contributing file (#8436)

e7e1549

* Add a Windows dev section in the contributing file.
* Forgotten link
* Trigger CI
* Rework description
* Trigger CI

Add missing import (#8444)

cace39a

* Add missing import * Fix dummy objects

fix t5 special tokens (#8435)

b935694

Add missing tasks to pipeline docstring (#8428)

8fe6629

[No merge] TF integration testing (#7621)

9fd1f56

* stash * TF Integration testing for ELECTRA, BERT, Longformer * Trigger slow tests * Apply suggestions from code review

fix t5 token type ids (#8437)

70708cc

Bug fix for modeling utilities function: apply_chunking_to_forward, c…

eb3bd73

…hunking should be in the chunking dimension, an exception was raised if the complete shape of the inputs was not the same rather than only the chunking dimension (#8391) Co-authored-by: pedro <pe25171@mit.edu>
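
The distinction this fix draws, comparing inputs only along the chunking dimension rather than over their complete shape, can be sketched with plain shape tuples. `check_chunk_dim` is a hypothetical helper written for this illustration; it is not the body of `apply_chunking_to_forward`.

```python
def check_chunk_dim(shapes, chunk_dim):
    """Hypothetical validation: inputs must agree in size along `chunk_dim` only.

    Returns the shared size, or raises ValueError if the inputs disagree there.
    """
    sizes = {shape[chunk_dim] for shape in shapes}
    if len(sizes) != 1:
        raise ValueError(
            f"inputs differ along chunking dimension {chunk_dim}: {sorted(sizes)}"
        )
    return sizes.pop()


# Two inputs whose last dimension differs but whose dimension 1 matches.
hidden = (2, 8, 64)
extra = (2, 8, 16)

# Comparing full shapes would reject this pair even though chunking along
# dimension 1 is well defined for both inputs.
full_shapes_equal = hidden == extra
shared = check_chunk_dim([hidden, extra], chunk_dim=1)

try:
    check_chunk_dim([hidden, extra], chunk_dim=2)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```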

[model_cards] harmonization

8dda916

Fix TF Longformer (#8460)

2329083

Add next sentence prediction loss computation (#8462)

da842e4

* Add next sentence prediction loss computation
* Apply style
* Fix tests
* Add forgotten import
* Add forgotten import
* Use a new parameter
* Remove kwargs and use positional arguments

Fix next sentence output (#8466)

069b638

Example NER script predicts on tokenized dataset (#8468)

a38d1c7

The new run_ner.py script tries to run prediction on the input test set `datasets["test"]`, but it should be the tokenized set `tokenized_datasets["test"]`

Replaced some iadd operations on lists with proper list methods. (#8433)
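
For readers unfamiliar with the distinction, here is a general Python illustration of `+=` on lists versus the explicit list methods. It is not taken from the commit's diff, which is not shown on this page.

```python
tokens = ["a"]
alias = tokens  # second name for the same list object

tokens += ["b"]       # in-place: same effect as tokens.extend(["b"])
tokens.extend(["c"])  # explicit form of the same operation
tokens.append("d")    # clearest way to add a single element

# `+=` accepts any iterable, so a string is added character by character,
# which is an easy mistake when a single item was intended.
chars = []
chars += "hi"

# Plain `+` builds a new list and leaves the original untouched.
rebuilt = alias + ["e"]
```

Because `+=` mutates the list in place, `alias` sees every change made through `tokens`; `append` and `extend` make that intent explicit.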

aa2a2c6

Skip test until investigation

c7b6bbe

[s2s] distill t5-large -> t5-small (#8376)

81ebd70

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>

Update deploy-docs dependencies on CI to enable Flax (#8475)

121c24e

* Update deploy-docs dependencies on CI to enable Flax
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
* Added pair of ""
  Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

[model_cards] other chars than [\w\-_] not allowed anymore in model n…

c6c08eb

…ames cc @Pierrci
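
A check against the character class named in this commit title might look like the sketch below. Only the class `[\w\-_]` comes from the title; the function name, the full-match anchoring, and the ASCII-only flag are assumptions.

```python
import re

# Character class copied from the commit title. re.ASCII restricts \w to
# [a-zA-Z0-9_]; whether the real check is ASCII-only is an assumption.
_MODEL_NAME_RE = re.compile(r"[\w\-_]+", re.ASCII)


def is_allowed_model_name(name):
    """Hypothetical validator: True if every character of `name` is in [\\w\\-_]."""
    return _MODEL_NAME_RE.fullmatch(name) is not None


ok = is_allowed_model_name("roberta-base-squad2-v2")
has_space = is_allowed_model_name("my model")
has_dot = is_allowed_model_name("bert.base")
```

Using `fullmatch` rather than `search` matters here: `search` would accept any name that merely contains one allowed character.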

Fix typo in roberta-base-squad2-v2 model card (#8489)

17b1fd8

quick fix on concatenating text to support more datasets (#8474)

924c624

mymusise merged commit ad6030f into mymusise:master Nov 12, 2020

mymusise pushed a commit that referenced this pull request Jan 4, 2021

Fix typo huggingface#9012 (#1) (huggingface#9038)

91ab02a

There is a tiny typo in the code "transformers/examples/language-modeling/run_mlm_wwm.py" at line 284. [Details.](huggingface#9012)
