specify 'truncation' to avoid transformers warning #5120

epwalsh · 2021-04-13T17:10:23Z

Avoids this warning message: #5119 (reply in thread)

epwalsh · 2021-04-13T17:12:00Z

@nelson-liu GitHub won't let me assign you as a reviewer, but I would appreciate a review if you have time.

nelson-liu · 2021-04-13T17:17:02Z

LGTM, thanks! worth adding a quick test to ensure that this is explicit?

epwalsh · 2021-04-13T18:29:27Z

worth adding a quick test to ensure that this is explicit?

We do have a test in place to make sure the output is truncated. I also tried capturing logs and stderr from HuggingFace in that test using pytest's capsys and caplog fixtures, but neither of those worked for some reason.

epwalsh · 2021-04-13T18:32:01Z

When we removed the truncation parameter (55cfb47), we had transformers pinned to 3.x. Now we have it pinned to 4.x, so I think this is okay.

dirkgr

LGTM, if you sign off on 9555b2c.

I don't understand how we can be setting stride when we return only one flat list of tokens. How is that supposed to work?

epwalsh · 2021-04-13T21:25:16Z

I don't understand how we can be setting stride when we return only one flat list of tokens. How is that supposed to work?

Yea, I don't know why we have that. I'm removing it.

* specify 'truncation' to avoid transformers warning * Update docs * Remove `stride` param * Update CHANGELOG.md Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>

@matt-gardner

* Formatting * New activation functions * Makes position embeddings optional in the transformer embeddings * Adds T5 * Various fixes to make this start up * Share weights * Adds one test that passes, and one test that fails * use min_value_of_dtype in apply_mask * fixes, add beam search * encoder fixes * fix * fix beam search * fix tests * rename to just 'T5' * fix initialization from pretrained * add Model, DatasetReader, and Predictor * remove useless dataset reader * move high-level peices to allennlp-models * revert predictor changes * remove unneeded hidden_size * remove stray comment * bool masks * CHANGELOG * fix test file name * revert other change * revert other change * Distributed training with gradient accumulation (#5100) * Fixes distributed training with gradient accumulation * Fix in case we don't do anything in a batch group * Test for the problematic condition * Formatting * More formatting * Changelog * Fix another test * Fix even more tests * Fixes one more test * I can fix these tests all day. * Add link to gallery and demo in README (#5103) * Add link to gallery in README * Update README.md * try emojis Is this overkill? * Adding a metadata field to the basic classifier (#5104) * Adding metadata parameter to BasicClassifier * Fix * Updating the changelog * reformatting * updating parameter type * fixing import Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * additional W&B params (#5114) * additional W&B params * add wandb_kwargs * fix * fix docs * Add eval_mode argument to pretrained transformer embedder (#5111) * Add eval_mode argument to pretrained transformer embedder * Edit changelog entry * Lint * Update allennlp/modules/token_embedders/pretrained_transformer_embedder.py * Apply suggestions from code review Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> Co-authored-by: Evan Pete Walsh <petew@allenai.org> * specify 'truncation' to avoid transformers warning (#5120) * specify 'truncation' to avoid transformers warning * Update docs * Remove `stride` param * Update CHANGELOG.md Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * Predicting with a dataset reader on a multitask model (#5115) * Create a way to use allennlp predict with a dataset and a multitask model * Fix type ignoration * Changelog * Fix to the predictor * fix bug with interleaving dataset reader (#5122) * fix bug with interleaving dataset reader * more tests * Update allennlp/data/dataset_readers/interleaving_dataset_reader.py * Update allennlp/data/dataset_readers/interleaving_dataset_reader.py * remove jsonpickle from dependencies (#5121) Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * Update docstring for basic_classifier (#5124) * improve error message from Registrable class (#5125) Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> * Prepare for release v2.3.0 * fix docs CI * Take the number of runs in the test for distributed metrics (#5127) * Take the number of runs in the test for distributed metrics * Changelog * Add influence functions to interpret module (#4988) * creating a new functionality to fields and instances to support outputing instnaces to json files * creating tests for the new functionality * fixing docs * Delete __init__.py * Delete influence_interpreter.py * Delete use_if.py * Delete simple_influence_test.py * fixing docs * finishing up SimpleInfluence * passing lint * passing format * making small progress in coding * Delete fast_influence.py Submit to the wrong branch * Delete faiss_utils.py wrong branch * Delete gpt2_bug.py not sure why it's included * Delete text_class.py not sure why it's included * adding test file * adding testing files * deleted unwanted files * deleted unwanted files and rearrange test files * small bug * adjust function call to save instance in json * Update allennlp/interpret/influence_interpreters/influence_interpreter.py Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Update allennlp/interpret/influence_interpreters/influence_interpreter.py Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Update allennlp/interpret/influence_interpreters/influence_interpreter.py Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * move some documentation of parameters to base class * delete one comment * delete one deprecated abstract method * changing interface * formatting * formatting err * passing mypy * passing mypy * passing mypy * passing mypy * passing integration test * passing integration test * adding a new option to the do-all function * modifying the callable function to the interface * update API, fixes * doc fixes * add `from_path` and `from_archive` methods * fix docs, improve logging * add test * address @matt-gardner's comments * fixes to documentation * update docs Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> Co-authored-by: Evan Pete Walsh <petew@allenai.org> * Update CONTRIBUTING.md (#5133) * Update CONTRIBUTING.md * updated changelog Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-0-106.us-west-2.compute.internal> * fix #5132 (#5134) * fix * Prepare for release v2.3.1 * Fairness Metrics (#5093) * Added three definitions of fairness * Updated CHANGELOG * Added DemographicParityWithoutGroundTruth and finished tests * finished refactoring Independence, Separation, and Sufficiency to accumulate * added distributed functionality to Independence, Sufficiency, and Separation * Finished aggregate and distributed functionality for DemographicParityWithoutGroundTruth * fixed GPU and doc issues * fixed GPU and doc issues * fixed GPU and doc issues * fixed GPU issues * fixed GPU issues * added init file * fixed typo * minor docstring changes * minor changes to docstring * Added simple explanations of fairness metrics to docstrings * Further vectorized all metric implementations * Fixed device issue Co-authored-by: Arjun Subramonian <arjuns@Arjuns-MacBook-Pro.local> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * fix cached_path for hub downloads (#5141) * fix cached_path for hub downloads * fix test name * fix type hint * Update allennlp/common/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix * fix Co-authored-by: epwalsh <epwalsh10@gmail.com> Co-authored-by: Evan Pete Walsh <petew@allenai.org> Co-authored-by: Jacob Morrison <jacob1morrison@gmail.com> Co-authored-by: Nelson Liu <nelson-liu@users.noreply.github.com> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Leo Liu <zeyuliu2@uw.edu> Co-authored-by: ArjunSubramonian <arjun.subramonian@gmail.com> Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-0-106.us-west-2.compute.internal> Co-authored-by: Arjun Subramonian <arjuns@Arjuns-MacBook-Pro.local> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* specify 'truncation' to avoid transformers warning * Update docs * Remove `stride` param * Update CHANGELOG.md Co-authored-by: Dirk Groeneveld <dirkg@allenai.org>

@matt-gardner

* Formatting * New activation functions * Makes position embeddings optional in the transformer embeddings * Adds T5 * Various fixes to make this start up * Share weights * Adds one test that passes, and one test that fails * use min_value_of_dtype in apply_mask * fixes, add beam search * encoder fixes * fix * fix beam search * fix tests * rename to just 'T5' * fix initialization from pretrained * add Model, DatasetReader, and Predictor * remove useless dataset reader * move high-level peices to allennlp-models * revert predictor changes * remove unneeded hidden_size * remove stray comment * bool masks * CHANGELOG * fix test file name * revert other change * revert other change * Distributed training with gradient accumulation (#5100) * Fixes distributed training with gradient accumulation * Fix in case we don't do anything in a batch group * Test for the problematic condition * Formatting * More formatting * Changelog * Fix another test * Fix even more tests * Fixes one more test * I can fix these tests all day. * Add link to gallery and demo in README (#5103) * Add link to gallery in README * Update README.md * try emojis Is this overkill? * Adding a metadata field to the basic classifier (#5104) * Adding metadata parameter to BasicClassifier * Fix * Updating the changelog * reformatting * updating parameter type * fixing import Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * additional W&B params (#5114) * additional W&B params * add wandb_kwargs * fix * fix docs * Add eval_mode argument to pretrained transformer embedder (#5111) * Add eval_mode argument to pretrained transformer embedder * Edit changelog entry * Lint * Update allennlp/modules/token_embedders/pretrained_transformer_embedder.py * Apply suggestions from code review Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> Co-authored-by: Evan Pete Walsh <petew@allenai.org> * specify 'truncation' to avoid transformers warning (#5120) * specify 'truncation' to avoid transformers warning * Update docs * Remove `stride` param * Update CHANGELOG.md Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * Predicting with a dataset reader on a multitask model (#5115) * Create a way to use allennlp predict with a dataset and a multitask model * Fix type ignoration * Changelog * Fix to the predictor * fix bug with interleaving dataset reader (#5122) * fix bug with interleaving dataset reader * more tests * Update allennlp/data/dataset_readers/interleaving_dataset_reader.py * Update allennlp/data/dataset_readers/interleaving_dataset_reader.py * remove jsonpickle from dependencies (#5121) Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * Update docstring for basic_classifier (#5124) * improve error message from Registrable class (#5125) Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> * Prepare for release v2.3.0 * fix docs CI * Take the number of runs in the test for distributed metrics (#5127) * Take the number of runs in the test for distributed metrics * Changelog * Add influence functions to interpret module (#4988) * creating a new functionality to fields and instances to support outputing instnaces to json files * creating tests for the new functionality * fixing docs * Delete __init__.py * Delete influence_interpreter.py * Delete use_if.py * Delete simple_influence_test.py * fixing docs * finishing up SimpleInfluence * passing lint * passing format * making small progress in coding * Delete fast_influence.py Submit to the wrong branch * Delete faiss_utils.py wrong branch * Delete gpt2_bug.py not sure why it's included * Delete text_class.py not sure why it's included * adding test file * adding testing files * deleted unwanted files * deleted unwanted files and rearrange test files * small bug * adjust function call to save instance in json * Update allennlp/interpret/influence_interpreters/influence_interpreter.py Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Update allennlp/interpret/influence_interpreters/influence_interpreter.py Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * Update allennlp/interpret/influence_interpreters/influence_interpreter.py Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> * move some documentation of parameters to base class * delete one comment * delete one deprecated abstract method * changing interface * formatting * formatting err * passing mypy * passing mypy * passing mypy * passing mypy * passing integration test * passing integration test * adding a new option to the do-all function * modifying the callable function to the interface * update API, fixes * doc fixes * add `from_path` and `from_archive` methods * fix docs, improve logging * add test * address @matt-gardner's comments * fixes to documentation * update docs Co-authored-by: Evan Pete Walsh <epwalsh10@gmail.com> Co-authored-by: Evan Pete Walsh <petew@allenai.org> * Update CONTRIBUTING.md (#5133) * Update CONTRIBUTING.md * updated changelog Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-0-106.us-west-2.compute.internal> * fix #5132 (#5134) * fix * Prepare for release v2.3.1 * Fairness Metrics (#5093) * Added three definitions of fairness * Updated CHANGELOG * Added DemographicParityWithoutGroundTruth and finished tests * finished refactoring Independence, Separation, and Sufficiency to accumulate * added distributed functionality to Independence, Sufficiency, and Separation * Finished aggregate and distributed functionality for DemographicParityWithoutGroundTruth * fixed GPU and doc issues * fixed GPU and doc issues * fixed GPU and doc issues * fixed GPU issues * fixed GPU issues * added init file * fixed typo * minor docstring changes * minor changes to docstring * Added simple explanations of fairness metrics to docstrings * Further vectorized all metric implementations * Fixed device issue Co-authored-by: Arjun Subramonian <arjuns@Arjuns-MacBook-Pro.local> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Dirk Groeneveld <dirkg@allenai.org> * fix cached_path for hub downloads (#5141) * fix cached_path for hub downloads * fix test name * fix type hint * Update allennlp/common/file_utils.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * fix * fix Co-authored-by: epwalsh <epwalsh10@gmail.com> Co-authored-by: Evan Pete Walsh <petew@allenai.org> Co-authored-by: Jacob Morrison <jacob1morrison@gmail.com> Co-authored-by: Nelson Liu <nelson-liu@users.noreply.github.com> Co-authored-by: Akshita Bhagia <akshita23bhagia@gmail.com> Co-authored-by: Leo Liu <zeyuliu2@uw.edu> Co-authored-by: ArjunSubramonian <arjun.subramonian@gmail.com> Co-authored-by: Arjun Subramonian <arjuns@ip-192-168-0-106.us-west-2.compute.internal> Co-authored-by: Arjun Subramonian <arjuns@Arjuns-MacBook-Pro.local> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

specify 'truncation' to avoid transformers warning

454ba3b

epwalsh requested a review from dirkgr April 13, 2021 17:11

dirkgr added 2 commits April 13, 2021 14:15

Update docs

9555b2c

Merge branch 'main' into truncate

b5bdcf1

dirkgr approved these changes Apr 13, 2021

View reviewed changes

epwalsh added 2 commits April 13, 2021 14:26

Remove stride param

a242a42

Update CHANGELOG.md

a98bd64

epwalsh merged commit b34df73 into main Apr 13, 2021

epwalsh deleted the truncate branch April 13, 2021 22:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

specify 'truncation' to avoid transformers warning #5120

specify 'truncation' to avoid transformers warning #5120

epwalsh commented Apr 13, 2021

epwalsh commented Apr 13, 2021

nelson-liu commented Apr 13, 2021

epwalsh commented Apr 13, 2021

epwalsh commented Apr 13, 2021

dirkgr left a comment

epwalsh commented Apr 13, 2021

specify 'truncation' to avoid transformers warning #5120

specify 'truncation' to avoid transformers warning #5120

Conversation

epwalsh commented Apr 13, 2021

epwalsh commented Apr 13, 2021

nelson-liu commented Apr 13, 2021

epwalsh commented Apr 13, 2021

epwalsh commented Apr 13, 2021

dirkgr left a comment

Choose a reason for hiding this comment

epwalsh commented Apr 13, 2021