Merge with latest code #1

vaibhavad · 2020-12-19T07:34:03Z

Patch description

Testing steps

Logs

Other information

Data tests (if applicable)
If you added a new teacher, you will be asked to run `python tests/datatests/test_new_tasks.py`. Please paste this log here.

# v0.9.3 Release

Known issues
- Short options like `-m` and `-t` fail in Python 3.8. Use `--model` and `--task` instead.

Breaking Changes
- A number of old MTurk tasks have been archived and removed from the code (#3085)

New Features
- [image] Detectron feature extraction (#3083)
- [data] Natural questions (#3070)
- [data] TaskMaster-2 (#2678)
- [data] New versions of multiwoz (#3072)
- [distributed] Allow non-tcp based distributed setup (#3095)
- [core] Move torch.load/torch.save to PathManager. (#3094, #3077)
- [mturk] New task on static turn annotations (#3053)
- [mturk] New features in human+model annotation (#3006)
- [core] TorchClassifierAgent now prints its number of parameters (#3086)

Doc Changes:
- New Worlds tutorial (#3049)
- Tutorial on using `-t jsonfile` (#3061)
- Better help message for --init-model (#3090)
- Additions to FAQ (#3073)
- Updated model zoo descriptions for BlenderBot (#3096)

Bug Fixes
- Distributed evaluation now writes to world logs earlier (#3122)
- An argument was updated from store_true to bool (#3113)
- Self-chat now fails loudly with unexpected batchsize (#3081)
- Update drqa default tokenizer away from removed (#3069)
- Using wizard of wikipedia in interactive mode downloads data (#3079)

Developer notes:
- New pre-commit git-secrets (#3106)
- Code coverage improvements (#3110, #3091)
- More reliable tests. (#3108, #3097, #3055)
- Mephisto task dependencies have been updated due to security bugs (#3111, #3101, #3104)
- MTurk config folders are exempt from __init__.py requirements (#3105)

* Start changes to example_script * Starting to revise blueprint * Port over blueprint script * Port over runner script * Work on example script * Finish porting over example script * Minor * Current time * Fixes * README * Start revising README * Fix README * Path fix * Jack's PR comments * Comments * Removing unused code * Assume block_on_onboarding_fail exists * Minor * test_init_everywhere fix

Note: I'm currently running a quick local training with jga as the metric to make sure it's functioning properly (cause trying `parlai eval_model -m bart -t taskmaster2` off the bat has 0s in both `slot_r` and `jga` right now), but figured the metric is easy enough to warrant getting review on it sooner rather than later.

* Add taskmaster2 command-line arg for single domain Before this change, `parlai dd -t taskmaster2 --display-verbose` displayed a bunch of sports conversations. After this change, running `parlai dd -t taskmaster2 --display-verbose --domains music` displays music conversations. Tried on a few other domains to validate; also had a print in the `_load_data()` function. Also verified that, with no argument, all domains were used, and that defining `domains` multiple times only used the last one. * Add taskmaster2 command-line arg for single domain Before this change, `parlai dd -t taskmaster2 --display-verbose` displayed a bunch of sports conversations. After this change, running `parlai dd -t taskmaster2 --display-verbose --domains music` displays music conversations. Also verified that, with no argument, all domains were used.
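The two behaviours verified above (no argument means all domains, and a repeated `--domains` keeps only the last value) are what a plain `argparse` option gives by default. The following is a minimal sketch, not the taskmaster2 teacher's actual code: `ALL_DOMAINS`, `selected_domains`, and the two-entry domain list are assumptions made for illustration.

```python
import argparse
from typing import List

# Hypothetical, illustrative subset of domain names; the real task has its own list.
ALL_DOMAINS = ["music", "sports"]


def selected_domains(argv: List[str]) -> List[str]:
    """Return the domains chosen on a command line (sketch only)."""
    parser = argparse.ArgumentParser()
    # The default "store" action overwrites earlier values, so a repeated
    # --domains flag keeps only the last occurrence.
    parser.add_argument("--domains", nargs="+", default=ALL_DOMAINS)
    return parser.parse_args(argv).domains
```

Calling `selected_domains([])` falls back to every domain, while `selected_domains(["--domains", "sports", "--domains", "music"])` yields only the music domain.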

…3145) * Add notion of metrics collections, which can have other Metrics of multiple metrics be added to it See #3138 for context and use * right, having different arguments for the same function aren't a thing in python... (alas, that's what I get for mostly coding in C++ for the past few years. :P) * fixed a bug while integrating into taskmaster2 * address comments (get rid of separate class, add func to Metrics directly) * actually do the things the last comment

* add a --version flag to the parlai command #3163 * run autoformatter to fix lint issue * correct weird typo * tweak based off PR feedback https://github.com/facebookresearch/ParlAI/pull/3164/files/ff0a5d1d16cdefd111b0723062ee77c165a88683#r500676864 * fix sloppy mixup of argument help descriptions introduced in last commit (sorry!)
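For readers unfamiliar with how such a flag is usually wired up, here is a minimal sketch using the standard library's `argparse` version action. It is not the code from this commit: the program name `mytool`, the constant `HYPOTHETICAL_VERSION`, and `build_parser` are all assumptions for illustration.

```python
import argparse

# Placeholder version string; a real tool would read this from its package metadata.
HYPOTHETICAL_VERSION = "0.0.0"


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --version flag prints the version and exits."""
    parser = argparse.ArgumentParser(prog="mytool")
    # action="version" prints the formatted string and exits with status 0.
    parser.add_argument(
        "--version",
        action="version",
        version=f"%(prog)s {HYPOTHETICAL_VERSION}",
    )
    return parser
```

Because the version action exits the process, callers that parse `["--version"]` should expect a `SystemExit` rather than a returned namespace.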

* Support special tokens in non-HF BPE dictionaries. * Lint. * Decode implementations. * Bug fixes. * Special tokens in hugging_face/gpt2. Reviewer comments. * Update URLs for reviewers. * Switch --hf-skip-special-tokens to --skip-special-tokens * Just kill the option. * Spelling. * Elaboration actually. * Lint. * Add a test for additional tokens with hugging_face/gpt2. * Add support for special tokens in re/split/space. * Add in a slightly harsher test. * Whoops.

* Add static turn annotations analysis script * First fixes * Parentheses bug * Dump unit test * Fixes * Hack together unit test * Path fixes * Fixes * Get test to pass * Lint * Lint * CI issues * Fix urllib3 more precisely * Try removing urllib again * Try known-good urllib3 version * Flexible requirements * Easier debugging * More easy debugging * Just remove requirements for right now * Even easier debugging * Float issues * Cleaner cleaner debugging * Don't use self.assertEqual at all * Sort dataframes * Fix broken calculation * Better attempt to compare dfs * Reset index * Add reqs back in * TODO for DataBrowser * Test tweak

* First shot at refactor. * Cover more ground. * Add gpu unittests. * Add markers for teacher tests. * Kill quicktests. Add teacher tests. * Crowdsourcing and mturk tests. * Fill out the rest. * Fix link checker. * Bigger. * Ordering. * Fix process and merge. * Checkpoint. * Check in regressions. * Bump requirements. * Lint. * Fix code test. * Update. * Fix local error. * No more light genderation, emily must fix. * Grr, twitter. * Cut down on the number of tests, speedup. * Drop the datatest. * Add documentation on regression tests. * Reviewer comments. * Also count num examples and num episodes. * Fix build. * Bump cache. * Fix CornellMovie num examples.

* Dump in what I have so far * Starting work on Meph tests * Minor * Remove samples * Work on static turn annotations unit test * Formatting file differently * Minor * Fixes * Minor * Don't check onboarding for now * Update convos * Fixes * Don't have test be mixin * Abstract away 1-turn tests * Fixes * More tests * Fixes to 3 tasks * Make tests cleaner * Remember to build the task * Test reversion to test * Revert config.yml * Update import

* Refactor existing crowdsourcing end-to-end tasks * Update import * Add new files * Pass in model config directly * Sample model config * Update RunScriptConfigs * Param tweaks * Clarify task directory var * Clarify var * Fix JSON * Various fixes * Dump Fast ACUTE test * Revisions * Work on unit test * Try to use data regressions * Finish prototype * Various fixes * Move * Fix file issue * Fix tests * Fix tests * Fix tests * Temp tweak * Fix turn annotations static test * Temp raise import errors * Call analysis * Check analysis inputs * Various fixes * Various fixes * Minor * Pytest fixtures * Fix fixture * Fix Mephisto version * Bump reqs again * Partial work on tests * Clean up fast acute tests * Add more tests * Add remaining tests * Comment out some functions for now * Don't yield in superclass * Try to make fast ACUTE code work * Fix tests * Temp test to understand why tests aren't working on CI * Revert "Temp test to understand why tests aren't working on CI" This reverts commit b51680f. * Temporarily block the Q-function runs * Modify fixture * Run setup/teardown once per function * Revert "Run setup/teardown once per function" This reverts commit 9732bd7. * Now just disable base fast acute * Revert "Now just disable base fast acute" This reverts commit 2a3500e. * Just give time for a worker to be registered * Waiting for longer before retrying * Is it about alphabetical order? * Another rename * Add back in setup/teardown for chat demo * Lint * Remove old ACUTE code * Revert temp crowdsourcing changes * Typo * Cleanup * Fix dir * More cleanup * Tweaks * Rename variant * Lint * PR changes * TODO for future flags * Tweak * More nuanced waiting * Get example scripts to work * Fix import * Add back in dependency * Defaults fix * Path tweak * Analysis tweaks * Move blueprints to their own file * Black * Convenience message * Don't remove old ACUTE-Eval in this PR * README notes

#3303) Bumps [ini](https://github.com/isaacs/ini) from 1.3.5 to 1.3.8. - [Release notes](https://github.com/isaacs/ini/releases) - [Commits](npm/ini@v1.3.5...v1.3.8) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Bumps [ini](https://github.com/isaacs/ini) from 1.3.5 to 1.3.8. - [Release notes](https://github.com/isaacs/ini/releases) - [Commits](npm/ini@v1.3.5...v1.3.8) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Pass back dot prod * Pass back attention matrices * Copy in more-outputs version * Add in encoder with more outputs * Layer in model code * Decoder output * Start passing back embedding output * Pass back embedding output from decoder * Just comment out unused code * Black * Fixes * BART tweaks * Fixes * Hack to ignore new projection matrices * Fix import * Note fix * Remove state_dict hack * Add in distillation agents * Prototype unit test * Partial work on unit test * Try to set up reproducible test * Fix last set of nums * Change test path * Embed inputs in a separate function * Revert * Minor * Add hooks * Better way of doing hooks * Starting to use hooks * Abstract away * More abstraction * More abstraction * Finish using hooks for attention matrices * Switch to hooks output * Fix test * Start to not pass stuff back * Decoder fixes * More cleanup * MHA compatibility * Fix decoder output * Generalize flag * Reversions for BART stuff * Just circumvent initializing from bart_large * Manipulate mask * Share teacher model * Type fixes and other fixes * New test * Partial test with note * Remove test * Clear hooks * Partial README * Finish README * Linting * Lint * Test fixes * Test fixes * Help unit test * Test tweak * Fix test * Note on clamping * Script for removing projection matrices * Compatibility with distributed mode * Minor * Clean up clamping code * super() fix * Comment * PR comments * PR comments * Split into 2 separate distillation tests * Fixed message teacher * Split out loss checking * Split tests * Switch to pytest regressions * PR comments * Remove staticmethod * Fixes * Lint * mypy fix * Fix seed * Fix seed?? * New test * fp32 mode * Cleanup * Lint

* Some initial work to the turn annotations task * Finishing blueprint * Onboarding works * Got checkboxes running * It works up through saving * Most of frontend pass flow works now * dropped file * Getting things working * TODO * Mephisto version * Start work to hook in AgentState * Minor cleanup * Add back in tag * Lots of cleanup * Fix self.agents * Pass in task_type * Partial run stats work * Print run stats * Work on saving data * Chat data folder, and cleanups * Fixes * Minor * Fix imports * Bump up version * Annotations fix * problem_data fix * Refactor existing crowdsourcing end-to-end tasks * Update import * Start to generalize chat test * More generalizing * More generalizing * Finish generalizing * Abstract away full test * Add in stub for turn-annotations test * Specify num agents * Adding in expected outputs * More moving stuff around * Load actual results * Work on setting up test * Finish prototype of test * Fixes so far * Match up outputs * Fix chat demo test * First work on removing old turn annotations code * Remove remaining turn annotations code * Minor fix * Autoformat * Add exception for test_init_everywhere * README updates * Channel issue * Fix Meph version * More robust key checking * Fixes * Add dependency back * Space * Pin to new Mephisto version * Test compatibility * Linting Co-authored-by: EricMichaelSmith <ems@fb.com> Co-authored-by: Eric Smith <EricMichaelSmith@users.noreply.github.com>

* Refactor existing crowdsourcing end-to-end tasks * Update import * Add new files * Pass in model config directly * Sample model config * Update RunScriptConfigs * Param tweaks * Clarify task directory var * Clarify var * Fix JSON * Various fixes * Dump Fast ACUTE test * Revisions * Work on unit test * Try to use data regressions * Finish prototype * Various fixes * Move * Fix file issue * Fix tests * Fix tests * Fix tests * Temp tweak * Fix turn annotations static test * Temp raise import errors * Call analysis * Check analysis inputs * Various fixes * Various fixes * Minor * Pytest fixtures * Fix fixture * Fix Mephisto version * Bump reqs again * Partial work on tests * Clean up fast acute tests * Add more tests * Add remaining tests * Comment out some functions for now * Don't yield in superclass * Try to make fast ACUTE code work * Fix tests * Temp test to understand why tests aren't working on CI * Revert "Temp test to understand why tests aren't working on CI" This reverts commit b51680f. * Temporarily block the Q-function runs * Modify fixture * Run setup/teardown once per function * Revert "Run setup/teardown once per function" This reverts commit 9732bd7. * Now just disable base fast acute * Revert "Now just disable base fast acute" This reverts commit 2a3500e. * Just give time for a worker to be registered * Waiting for longer before retrying * Is it about alphabetical order? * Another rename * Add back in setup/teardown for chat demo * Lint * Remove old ACUTE code * Revert temp crowdsourcing changes * Typo * Cleanup * Fix dir * More cleanup * Tweaks * Rename variant * Lint * PR changes * TODO for future flags * Tweak * More nuanced waiting * Get example scripts to work * Fix import * Add back in dependency * Defaults fix * Path tweak * Analysis tweaks * Move blueprints to their own file * Black * Convenience message * Don't remove old ACUTE-Eval in this PR * README notes * Analyze multiple runs * Better path * Fixes * Update flag * Check eval question

stephenroller and others added 30 commits September 24, 2020 11:05

Coverage for more scripts (#3110)

1c43b85

* Test profile_ scripts * Add test for distributed_eval * More scripts. * Lint. * Also check -t self_chat * Whoops, gotta de-init * Update docstring

Better help message for --init-model (#3090)

65a28fd

flag to bool (#3113)

328abc5

Co-authored-by: Diana Rico <dianaglzrico@learnfair0715.h2.fair>

swap to have partial results (#3122)

63eb7ae

AcceptabilityChecker doc (#3124)

0b1841a

[TGA] Return text for all beams (#3123)

d761350

Co-authored-by: Stephen Roller <roller@fb.com>

Convert TaskMaster-2 to PathManager. (#3129)

258ae90

Test parlai/core/script.py (#3117)

d68b314

* Test parlai/core/script.py. * Fix bad return code. * Reviewer comments.

Using the parlai color (#3137)

9cd6b6c

Flag to enable polarity categories (#3132)

f6cad6c

* Add categories * Help string * Add category mapper

Temp static-turn-annotations compatibility note (#3142)

0522f47

Add a temporary note to mention the compatibility issue with the static turn-annotations task until the refactor has been tested and merged in

Make sure folder is always created when unzipping COCO (#3143)

2c71d48

* Fix to make sure folder is always created * New solution

Add test for interactive_web (#3114)

23beb37

* Add test for interactive_web * Spinlock * Hm. * Lint.

Allow missing args when using --init-opt (#3112)

6b52b48

* Allow missing init opt opts * Add part of unit test * Work on unit test * Test fixes * Fix second test * Fix test * Check obsolete arg does not exist

Add gender bias agent (#3131)

7d76fd0

* Add agent code * Clean up * Cleanup * Linting * Linting * Fix name

Model card and project README (#3144)

0b0622d

* Dump in readmes * Update READMEs * Formatting * Add __init__ * Another __init__ file * Add to model_list * Fix name * Wording * Revert paren * Remove unused flag * Fix delimiter * Remove flag

Add in arXiv links (#3151)

fd1b8bb

* Add in arXiv links * Update README.md

Add MMB to ParlAI page (#3152)

53dc771

Add ED test to Self-chat (#3082)

f809112

* Add ED test to selfchat * Reviews Co-authored-by: Diana Rico <dianaglzrico@learnfair0721.h2.fair> Co-authored-by: Diana Rico <dianaglzrico@learnfair0715.h2.fair>

[GenderBias] Add Controllable Gender Bias Task (#3146)

dc7437a

* black * minor changes * black again * address comments, remove four class flag * update readme * black

Listing quests project (#3153)

4e30d4a

[LIGHT] New heading for docs about the light project (#3156)

51cf028

* Listing quests project * Moving rl paper to the correct heading * New heading for LIGHT quests * Didn't save the merge :(

Update requests version (#3155)

75e45f5

Co-authored-by: Diana Rico <dianaglzrico@devfair0263.h2.fair>

distributed_eval now mimics distributed_train (#3157)

8f2ecc3

stephenroller and others added 29 commits November 10, 2020 19:16

[bpe] Support BPE dropout (#3232)

2930a64

* Implement BPE dropout. * Only BPE dropout on text, not labels. * Add a unit test. * Notes for the future. * Dictionary save works for slow bytelevel bpe * Finish adding tests. * Reviewer comments. * Rip out unrelated change.
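As background for the commit above, BPE dropout randomly skips candidate merges during encoding so the same word can be segmented in several ways. The sketch below illustrates the general technique only; it is not ParlAI's implementation, and `bpe_encode`, `EXAMPLE_MERGES`, and the toy merge table are assumptions made for this example.

```python
import random
from typing import Dict, List, Optional, Tuple

# Toy merge table mapping a symbol pair to its priority (lower merges first).
EXAMPLE_MERGES: Dict[Tuple[str, str], int] = {("l", "o"): 0, ("lo", "w"): 1}


def bpe_encode(
    word: str,
    merges: Dict[Tuple[str, str], int],
    dropout: float = 0.0,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Greedy BPE encoding where each candidate merge is dropped with
    probability `dropout` at every step (sketch only)."""
    rng = rng or random.Random()
    symbols = list(word)
    while len(symbols) > 1:
        # Gather applicable merges, randomly discarding some of them.
        candidates = [
            (merges[pair], i)
            for i, pair in enumerate(zip(symbols, symbols[1:]))
            if pair in merges and not (dropout > 0 and rng.random() < dropout)
        ]
        if not candidates:
            break
        # Apply the highest-priority surviving merge (leftmost on ties).
        _, i = min(candidates)
        symbols[i : i + 2] = [symbols[i] + symbols[i + 1]]
    return symbols
```

With `dropout=0.0` the encoding is deterministic, and with `dropout=1.0` every merge is skipped and the word stays split into characters. In line with the commit notes, a caller would apply a nonzero dropout to input text only and encode labels with `dropout=0.0`.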

[Safety Recipes] Open source Sensitive Topics classifier and data (#3260)

48c1f83

* Revert "Revert "[Safety Recipes] Open source Sensitive Topics classifier and data (#3253)" (#3259)" This reverts commit 1b8bc8c. * fix build data

Fix broken CI checks (#3264)

4a90df7

* urllib3 and fairseq bumping * Update metrics.py

MDGender Open Source Pt. 2 (#3258)

0bd0fd3

* add yelp * Yelp WIP * add multitask classifier agent * multitask model + interactive world * fix yelp * fix model list naming

[minor] Have param sweep print out parlai_fb commit hash (#3271)

4bc84f0

Following up on feedback from fairinternal/ParlAI-Internal#1842 Test plan: ran locally, verified commit gets printed

[gpt2] Backwards compatibility with very old checkpoints. (#3261)

66c1ce8

[BART] Init model fix (#3272)

7bdc9e9

* init model * typo

Meph API change (#3275)

da77c39

Link to request Multi-Modal BlenderBot weights (#3281)

4fc0617

Link to Google Form to request time-limited MMB model weights

Fix argparse issues in python 3.8 (#3284)

2da250c

* Fix argparse issues in python 3.8 * lint

Updating old Mephisto import reference (#3285)

db7086c

[dist] Allow arbitrary sizes for object syncs. (#3291)

2b9d120

* [dist] Allow arbitrary sizes for object syncs. * Spelling

drian (#3301)

32e10b4

Remove package-lock files (#3306)

f2c15b0

* Remove package-lock files * .gitignore * Specific exclusion

Bump CI to fix versioning issues. (#3310)

21f4dd3

fix dialogpt dual usage of END_IDX (#3256)

a2adbe1

* fix dialogpt dual usage of endoftext * add null_idx = -1 * dialog bs test * Set null_idx in model and decoder, add to dialogpt test * small formats * accidental delete old test * reviewer comment

Bump to v0.10.0 (#3307)

e4f3127

* [release] Bump to v0.10.0.

Release distilled BlenderBot models (#3313)

79672f5

* Add build files * README work * New entries in model list * README * Adding sample responses * Reordering

Revise names/flags of distilled BlenderBot models (#3314)

9916067

* Revisions * Updating results

vaibhavad merged commit 106290b into vaibhavad:master Dec 19, 2020
