facebookresearch · stephenroller · Apr 27, 2021 · Apr 27, 2021
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -4,7 +4,7 @@ it is necessary. If your patch fixes an issue, please reference that issue here.
 
 **Testing steps**
 <!-- Enter steps to test your pull request. Give a clear and concise description of
-what you expected to happen during testing. Include any logs in ```backticks``` if you have them. 
+what you expected to happen during testing. Include any logs in ```backticks``` if you have them.
 Also make sure you have connected your account to CircleCI and those tests run successfully. -->
 
 **Other information**

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -8,10 +8,10 @@ desired to increase the pool of tasks, models, and baselines.
 ## Pull Requests
 We actively welcome your pull requests.
 
-1. Fork the repo and then clone the forked repository. (See this [github guide](https://guides.github.com/activities/forking/) on forking for more info).  
+1. Fork the repo and then clone the forked repository. (See this [github guide](https://guides.github.com/activities/forking/) on forking for more info).
    **If you have already cloned the repo directly and committed changes, follow the steps in the [section below](#moving-changes-youve-committed-to-a-fork)**
 2. Create your branch from `master`. Set up your environment
-   and run `pre-commit install` once. 
+   and run `pre-commit install` once.
 3. Make your changes
 4. If you've added code that should be tested, [add tests](http://parl.ai/docs/tutorial_tests.html).
 5. If you've changed APIs, update the documentation.
@@ -20,15 +20,15 @@ We actively welcome your pull requests.
 8. If you've added a new dataset, you should also run
    `python -m pytest -m data`. Copy-paste the output into a comment in your PR.
 9. If you haven't already, complete the Contributor License Agreement ("CLA").
-10. Link [CircleCI](https://circleci.com/vcs-authorize/) to your github account 
-   if you haven't done so previously (and make sure the CircleCI tests run 
+10. Link [CircleCI](https://circleci.com/vcs-authorize/) to your github account
+   if you haven't done so previously (and make sure the CircleCI tests run
    successfully on the PR after you push your changes).
 11. Push your changes!
 12. Once the PR is accepted and CI is passing, we will merge the PR for you.
 
 ### Moving changes you've committed to a fork
 1. Fork the repo
-2. In your local repo, rename your origin remote to upstream 
+2. In your local repo, rename your origin remote to upstream
    ```
    git remote rename origin upstream
    ```
@@ -40,7 +40,7 @@ We actively welcome your pull requests.
    ```
    git fetch origin
    ```
-5. Make your local branch track the remote branch (of the forked repo) 
+5. Make your local branch track the remote branch (of the forked repo)
    ```
    git branch --set-upstream-to origin/master master
    ```

diff --git a/docs/source/mutators.md b/docs/source/mutators.md
@@ -12,4 +12,3 @@ their output, as well as their options when available.
 
 ```{include} mutators_list.inc
 ```
-
diff --git a/docs/source/tutorial_basic.md b/docs/source/tutorial_basic.md
@@ -409,7 +409,7 @@ parlai interactive --model-file zoo:pretrained_transformers/model_poly/model --t
 parlai interactive --model projects:wizard_of_wikipedia:interactive_retrieval --task wizard_of_wikipedia
 ```
 
-To view additional fields from the model output, try use the flag `--display-add-fields`. For example, 
+To view additional fields from the model output, try use the flag `--display-add-fields`. For example,
 ```
 parlai interactive --model-file zoo:blender/blender_90M/model --task convai2 --display-add-fields beam_texts
 ```

diff --git a/docs/source/tutorial_crowdsourcing.md b/docs/source/tutorial_crowdsourcing.md
@@ -92,7 +92,7 @@ A few things to keep in mind:
     start returning `True` for the `episode_done()` function.
 2.  Make sure to test your dialog task using Mephisto's sandbox mode (enabled by default) before
     pushing it live. See the [crowdsourcing README](https://github.com/facebookresearch/ParlAI/tree/master/parlai/crowdsourcing#running-tasks-live) for running live tasks.
-    
+
 Advanced Task Techniques
 ------------------------
 
@@ -127,7 +127,7 @@ Follow the steps below:
 -   Mephisto's default MTurk functionality requires a free Heroku account,
     which can be obtained [here](https://signup.heroku.com/). Running
     any Mephisto MTurk operation will walk you through linking the two.
-    
+
 To run a crowdsourcing task, launch its run file (typically `run.py`) with the proper flags, using a command like the following:
 
 ```bash
@@ -207,7 +207,7 @@ Mephisto MTurk Tips and Tricks
     __quite__ getting it allows those workers to work on other tasks for
     you in the future. You can soft-block workers by calling [`Worker.grant_qualification()`](https://github.com/facebookresearch/Mephisto/blob/master/mephisto/data_model/qualification.py) for a certain `qualification_name`, which is typically set by the `mephisto.blueprint.block_qualification` parameter. That worker will then not be able to work on any
     tasks that use the same value for `mephisto.blueprint.block_qualification`.
-    
+
 ### Preventing and Handling Crashes
 
 -   The `max_num_concurrent_units` argument when initializing [`TaskLauncher`](https://github.com/facebookresearch/Mephisto/blob/master/mephisto/operations/task_launcher.py) controls how many people can work on your task at any given time: set this sufficiently low for your task. Leaving this too high might cause your Heroku server to run into issues depending on how many messages per second it's trying to
@@ -251,4 +251,3 @@ Additional Credits
 -   Turker icon credit: [Amazon Mechanical
     Turk](https://requester.mturk.com/).
 -   Robot icon credit: [Icons8](https://icons8.com/).
-
diff --git a/parlai/crowdsourcing/README.md b/parlai/crowdsourcing/README.md
@@ -7,13 +7,13 @@ Code for crowdsourcing tasks that use Mephisto. See the [Mephisto quick start gu
 ## Running tasks
 
 Tasks are launched by calling the appropriate run script: for instance, an ACUTE-Eval run can be launched with `python parlai/crowdsourcing/tasks/acute_eval/run.py`, followed by any appropriate flags. All run parameters are set using [Hydra](https://github.com/facebookresearch/hydra): append the flag `-c job` to your run command to see a list of all available parameters, grouped by their package name (`mephisto.blueprint`, `mephisto.task`, etc.), which determines how they are called. Each run script has a YAML file of default parameters that will be loaded, found in the `conf/` subfolder of each task.
- 
+
 ### Specifying your own YAML file
- 
+
  The easiest way to specify a different YAML file is to create a new file, say, `my_params.yaml`, in the `conf/` subfolder of the task. Then, you can launch HITs with `python ${TASK_FOLDER}/run.py conf=my_params`.
- 
+
  You also can specify a path to a YAML file existing *outside* of `${TASK_FOLDER}`: you will need to have your YAML file stored at a location `${CUSTOM_FOLDER}/conf/my_params.yaml`, and then you can add a `--config-dir ${CUSTOM_FOLDER}` string to the launch command above.
- 
+
 ### Setting parameters on the command line
 
 Suppose that your YAML file has a `task_reward` parameter defined as follows:
@@ -31,11 +31,11 @@ Here is a partial list of MTurk-specific parameters that can be set in YAML file
 - `mephisto.task.task_description`: Includes detailed information about the kind of task that the HIT contains. On the Amazon Mechanical Turk web site, the HIT description appears in the expanded view of search results, and in the HIT and assignment screens.
 - `mephisto.task.task_tags`: One or more words or phrases that describe the HIT, separated by commas. On MTurk website, these words are used in searches to find HITs.
 - `mturk.worker_blocklist_paths`: The path to a text file containing a list of IDs of MTurk workers to soft-block, separated by newlines. Multiple paths can be specified, delimited by commas (i.e. `path1,path2,path3`).
- 
+
 ### Running tasks live
 
 By default, HITs run locally in sandbox mode. To run live HITs, add `mephisto.provider.requester_name=${REQUESTER_NAME} mephisto/architect=heroku` to your launch command, where `${REQUESTER_NAME}` is the MTurk requester name that you specified when setting up Mephisto.
- 
+
 ## Saving data
 
 By default, Mephisto data is saved in the following directory:

diff --git a/parlai/crowdsourcing/tasks/acute_eval/README.md b/parlai/crowdsourcing/tasks/acute_eval/README.md
@@ -109,7 +109,7 @@ python parlai/crowdsourcing/tasks/acute_eval/analysis.py \
 --pairings-filepath ${PATH_TO_PAIRINGS_FILE} \
 --outdir ${OUTPUT_FOLDER}
 ```
-For analyzing results from a Fast ACUTE run (see below), use the `--root-dir` flag to specify the Fast ACUTE root directory (`mephisto.blueprint.root_dir`) instead of specifying the `--pairings-filepath` and `--outdir` flags. 
+For analyzing results from a Fast ACUTE run (see below), use the `--root-dir` flag to specify the Fast ACUTE root directory (`mephisto.blueprint.root_dir`) instead of specifying the `--pairings-filepath` and `--outdir` flags.
 
 The script will analyze the results and save files with information such as the win/loss rate and significance scores.
 

diff --git a/parlai/crowdsourcing/tasks/model_chat/README.md b/parlai/crowdsourcing/tasks/model_chat/README.md
@@ -28,7 +28,7 @@ In `worlds.py`, modify `ModelChatOnboardWorld.check_onboarding_answers()` to cha
 
 `run_image_chat.py` can be run to chat with a model about an image: each conversation will begin with a selected image, and then the human and model will chat about it.
 
-This code replaces the old `parlai/mturk/tasks/image_chat/` and `parlai/mturk/tasks/personality_captions/` tasks, which are deprecated and can be accessed with `git checkout v0.10.0`. Those tasks also featured the ability to compare two possible captions to an image and rate which one is more engaging: this functionality has now been replaced by the [ACUTE-Eval](https://github.com/facebookresearch/ParlAI/tree/master/parlai/crowdsourcing/tasks/acute_eval) task. 
+This code replaces the old `parlai/mturk/tasks/image_chat/` and `parlai/mturk/tasks/personality_captions/` tasks, which are deprecated and can be accessed with `git checkout v0.10.0`. Those tasks also featured the ability to compare two possible captions to an image and rate which one is more engaging: this functionality has now been replaced by the [ACUTE-Eval](https://github.com/facebookresearch/ParlAI/tree/master/parlai/crowdsourcing/tasks/acute_eval) task.
 
 ### Setup
 

diff --git a/parlai/crowdsourcing/tasks/turn_annotations_static/README.md b/parlai/crowdsourcing/tasks/turn_annotations_static/README.md
@@ -8,7 +8,7 @@ Two variants of the blueprint are supported:
 - `TurnAnnotationStaticInFlightQABlueprint`
     - Includes the ability to add an additional in-flight (i.e. mid-HIT) quality assurance check
     - Called with `python parlai/crowdsourcing/tasks/turn_annotations_static/run_in_flight_qa.py`
-    
+
 For both variants of the blueprint, it is required to pass in your own file of conversations with `mephisto.blueprint.data_jsonl=${PATH_TO_CONVERSATIONS}`.
 
 See `turn_annotations_blueprint.py` for various parameters of this task, including passing in custom annotation bucket definitions using the `annotation_buckets` YAML flag, being able to group multiple conversations into one HIT using the `subtasks_per_unit` flag, passing in onboarding data with answers, and being able to ask only for the final utterance as an annotation.

diff --git a/parlai/opt_presets/README.md b/parlai/opt_presets/README.md
@@ -1,7 +1,7 @@
 # Option Aliases
 
 This folder contains a set of "option aliases" that are automatically packaged
-and provided with ParlAI. They are used as shorthand for 
+and provided with ParlAI. They are used as shorthand for
 
 ## Adding option aliases
 

diff --git a/parlai/tasks/genderation_bias/README.md b/parlai/tasks/genderation_bias/README.md
@@ -44,4 +44,4 @@ that is a good show i watch that while drinking iced tea
 i agree . what do you do for a living ? f0m0
    i'm a researcher i'm researching the fact that mermaids are real
 16:33:19 | loaded 131438 episodes with a total of 131438 examples
-```
+```
diff --git a/parlai/tasks/huggingface/README.md b/parlai/tasks/huggingface/README.md
@@ -1,6 +1,5 @@
 Task: HuggingFace
 ===============
-Description: Can load HuggingFace datasets. 
+Description: Can load HuggingFace datasets.
 
 Website: https://huggingface.co/
-
diff --git a/parlai/tasks/sensitive_topics_evaluation/README.md b/parlai/tasks/sensitive_topics_evaluation/README.md
@@ -11,4 +11,3 @@ Description: Task for evaluating a classifier trained to classify the following
 Link: https://arxiv.org/abs/2010.07079
 
 Tags: #All
-
diff --git a/projects/anti_scaling/README.md b/projects/anti_scaling/README.md
@@ -18,7 +18,7 @@ When performing distillation, terms are added for losses on the encoder output,
 
 Distillation in the style of [Jiao, Xiaoqi, et al. "Tinybert: Distilling bert for natural language understanding." *arXiv preprint arXiv:1909.10351* (2019).](https://arxiv.org/abs/1909.10351)
 
-With TinyBERT-style distillation, the student model can have smaller hidden and FFN dimensions than the teacher model, and projection matrices will be used to measure losses such as those between the hidden-layer outputs. Unlike with DistilBERT-style distillation, the weights of the teacher model cannot be used to initialize the student model. 
+With TinyBERT-style distillation, the student model can have smaller hidden and FFN dimensions than the teacher model, and projection matrices will be used to measure losses such as those between the hidden-layer outputs. Unlike with DistilBERT-style distillation, the weights of the teacher model cannot be used to initialize the student model.
 
 In addition to the losses of DistilBERT-style distillation above, losses are also included on the embedding layer and on the per-layer query/key product matrices from encoder self-attention, decoder self-attention, and encoder/decoder attention. `DistillNarrowTransformerAgent` is used for distilling `transformer/generator` models, and `DistillNarrowBartAgent` is used for distilling `bart` models.
 

diff --git a/projects/contradiction/README.md b/projects/contradiction/README.md
@@ -1,11 +1,11 @@
 # *I like fish <span>&#x1F41F;</span>, especially dolphins <span>&#x1F42C;</span>:*<sup>[∗](#dolphion)</sup> Addressing Contradictions in Dialogue Modeling
 
-A study on *contradiction* detection and *non-contradiction* generation in dialogue modeling.  
+A study on *contradiction* detection and *non-contradiction* generation in dialogue modeling.
 The paper can be found here: [Nie et al. (2020)](https://arxiv.org/abs/2012.13391).
 
 ## Abstract
 
-To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the **D**ialogu**E** **CO**ntradiction **DE**tection task (**DECODE**) and a new conversational dataset containing both human-human and human-bot contradictory dialogues. We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach. 
+To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the **D**ialogu**E** **CO**ntradiction **DE**tection task (**DECODE**) and a new conversational dataset containing both human-human and human-bot contradictory dialogues. We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
 
 Results reveal that:
 <ol>
@@ -48,7 +48,7 @@ See [download data from s3 with raw format](https://github.com/facebookresearch/
 If you use the dataset or models in your own work, please cite with the following BibTex entry:
 ```
 @misc{nie2020i,
-      title={I like fish, especially dolphins: Addressing Contradictions in Dialogue Modelling}, 
+      title={I like fish, especially dolphins: Addressing Contradictions in Dialogue Modelling},
       author={Yixin Nie and Mary Williamson and Mohit Bansal and Douwe Kiela and Jason Weston},
       year={2020},
       eprint={2012.13391},

diff --git a/projects/contradiction/download_with_raw_format.md b/projects/contradiction/download_with_raw_format.md
@@ -1,7 +1,7 @@
 ## Directly Download Raw Data
 
-The dataset (**DECODE**) can be download in [this_link](https://sharenlpfile-01.s3.amazonaws.com/data/decode_v0.1.zip).  
-As described in the paper, **DECODE** includes 6 groups of dialogues: *Train*, *Dev*, *Test*, *Human-Bot*, *A2T*, *RCT*.  
+The dataset (**DECODE**) can be download in [this_link](https://sharenlpfile-01.s3.amazonaws.com/data/decode_v0.1.zip).
+As described in the paper, **DECODE** includes 6 groups of dialogues: *Train*, *Dev*, *Test*, *Human-Bot*, *A2T*, *RCT*.
 
 | Group Name    | Count         | Description  |
 | ------------- |---------------| -------------|
@@ -15,7 +15,7 @@ As described in the paper, **DECODE** includes 6 groups of dialogues: *Train*, *
 The details of each group can be found in the [Nie et al. (2020)](https://arxiv.org/abs/2012.13391).
 
 ### Format
-The format of the file is `JSONL`. Each line in the file is one dialogue example saved in a `JSON`.  
+The format of the file is `JSONL`. Each line in the file is one dialogue example saved in a `JSON`.
 Primary fields that are required for the contradiction detection task:
 - `record_id`: It is the unique ID for the example.
 - `turns`: The field contains a list of turns that presents a conversation between two speaker.
@@ -36,34 +36,34 @@ A example `JSON` is shown below:
     "record_id": "1f47fe86-cfc3-469a-bae3-506c81871bf5",
 
     "turns": [
-        {"turn_id": 0, "agent_id": 0, "text": "i've been to new york city once crazy place that city .", "turn_context": "", "auxiliary": {"contradiction": null}}, 
-        {"turn_id": 1, "agent_id": 1, "text": "i wish i could go there . i'm sure they have a place with great meatloaf !", "turn_context": "", "auxiliary": {"contradiction": null}}, 
-        {"turn_id": 2, "agent_id": 0, "text": "They probably do, somewhere! You can find nearly any cuisine there you want.", "turn_context": "", "auxiliary": {"contradiction": null}}, 
-        {"turn_id": 3, "agent_id": 1, "text": "I wonder if they have anything different, I wonder if anyone has tried to make meatloaf with tofu instead.", "turn_context": "", "auxiliary": {"contradiction": null}}, 
-        {"turn_id": 4, "agent_id": 0, "text": "I'm sure somebody has, though I am not sure how it would taste.", "turn_context": "", "auxiliary": {"contradiction": null}}, 
-        {"turn_id": 5, "agent_id": 1, "text": "I make tofu meatloaf all the time, it is delicious", "turn_context": "", "auxiliary": {"contradiction": true}}], 
+        {"turn_id": 0, "agent_id": 0, "text": "i've been to new york city once crazy place that city .", "turn_context": "", "auxiliary": {"contradiction": null}},
+        {"turn_id": 1, "agent_id": 1, "text": "i wish i could go there . i'm sure they have a place with great meatloaf !", "turn_context": "", "auxiliary": {"contradiction": null}},
+        {"turn_id": 2, "agent_id": 0, "text": "They probably do, somewhere! You can find nearly any cuisine there you want.", "turn_context": "", "auxiliary": {"contradiction": null}},
+        {"turn_id": 3, "agent_id": 1, "text": "I wonder if they have anything different, I wonder if anyone has tried to make meatloaf with tofu instead.", "turn_context": "", "auxiliary": {"contradiction": null}},
+        {"turn_id": 4, "agent_id": 0, "text": "I'm sure somebody has, though I am not sure how it would taste.", "turn_context": "", "auxiliary": {"contradiction": null}},
+        {"turn_id": 5, "agent_id": 1, "text": "I make tofu meatloaf all the time, it is delicious", "turn_context": "", "auxiliary": {"contradiction": true}}],
+
+    "is_contradiction": true,
+    "aggregated_contradiction_indices": [3, 5],
 
-    "is_contradiction": true, 
-    "aggregated_contradiction_indices": [3, 5], 
-
     # Other collection related field.
     "num_of_turns_by_writer": 2
-    "writer_contradiction_indices": [3, 5], 
+    "writer_contradiction_indices": [3, 5],
     "verifications": [
-        {"verification_id": "36236842-98fc-495d-9c04-182f1b77c246", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]}, 
-        {"verification_id": "fd25e3f4-7366-42a5-aed5-0d20082cc833", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]}, 
+        {"verification_id": "36236842-98fc-495d-9c04-182f1b77c246", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]},
+        {"verification_id": "fd25e3f4-7366-42a5-aed5-0d20082cc833", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]},
         {"verification_id": "c1648e0f-cd8d-408c-89b5-54a8dc7f522b", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]}],
-    
+
     # Other field you normally wouldn't need.
     "agents": {
-        "1": {"is_human": true, "persona_lines": []}, 
+        "1": {"is_human": true, "persona_lines": []},
         "0": {"is_human": true, "persona_lines": []}
-    }, 
-    "conversation_contexts": null, 
+    },
+    "conversation_contexts": null,
     "is_truncated": true,
-    "auxiliary": { 
+    "auxiliary": {
         "source": "BST_test"
     },
     "conversation_id": "9cb462d9-86f1-4296-af36-009d2e4d90f8#truncated#4",
 }
-```
+```
diff --git a/projects/genderation_bias/README.md b/projects/genderation_bias/README.md
@@ -32,4 +32,4 @@ By default, all mitigation methods are turned on at once. Use the flags `--add-c
 
 ## Models
 
-TBD.
+TBD.