facebookresearch · klshuster · Aug 25, 2022 · Aug 24, 2022
diff --git a/parlai/crowdsourcing/tasks/dialcrowd/README.md b/parlai/crowdsourcing/tasks/dialcrowd/README.md
@@ -35,7 +35,7 @@ You can customize the colors, fonts, and text size associated with your HIT to h
 
 ### Annotation Page
 ![screenshot](images/annotation1.png)
-We show the worker the background for your study, as well as the instructions and the table of intents with their respective definitions, examples, counterexamples, and explanations. 
+We show the worker the background for your study, as well as the instructions and the table of intents with their respective definitions, examples, counterexamples, and explanations.
 
 ***
 
@@ -170,4 +170,4 @@ Information about the tasks can be found in the table `assignments`. It includes
 ## Contributers
 
 Jessica Huynh, Ting-Rui Chiang, Kyusong Lee
-Carnegie Mellon University 2022
+Carnegie Mellon University 2022
diff --git a/parlai/crowdsourcing/tasks/dialcrowd/config.sh b/parlai/crowdsourcing/tasks/dialcrowd/config.sh
@@ -1,5 +1,6 @@
+#!/bin/bash
 # /*********************************************
-#  @ Jessica Huynh, Ting-Rui Chiang, Kyusong Lee 
+#  @ Jessica Huynh, Ting-Rui Chiang, Kyusong Lee
 #  Carnegie Mellon University 2022
 # *********************************************/
 

diff --git a/parlai/crowdsourcing/tasks/dialcrowd/configquality.sh b/parlai/crowdsourcing/tasks/dialcrowd/configquality.sh
@@ -1,5 +1,6 @@
+#!/bin/bash
 # /*********************************************
-#  @ Jessica Huynh, Ting-Rui Chiang, Kyusong Lee 
+#  @ Jessica Huynh, Ting-Rui Chiang, Kyusong Lee
 #  Carnegie Mellon University 2022
 # *********************************************/
 

diff --git a/parlai/crowdsourcing/tasks/dialcrowd/webapp-config/src/css/style.css b/parlai/crowdsourcing/tasks/dialcrowd/webapp-config/src/css/style.css
@@ -3,4 +3,3 @@
  * This source code is licensed under the MIT license found in the
  * LICENSE file in the root directory of this source tree.
  */
-
diff --git a/parlai/crowdsourcing/tasks/dialcrowd/webapp-results/src/css/style.css b/parlai/crowdsourcing/tasks/dialcrowd/webapp-results/src/css/style.css
@@ -3,4 +3,3 @@
  * This source code is licensed under the MIT license found in the
  * LICENSE file in the root directory of this source tree.
  */
-
diff --git a/parlai/crowdsourcing/tasks/dialcrowd/webapp/src/css/style.css b/parlai/crowdsourcing/tasks/dialcrowd/webapp/src/css/style.css
@@ -3,7 +3,7 @@
  * This source code is licensed under the MIT license found in the
  * LICENSE file in the root directory of this source tree.
  */
- 
+
  .hidden {
     display: none;
-  }
+  }
diff --git a/parlai/tasks/spolin/README.md b/parlai/tasks/spolin/README.md
@@ -1,19 +1,19 @@
-# SPOLIN 
+# SPOLIN
 
 This is a ParlAI Teacher for the dataset [SPOLIN](https://justin-cho.com/spolin), which is the dataset from [_Grounding Conversations with Improvised Dialogues_](https://aclanthology.org/2020.acl-main.218/) (ACL 2020)
 
-The teacher loads data from the official Github repo at https://github.com/wise-east/spolin. 
+The teacher loads data from the official Github repo at https://github.com/wise-east/spolin.
 
-Please check out the paper and the repo for details about the dataset. If there are any questions, please reach out to hd.justincho at gmail dot com. 
+Please check out the paper and the repo for details about the dataset. If there are any questions, please reach out to hd.justincho at gmail dot com.
 
-If you use this data, please consider citing our ACL 2020 paper: 
+If you use this data, please consider citing our ACL 2020 paper:
 ```
 @inproceedings{cho2020spolin,
     title={Grounding Conversations with Improvised Dialogues},
     author={Cho, Hyundong and May, Jonathan},
     booktitle ={Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics},
-    publisher = {Association for Computational Linguistics}, 
+    publisher = {Association for Computational Linguistics},
     location =  {Seattle, Washington, USA},
     year={2020}
-}  
-```
+}
+```
diff --git a/parlai/zoo/bb3/MODEL_LICENSE.md b/parlai/zoo/bb3/MODEL_LICENSE.md
@@ -62,4 +62,4 @@ This License Agreement (as may be amended in accordance with this License Agreem
 <br><br>
 11. **MISCELLANEOUS**
 <br><br>
-   If any provision or part of a provision of this License is unlawful, void or unenforceable, that provision or part of the provision is deemed severed from this License, and will not affect the validity and enforceability of any remaining provisions.  The failure of Meta to exercise or enforce any right or provision of this License will not operate as a waiver of such right or provision. This License does not confer any third-party beneficiary rights upon any other person or entity.  This License, together with the Documentation, contains the entire understanding between you and Meta regarding the subject matter of this License, and supersedes all other written or oral agreements and understandings between you and Meta regarding such subject matter.  No change or addition to any provision of this License will be binding unless it is in writing and signed by an authorized representative of both you and Meta.
+   If any provision or part of a provision of this License is unlawful, void or unenforceable, that provision or part of the provision is deemed severed from this License, and will not affect the validity and enforceability of any remaining provisions.  The failure of Meta to exercise or enforce any right or provision of this License will not operate as a waiver of such right or provision. This License does not confer any third-party beneficiary rights upon any other person or entity.  This License, together with the Documentation, contains the entire understanding between you and Meta regarding the subject matter of this License, and supersedes all other written or oral agreements and understandings between you and Meta regarding such subject matter.  No change or addition to any provision of this License will be binding unless it is in writing and signed by an authorized representative of both you and Meta.
diff --git a/parlai/zoo/bb3/data_card.md b/parlai/zoo/bb3/data_card.md
@@ -1,6 +1,6 @@
 # BlenderBot 3 175B data card
 
-## Motivation	
+## Motivation
 
 **For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that needed to be filled? Please provide a description.** The several datasets used for fine-tuning BlenderBot 3 were created with various specific tasks in mind. The majority of the tasks are crowdsourced dialogue datasets designed to inform conversational models about how to handle skills within conversation (grounding responses on knowledge, displaying empathy and personality, roleplaying as a character). Other datasets are from the task-oriented domain, in which models must learn to complete tasks requested by humans. Finally, we include question answering data, which is meant to teach models how to answer factual questions. The collected demo deploy data will be used to continually improve future iterations of BlenderBot. This dataset provides organic user interaction data, which is sparingly available in the wild.
 
@@ -10,36 +10,36 @@
 
 **Any other comments?** N/A
 
-	
-## Composition	
+
+## Composition
 
 **What do the instances that comprise the dataset represent (e.g., documents, photos, people, countries)? Are there multiple types of instances (e.g., movies, users, and ratings; people and interactions between them; nodes and edges)? Please provide a description.** The instances are text-based conversational dialogues or question/answer pairs. The BB3 fine-tuning data comprise the following datasets:
 
 - *Question Answering:*
   - MS MARCO (Nguyen et al., 2016)
   - SQuAD (Rajpurkar et al., 2016)
   - TriviaQA (Joshi et al., 2017)
-  - Natural Questions (Kwiatkowski et al., 2019) 
-  - Natural Questions (Open) (Lee et al., 2019) 
-  - Natural Questions (Open Dialogues) (Adolphs et al., 2021) 
+  - Natural Questions (Kwiatkowski et al., 2019)
+  - Natural Questions (Open) (Lee et al., 2019)
+  - Natural Questions (Open Dialogues) (Adolphs et al., 2021)
 - *Knowledge-Grounded Dialogue:*
   - Wizard of the Internet (Komeili et al., 2022)
   - Wizard of Wikipedia (Dinan et al., 2019b)
-  - Funpedia (Dinan et al., 2020b) 
+  - Funpedia (Dinan et al., 2020b)
 - *Open-Domain Dialogue:*
   - PersonaChat (Zhang et al., 2018)
   - Empathetic Dialogues (Rashkin et al., 2019)
   - Blended Skill Talk (Smith et al., 2020)
   - Multi-Session Chat (Xu et al., 2022a)
-  - LIGHT + WILD (Urbanek et al., 2019; Shuster et al., 2021b) 
+  - LIGHT + WILD (Urbanek et al., 2019; Shuster et al., 2021b)
 - *Recovery & Feedback:*
-  - SaFeRDialogues (Ung et al., 2022) 
+  - SaFeRDialogues (Ung et al., 2022)
   - FITS (Xu et al., 2022b)
 - *Task-Oriented Dialogue:*
-  - Google SGD (Rastogi et al., 2020) 
-  - Taskmaster (Byrne et al., 2019) 
-  - Taskmaster 2 (Byrne et al., 2019) 
-  - Taskmaster 3 (Byrne et al., 2019) 
+  - Google SGD (Rastogi et al., 2020)
+  - Taskmaster (Byrne et al., 2019)
+  - Taskmaster 2 (Byrne et al., 2019)
+  - Taskmaster 3 (Byrne et al., 2019)
 
 We additionally make use of a shard of OPT pre-training data: see the data card of [Zhang et al. (2022)](https://arxiv.org/pdf/2205.01068.pdf) for more details.
 
@@ -69,8 +69,8 @@ We additionally make use of a shard of OPT pre-training data: see the data card
 
 **Any other comments?** No.
 
-	
-## Collection Process	
+
+## Collection Process
 
 **How was the data associated with each instance acquired? Was the data directly observable (e.g., raw text, movie ratings), reported by subjects (e.g., survey responses), or indirectly inferred/derived from other data (e.g., part-of-speech tags, model-based guesses for age or language)? If data was reported by subjects or indirectly inferred/derived from other data, was the data validated/verified? If so, please describe how.** For the fine-tuning data, we refer the reader to the various dataset papers for data instance collection details. For the demo deployment data, we collect conversations from organic interactions with our bot.
 
@@ -96,17 +96,17 @@ We additionally make use of a shard of OPT pre-training data: see the data card
 
 **Any other comments?** N/A
 
-	
-## Preprocessing/cleaning/labeling	
+
+## Preprocessing/cleaning/labeling
 
 **Was any preprocessing/cleaning/labeling of the data done (e.g., discretization or bucketing, tokenization, part-of-speech tagging, SIFT feature extraction, removal of instances, processing of missing values)? If so, please provide a description. If not, you may skip the remainder of the questions in this section.** For the fine-tuning data, we process the data to reassign targets depending on the modular function for which we were training BB3. For the demo deployment data, we refer to the main BlenderBot 3 paper for release details.
 
 **Was the "raw" data saved in addition to the preprocessed/cleaned/labeled data (e.g., to support unanticipated future uses)? If so, please provide a link or other access point to the "raw" data.** N/A
 
 **Any other comments?** No.
 
-	
-## Uses	
+
+## Uses
 
 **Has the dataset been used for any tasks already? If so, please provide a description.** These datasets have been used for training BlenderBot 3.
 
@@ -120,8 +120,8 @@ We additionally make use of a shard of OPT pre-training data: see the data card
 
 **Any other comments?** No.
 
-	
-## Distribution	
+
+## Distribution
 
 **Will the dataset be distributed to third parties outside of the entity (e.g., company, institution, organization) on behalf of which the dataset was created? If so, please provide a description.** The dataset from the chatbot demo will be made publicly available, and contributors to these datasets have consented to their release.
 
@@ -135,8 +135,8 @@ We additionally make use of a shard of OPT pre-training data: see the data card
 
 **Any other comments?** No.
 
-	
-## Maintenance	
+
+## Maintenance
 
 **Who is supporting/hosting/maintaining the dataset?** Meta AI.
 

diff --git a/projects/bb3/agents/README.md b/projects/bb3/agents/README.md
@@ -416,4 +416,4 @@ Previous Topic: playing
 Personal Fact: Person 1's Persona: I love playing baseball
 Interesting Fact: The New York Yankees have a 50-20 record
 Person 2:
-```
+```
diff --git a/projects/metacognition/README.md b/projects/metacognition/README.md
@@ -10,4 +10,4 @@ COMING SOON!
 
 ## Paper
 
-[Link](https://arxiv.org/abs/2012.14983); accepted to TACL 2022.
+[Link](https://arxiv.org/abs/2012.14983); accepted to TACL 2022.
diff --git a/tests/crowdsourcing/tasks/model_chat/test_model_chat_analysis/basic__results.txt b/tests/crowdsourcing/tasks/model_chat/test_model_chat_analysis/basic__results.txt
@@ -37,4 +37,4 @@ WORKER_3,HIT_ID_3,blender_90M,2,7,1,Utterance placeholder line 1 3,
 WORKER_3,HIT_ID_3,blender_90M,2,8,0,Utterance placeholder line 0 4,
 WORKER_3,HIT_ID_3,blender_90M,2,9,1,Utterance placeholder line 1 4,
 WORKER_3,HIT_ID_3,blender_90M,2,10,0,Utterance placeholder line 0 5,
-WORKER_3,HIT_ID_3,blender_90M,2,11,1,Utterance placeholder line 1 5,1
+WORKER_3,HIT_ID_3,blender_90M,2,11,1,Utterance placeholder line 1 5,1
diff --git a/tests/crowdsourcing/tasks/model_chat/test_model_chat_analysis/basic__test_stdout.txt b/tests/crowdsourcing/tasks/model_chat/test_model_chat_analysis/basic__test_stdout.txt
@@ -22,4 +22,4 @@ WORKER_3
 1  WORKER_2             1
 2  WORKER_3             1
 
-Worker conversation counts: {'WORKER_1': 1, 'WORKER_2': 1, 'WORKER_3': 1}
+Worker conversation counts: {'WORKER_1': 1, 'WORKER_2': 1, 'WORKER_3': 1}
diff --git a/...sourcing/tasks/model_chat/test_model_chat_analysis/with_personas_and_buckets__results.txt b/...sourcing/tasks/model_chat/test_model_chat_analysis/with_personas_and_buckets__results.txt
@@ -48,4 +48,4 @@ WORKER_3,HIT_ID_3,blender_90M,2,9,1,Utterance placeholder line 1 4,False,False,F
 WORKER_3,HIT_ID_3,blender_90M,2,10,0,Utterance placeholder line 0 5,,,,,,,
 WORKER_3,HIT_ID_3,blender_90M,2,11,1,Utterance placeholder line 1 5,False,False,False,False,False,True,
 WORKER_3,HIT_ID_3,blender_90M,2,12,0,Utterance placeholder line 0 6,,,,,,,
-WORKER_3,HIT_ID_3,blender_90M,2,13,1,Utterance placeholder line 1 6,False,False,False,False,False,True,1
+WORKER_3,HIT_ID_3,blender_90M,2,13,1,Utterance placeholder line 1 6,False,False,False,False,False,True,1
diff --git a/...cing/tasks/model_chat/test_model_chat_analysis/with_personas_and_buckets__test_stdout.txt b/...cing/tasks/model_chat/test_model_chat_analysis/with_personas_and_buckets__test_stdout.txt
@@ -29,4 +29,4 @@ WORKER_3
 1  WORKER_2             1              1                    1.0
 2  WORKER_3             1              0                    0.0
 
-Worker conversation counts: {'WORKER_1': 1, 'WORKER_2': 1, 'WORKER_3': 1}
+Worker conversation counts: {'WORKER_1': 1, 'WORKER_2': 1, 'WORKER_3': 1}