This repository was archived by the owner on Nov 3, 2023. It is now read-only.

[fbcode] Lint. #3621

Merged 1 commit on Apr 27, 2021
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -4,7 +4,7 @@ it is necessary. If your patch fixes an issue, please reference that issue here.

**Testing steps**
<!-- Enter steps to test your pull request. Give a clear and concise description of
what you expected to happen during testing. Include any logs in ```backticks``` if you have them.
what you expected to happen during testing. Include any logs in ```backticks``` if you have them.
Also make sure you have connected your account to CircleCI and those tests run successfully. -->

**Other information**
12 changes: 6 additions & 6 deletions CONTRIBUTING.md
@@ -8,10 +8,10 @@ desired to increase the pool of tasks, models, and baselines.
## Pull Requests
We actively welcome your pull requests.

1. Fork the repo and then clone the forked repository. (See this [github guide](https://guides.github.com/activities/forking/) on forking for more info).
1. Fork the repo and then clone the forked repository. (See this [github guide](https://guides.github.com/activities/forking/) on forking for more info).
**If you have already cloned the repo directly and committed changes, follow the steps in the [section below](#moving-changes-youve-committed-to-a-fork)**
2. Create your branch from `master`. Set up your environment
and run `pre-commit install` once.
and run `pre-commit install` once.
3. Make your changes
4. If you've added code that should be tested, [add tests](http://parl.ai/docs/tutorial_tests.html).
5. If you've changed APIs, update the documentation.
@@ -20,15 +20,15 @@ We actively welcome your pull requests.
8. If you've added a new dataset, you should also run
`python -m pytest -m data`. Copy-paste the output into a comment in your PR.
9. If you haven't already, complete the Contributor License Agreement ("CLA").
10. Link [CircleCI](https://circleci.com/vcs-authorize/) to your github account
if you haven't done so previously (and make sure the CircleCI tests run
10. Link [CircleCI](https://circleci.com/vcs-authorize/) to your github account
if you haven't done so previously (and make sure the CircleCI tests run
successfully on the PR after you push your changes).
11. Push your changes!
12. Once the PR is accepted and CI is passing, we will merge the PR for you.

### Moving changes you've committed to a fork
1. Fork the repo
2. In your local repo, rename your origin remote to upstream
2. In your local repo, rename your origin remote to upstream
```
git remote rename origin upstream
```
@@ -40,7 +40,7 @@ We actively welcome your pull requests.
```
git fetch origin
```
5. Make your local branch track the remote branch (of the forked repo)
5. Make your local branch track the remote branch (of the forked repo)
```
git branch --set-upstream-to origin/master master
```
1 change: 0 additions & 1 deletion docs/source/mutators.md
@@ -12,4 +12,3 @@ their output, as well as their options when available.

```{include} mutators_list.inc
```

2 changes: 1 addition & 1 deletion docs/source/tutorial_basic.md
@@ -409,7 +409,7 @@ parlai interactive --model-file zoo:pretrained_transformers/model_poly/model --t
parlai interactive --model projects:wizard_of_wikipedia:interactive_retrieval --task wizard_of_wikipedia
```

To view additional fields from the model output, try using the flag `--display-add-fields`. For example,
To view additional fields from the model output, try using the flag `--display-add-fields`. For example,
```
parlai interactive --model-file zoo:blender/blender_90M/model --task convai2 --display-add-fields beam_texts
```
7 changes: 3 additions & 4 deletions docs/source/tutorial_crowdsourcing.md
@@ -92,7 +92,7 @@ A few things to keep in mind:
start returning `True` for the `episode_done()` function.
2. Make sure to test your dialog task using Mephisto's sandbox mode (enabled by default) before
pushing it live. See the [crowdsourcing README](https://github.com/facebookresearch/ParlAI/tree/master/parlai/crowdsourcing#running-tasks-live) for running live tasks.

Advanced Task Techniques
------------------------

@@ -127,7 +127,7 @@ Follow the steps below:
- Mephisto's default MTurk functionality requires a free Heroku account,
which can be obtained [here](https://signup.heroku.com/). Running
any Mephisto MTurk operation will walk you through linking the two.

To run a crowdsourcing task, launch its run file (typically `run.py`) with the proper flags, using a command like the following:

```bash
@@ -207,7 +207,7 @@ Mephisto MTurk Tips and Tricks
__quite__ getting it allows those workers to work on other tasks for
you in the future. You can soft-block workers by calling [`Worker.grant_qualification()`](https://github.com/facebookresearch/Mephisto/blob/master/mephisto/data_model/qualification.py) for a certain `qualification_name`, which is typically set by the `mephisto.blueprint.block_qualification` parameter. That worker will then not be able to work on any
tasks that use the same value for `mephisto.blueprint.block_qualification`.

### Preventing and Handling Crashes

- The `max_num_concurrent_units` argument when initializing [`TaskLauncher`](https://github.com/facebookresearch/Mephisto/blob/master/mephisto/operations/task_launcher.py) controls how many people can work on your task at any given time: set this sufficiently low for your task. Leaving this too high might cause your Heroku server to run into issues depending on how many messages per second it's trying to
@@ -251,4 +251,3 @@ Additional Credits
- Turker icon credit: [Amazon Mechanical
Turk](https://requester.mturk.com/).
- Robot icon credit: [Icons8](https://icons8.com/).

12 changes: 6 additions & 6 deletions parlai/crowdsourcing/README.md
@@ -7,13 +7,13 @@ Code for crowdsourcing tasks that use Mephisto. See the [Mephisto quick start gu
## Running tasks

Tasks are launched by calling the appropriate run script: for instance, an ACUTE-Eval run can be launched with `python parlai/crowdsourcing/tasks/acute_eval/run.py`, followed by any appropriate flags. All run parameters are set using [Hydra](https://github.com/facebookresearch/hydra): append the flag `-c job` to your run command to see a list of all available parameters, grouped by their package name (`mephisto.blueprint`, `mephisto.task`, etc.), which determines how they are called. Each run script has a YAML file of default parameters that will be loaded, found in the `conf/` subfolder of each task.

### Specifying your own YAML file

The easiest way to specify a different YAML file is to create a new file, say, `my_params.yaml`, in the `conf/` subfolder of the task. Then, you can launch HITs with `python ${TASK_FOLDER}/run.py conf=my_params`.

You also can specify a path to a YAML file existing *outside* of `${TASK_FOLDER}`: you will need to have your YAML file stored at a location `${CUSTOM_FOLDER}/conf/my_params.yaml`, and then you can add a `--config-dir ${CUSTOM_FOLDER}` string to the launch command above.

### Setting parameters on the command line

Suppose that your YAML file has a `task_reward` parameter defined as follows:
@@ -31,11 +31,11 @@ Here is a partial list of MTurk-specific parameters that can be set in YAML file
- `mephisto.task.task_description`: Includes detailed information about the kind of task that the HIT contains. On the Amazon Mechanical Turk web site, the HIT description appears in the expanded view of search results, and in the HIT and assignment screens.
- `mephisto.task.task_tags`: One or more words or phrases that describe the HIT, separated by commas. On the MTurk website, these words are used in searches to find HITs.
- `mturk.worker_blocklist_paths`: The path to a text file containing a list of IDs of MTurk workers to soft-block, separated by newlines. Multiple paths can be specified, delimited by commas (i.e. `path1,path2,path3`).
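
As a rough illustration of the `mturk.worker_blocklist_paths` format described above, the following Python sketch reads newline-separated worker IDs from a comma-delimited list of paths. The `load_blocklist` helper, the file names, and the worker IDs are all hypothetical and made up for this example; this is not how Mephisto itself parses the parameter.

```python
import tempfile
from pathlib import Path


def load_blocklist(paths_arg: str) -> set:
    """Collect worker IDs from a comma-delimited list of text files.

    Hypothetical helper: each file is assumed to hold one worker ID per line.
    """
    worker_ids = set()
    for raw_path in paths_arg.split(","):
        raw_path = raw_path.strip()
        if not raw_path:
            continue  # tolerate a trailing comma
        for line in Path(raw_path).read_text().splitlines():
            if line.strip():
                worker_ids.add(line.strip())
    return worker_ids


# Build two small blocklist files so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "blocklist_a.txt"
    second = Path(tmp) / "blocklist_b.txt"
    first.write_text("WORKER_1\nWORKER_2\n")
    second.write_text("WORKER_2\nWORKER_3\n")
    blocked = load_blocklist(f"{first},{second}")

print(sorted(blocked))
```

Using a set means a worker listed in several files is only counted once.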

### Running tasks live

By default, HITs run locally in sandbox mode. To run live HITs, add `mephisto.provider.requester_name=${REQUESTER_NAME} mephisto/architect=heroku` to your launch command, where `${REQUESTER_NAME}` is the MTurk requester name that you specified when setting up Mephisto.
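
The launch options described in this README can be combined into a single command. The sketch below assembles such a command in Python, using only the arguments quoted above (`conf=...`, `--config-dir`, `mephisto.provider.requester_name=...`, and `mephisto/architect=heroku`). The `build_launch_command` helper and the requester name `my_requester` are assumptions for illustration and are not part of ParlAI or Mephisto.

```python
import shlex
from typing import List, Optional


def build_launch_command(
    task_folder: str,
    conf: Optional[str] = None,
    config_dir: Optional[str] = None,
    requester_name: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for a task's run script (hypothetical helper)."""
    cmd = ["python", f"{task_folder}/run.py"]
    if conf:
        # Select a YAML file from the conf/ subfolder, e.g. conf=my_params.
        cmd.append(f"conf={conf}")
    if config_dir:
        # Point at a folder outside the task that contains conf/<name>.yaml.
        cmd += ["--config-dir", config_dir]
    if requester_name:
        # Without these two overrides the HITs stay in sandbox mode.
        cmd += [
            f"mephisto.provider.requester_name={requester_name}",
            "mephisto/architect=heroku",
        ]
    return cmd


task = "parlai/crowdsourcing/tasks/acute_eval"
sandbox = build_launch_command(task, conf="my_params")
live = build_launch_command(task, conf="my_params", requester_name="my_requester")

print(shlex.join(sandbox))
print(shlex.join(live))
```

Keeping the command as a list until the last step avoids quoting mistakes if it is later passed to `subprocess.run`.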

## Saving data

By default, Mephisto data is saved in the following directory:
2 changes: 1 addition & 1 deletion parlai/crowdsourcing/tasks/acute_eval/README.md
@@ -109,7 +109,7 @@ python parlai/crowdsourcing/tasks/acute_eval/analysis.py \
--pairings-filepath ${PATH_TO_PAIRINGS_FILE} \
--outdir ${OUTPUT_FOLDER}
```
For analyzing results from a Fast ACUTE run (see below), use the `--root-dir` flag to specify the Fast ACUTE root directory (`mephisto.blueprint.root_dir`) instead of specifying the `--pairings-filepath` and `--outdir` flags.
For analyzing results from a Fast ACUTE run (see below), use the `--root-dir` flag to specify the Fast ACUTE root directory (`mephisto.blueprint.root_dir`) instead of specifying the `--pairings-filepath` and `--outdir` flags.

The script will analyze the results and save files with information such as the win/loss rate and significance scores.

2 changes: 1 addition & 1 deletion parlai/crowdsourcing/tasks/model_chat/README.md
@@ -28,7 +28,7 @@ In `worlds.py`, modify `ModelChatOnboardWorld.check_onboarding_answers()` to cha

`run_image_chat.py` can be run to chat with a model about an image: each conversation will begin with a selected image, and then the human and model will chat about it.

This code replaces the old `parlai/mturk/tasks/image_chat/` and `parlai/mturk/tasks/personality_captions/` tasks, which are deprecated and can be accessed with `git checkout v0.10.0`. Those tasks also featured the ability to compare two possible captions to an image and rate which one is more engaging: this functionality has now been replaced by the [ACUTE-Eval](https://github.com/facebookresearch/ParlAI/tree/master/parlai/crowdsourcing/tasks/acute_eval) task.
This code replaces the old `parlai/mturk/tasks/image_chat/` and `parlai/mturk/tasks/personality_captions/` tasks, which are deprecated and can be accessed with `git checkout v0.10.0`. Those tasks also featured the ability to compare two possible captions to an image and rate which one is more engaging: this functionality has now been replaced by the [ACUTE-Eval](https://github.com/facebookresearch/ParlAI/tree/master/parlai/crowdsourcing/tasks/acute_eval) task.

### Setup

@@ -8,7 +8,7 @@ Two variants of the blueprint are supported:
- `TurnAnnotationStaticInFlightQABlueprint`
- Includes the ability to add an additional in-flight (i.e. mid-HIT) quality assurance check
- Called with `python parlai/crowdsourcing/tasks/turn_annotations_static/run_in_flight_qa.py`

For both variants of the blueprint, it is required to pass in your own file of conversations with `mephisto.blueprint.data_jsonl=${PATH_TO_CONVERSATIONS}`.

See `turn_annotations_blueprint.py` for various parameters of this task, including passing in custom annotation bucket definitions using the `annotation_buckets` YAML flag, being able to group multiple conversations into one HIT using the `subtasks_per_unit` flag, passing in onboarding data with answers, and being able to ask only for the final utterance as an annotation.
2 changes: 1 addition & 1 deletion parlai/opt_presets/README.md
@@ -1,7 +1,7 @@
# Option Aliases

This folder contains a set of "option aliases" that are automatically packaged
and provided with ParlAI. They are used as shorthand for
and provided with ParlAI. They are used as shorthand for

## Adding option aliases

2 changes: 1 addition & 1 deletion parlai/tasks/genderation_bias/README.md
@@ -44,4 +44,4 @@ that is a good show i watch that while drinking iced tea
i agree . what do you do for a living ? f0m0
i'm a researcher i'm researching the fact that mermaids are real
16:33:19 | loaded 131438 episodes with a total of 131438 examples
```
```
3 changes: 1 addition & 2 deletions parlai/tasks/huggingface/README.md
@@ -1,6 +1,5 @@
Task: HuggingFace
===============
Description: Can load HuggingFace datasets.
Description: Can load HuggingFace datasets.

Website: https://huggingface.co/

1 change: 0 additions & 1 deletion parlai/tasks/sensitive_topics_evaluation/README.md
@@ -11,4 +11,3 @@ Description: Task for evaluating a classifier trained to classify the following
Link: https://arxiv.org/abs/2010.07079

Tags: #All

2 changes: 1 addition & 1 deletion projects/anti_scaling/README.md
@@ -18,7 +18,7 @@ When performing distillation, terms are added for losses on the encoder output,

Distillation in the style of [Jiao, Xiaoqi, et al. "Tinybert: Distilling bert for natural language understanding." *arXiv preprint arXiv:1909.10351* (2019).](https://arxiv.org/abs/1909.10351)

With TinyBERT-style distillation, the student model can have smaller hidden and FFN dimensions than the teacher model, and projection matrices will be used to measure losses such as those between the hidden-layer outputs. Unlike with DistilBERT-style distillation, the weights of the teacher model cannot be used to initialize the student model.
With TinyBERT-style distillation, the student model can have smaller hidden and FFN dimensions than the teacher model, and projection matrices will be used to measure losses such as those between the hidden-layer outputs. Unlike with DistilBERT-style distillation, the weights of the teacher model cannot be used to initialize the student model.

In addition to the losses of DistilBERT-style distillation above, losses are also included on the embedding layer and on the per-layer query/key product matrices from encoder self-attention, decoder self-attention, and encoder/decoder attention. `DistillNarrowTransformerAgent` is used for distilling `transformer/generator` models, and `DistillNarrowBartAgent` is used for distilling `bart` models.

6 changes: 3 additions & 3 deletions projects/contradiction/README.md
@@ -1,11 +1,11 @@
# *I like fish <span>&#x1F41F;</span>, especially dolphins <span>&#x1F42C;</span>:*<sup>[∗](#dolphion)</sup> Addressing Contradictions in Dialogue Modeling

A study on *contradiction* detection and *non-contradiction* generation in dialogue modeling.
A study on *contradiction* detection and *non-contradiction* generation in dialogue modeling.
The paper can be found here: [Nie et al. (2020)](https://arxiv.org/abs/2012.13391).

## Abstract

To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the **D**ialogu**E** **CO**ntradiction **DE**tection task (**DECODE**) and a new conversational dataset containing both human-human and human-bot contradictory dialogues. We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
To quantify how well natural language understanding models can capture consistency in a general conversation, we introduce the **D**ialogu**E** **CO**ntradiction **DE**tection task (**DECODE**) and a new conversational dataset containing both human-human and human-bot contradictory dialogues. We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.

Results reveal that:
<ol>
@@ -48,7 +48,7 @@ See [download data from s3 with raw format](https://github.com/facebookresearch/
If you use the dataset or models in your own work, please cite with the following BibTex entry:
```
@misc{nie2020i,
title={I like fish, especially dolphins: Addressing Contradictions in Dialogue Modelling},
title={I like fish, especially dolphins: Addressing Contradictions in Dialogue Modelling},
author={Yixin Nie and Mary Williamson and Mohit Bansal and Douwe Kiela and Jason Weston},
year={2020},
eprint={2012.13391},
42 changes: 21 additions & 21 deletions projects/contradiction/download_with_raw_format.md
@@ -1,7 +1,7 @@
## Directly Download Raw Data

The dataset (**DECODE**) can be downloaded from [this_link](https://sharenlpfile-01.s3.amazonaws.com/data/decode_v0.1.zip).
As described in the paper, **DECODE** includes 6 groups of dialogues: *Train*, *Dev*, *Test*, *Human-Bot*, *A2T*, *RCT*.
The dataset (**DECODE**) can be downloaded from [this_link](https://sharenlpfile-01.s3.amazonaws.com/data/decode_v0.1.zip).
As described in the paper, **DECODE** includes 6 groups of dialogues: *Train*, *Dev*, *Test*, *Human-Bot*, *A2T*, *RCT*.

| Group Name | Count | Description |
| ------------- |---------------| -------------|
@@ -15,7 +15,7 @@ As described in the paper, **DECODE** includes 6 groups of dialogues: *Train*, *
The details of each group can be found in [Nie et al. (2020)](https://arxiv.org/abs/2012.13391).

### Format
The format of the file is `JSONL`. Each line in the file is one dialogue example saved as a `JSON` object.
The format of the file is `JSONL`. Each line in the file is one dialogue example saved as a `JSON` object.
Primary fields that are required for the contradiction detection task:
- `record_id`: It is the unique ID for the example.
- `turns`: A list of turns that make up a conversation between two speakers.
@@ -36,34 +36,34 @@ An example `JSON` is shown below:
"record_id": "1f47fe86-cfc3-469a-bae3-506c81871bf5",

"turns": [
{"turn_id": 0, "agent_id": 0, "text": "i've been to new york city once crazy place that city .", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 1, "agent_id": 1, "text": "i wish i could go there . i'm sure they have a place with great meatloaf !", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 2, "agent_id": 0, "text": "They probably do, somewhere! You can find nearly any cuisine there you want.", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 3, "agent_id": 1, "text": "I wonder if they have anything different, I wonder if anyone has tried to make meatloaf with tofu instead.", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 4, "agent_id": 0, "text": "I'm sure somebody has, though I am not sure how it would taste.", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 5, "agent_id": 1, "text": "I make tofu meatloaf all the time, it is delicious", "turn_context": "", "auxiliary": {"contradiction": true}}],
{"turn_id": 0, "agent_id": 0, "text": "i've been to new york city once crazy place that city .", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 1, "agent_id": 1, "text": "i wish i could go there . i'm sure they have a place with great meatloaf !", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 2, "agent_id": 0, "text": "They probably do, somewhere! You can find nearly any cuisine there you want.", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 3, "agent_id": 1, "text": "I wonder if they have anything different, I wonder if anyone has tried to make meatloaf with tofu instead.", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 4, "agent_id": 0, "text": "I'm sure somebody has, though I am not sure how it would taste.", "turn_context": "", "auxiliary": {"contradiction": null}},
{"turn_id": 5, "agent_id": 1, "text": "I make tofu meatloaf all the time, it is delicious", "turn_context": "", "auxiliary": {"contradiction": true}}],

"is_contradiction": true,
"aggregated_contradiction_indices": [3, 5],

"is_contradiction": true,
"aggregated_contradiction_indices": [3, 5],

# Other collection related field.
"num_of_turns_by_writer": 2
"writer_contradiction_indices": [3, 5],
"writer_contradiction_indices": [3, 5],
"verifications": [
{"verification_id": "36236842-98fc-495d-9c04-182f1b77c246", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]},
{"verification_id": "fd25e3f4-7366-42a5-aed5-0d20082cc833", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]},
{"verification_id": "36236842-98fc-495d-9c04-182f1b77c246", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]},
{"verification_id": "fd25e3f4-7366-42a5-aed5-0d20082cc833", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]},
{"verification_id": "c1648e0f-cd8d-408c-89b5-54a8dc7f522b", "is_contradiction": true, "verifier_contradiction_indices": [3, 5]}],

# Other field you normally wouldn't need.
"agents": {
"1": {"is_human": true, "persona_lines": []},
"1": {"is_human": true, "persona_lines": []},
"0": {"is_human": true, "persona_lines": []}
},
"conversation_contexts": null,
},
"conversation_contexts": null,
"is_truncated": true,
"auxiliary": {
"auxiliary": {
"source": "BST_test"
},
"conversation_id": "9cb462d9-86f1-4296-af36-009d2e4d90f8#truncated#4",
}
```
```
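
For readers who want to work with the raw file, here is a minimal Python sketch of reading records in the format shown above. It assumes that the values in `aggregated_contradiction_indices` refer to `turn_id` values, which matches the example record but is not stated explicitly in the source. The helper names and the trimmed sample record are made up for illustration.

```python
import io
import json
from typing import Iterator, List, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed dialogue per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def contradicting_turns(record: dict) -> List[str]:
    """Return the text of turns listed in aggregated_contradiction_indices.

    Assumption: the indices are turn_id values.
    """
    by_id = {turn["turn_id"]: turn["text"] for turn in record["turns"]}
    return [by_id[i] for i in record.get("aggregated_contradiction_indices", [])]


# A trimmed, made-up record that keeps only the primary fields.
SAMPLE_LINE = json.dumps({
    "record_id": "example-0",
    "turns": [
        {"turn_id": 0, "agent_id": 0, "text": "i have never tried tofu ."},
        {"turn_id": 1, "agent_id": 1, "text": "it is pretty good !"},
        {"turn_id": 2, "agent_id": 0, "text": "i eat tofu every day ."},
    ],
    "is_contradiction": True,
    "aggregated_contradiction_indices": [0, 2],
})

# In practice the stream would be an opened .jsonl file from the archive.
records = list(iter_records(io.StringIO(SAMPLE_LINE + "\n")))
flagged = contradicting_turns(records[0])
print(records[0]["record_id"], records[0]["is_contradiction"], flagged)
```

Note that the pretty-printed example in the README contains `#` comments and trailing commas, so it would need to be cleaned up before `json.loads` could parse it; the lines in the actual data file are expected to be plain JSON.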
2 changes: 1 addition & 1 deletion projects/genderation_bias/README.md
@@ -32,4 +32,4 @@ By default, all mitigation methods are turned on at once. Use the flags `--add-c

## Models

TBD.
TBD.