Task tweaks (#1149)

* qasrl
* set logic
* update readme
* ANLI documentation

Co-authored-by: jeswan <57466294+jeswan@users.noreply.github.com>
nyu-mll · Oct 13, 2020 · 392976c
1 parent 82ed396
commit 392976c
Showing 4 changed files with 11 additions and 8 deletions.
diff --git a/README.md b/README.md
@@ -77,8 +77,8 @@ downloader.download_data(["mrpc"], "/content/data")
 # Set up the arguments for the Simple API
 args = run.RunConfiguration(
    run_name="simple",
-   exp_dir="/content/exp",
-   data_dir="/content/data",
+   exp_dir="/path/to/exp",
+   data_dir="/path/to/exp/tasks",
    model_type="roberta-base",
    tasks="mrpc",
    train_batch_size=16,
@@ -91,15 +91,16 @@ run.run_simple(args)
 
 Bash version:
 ```bash
+BASE_PATH=/path/to/exp
 python jiant/scripts/download_data/runscript.py \
     download \
     --tasks mrpc \
-    --output_path /content/data
+    --output_path /path/to/exp/tasks
 python jiant/proj/simple/runscript.py \
     run \
     --run_name simple \
-    --exp_dir /content/data \
-    --data_dir /content/data \
+    --exp_dir /path/to/exp \
+    --data_dir /path/to/exp/tasks \
     --model_type roberta-base \
     --tasks mrpc \
     --train_batch_size 16 \

diff --git a/guides/tasks/supported_tasks.md b/guides/tasks/supported_tasks.md
@@ -29,7 +29,7 @@
 | MultiRC | multirc | ✅ | ✅ | multirc | SuperGLUE |
 | MRPC | mrpc | ✅ | ✅ | mrpc | GLUE |
 | QAMR | qamr | ✅ | ✅ | qamr |  |
-| QA-SRL | qa-srl | ✅ | ✅ | qa-srl |  |
+| QA-SRL | qasrl | ✅ | ✅ | qasrl |  |
 | EP-NER | ner | ✅ |  | ner | Edge-Probing |
 | PAWS-X | `pawsx_{lang}` | ✅ | ✅ | pawsx | XTREME, multi-lang |
 | WikiAnn | `panx_{lang}` | ✅ | ✅ | panx | XTREME, multi-lang |

diff --git a/guides/tasks/task_specific.md b/guides/tasks/task_specific.md
@@ -2,7 +2,9 @@
 
 ### Adversarial NLI
 
-[Adversarial NLI](https://arxiv.org/pdf/1910.14599.pdf) has 3 rounds of adversarial data creation. A1/A2/A3 are expanding supersets of the previous round.
+[Adversarial NLI](https://arxiv.org/pdf/1910.14599.pdf) has 3 rounds of adversarial data creation: A1, A2, and A3. When downloading, use the task names `adversarial_nli_r1`, `adversarial_nli_r2`, and `adversarial_nli_r3` to select the individual rounds.
+
+When training on the full ANLI dataset (SNLI+MNLI+A1+A2+A3), train in a multi-task manner with proportional sampling, and set `task_to_taskmodel_map` so that all tasks point to the same NLI head.
 
 
 ### Masked Language Modeling (MLM)
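
Reviewer note on the ANLI guidance above: the two requirements (proportional sampling and a shared NLI head) can be sketched in plain Python. This is only an illustration, not jiant's configuration schema. The example counts are made-up placeholders, the head name `"nli"` is hypothetical, and `snli`/`mnli` are assumed task names; only the `adversarial_nli_r*` names come from the documentation change.

```python
# Hypothetical example counts per task (placeholders, NOT real dataset sizes).
num_examples = {
    "snli": 500,
    "mnli": 300,
    "adversarial_nli_r1": 100,
    "adversarial_nli_r2": 60,
    "adversarial_nli_r3": 40,
}

# Proportional sampling: each task is drawn with probability
# proportional to its number of training examples.
total = sum(num_examples.values())
sampling_probs = {task: n / total for task, n in num_examples.items()}

# Every task points to the same head, so all five datasets update
# one shared NLI classifier. "nli" is a hypothetical head name.
task_to_taskmodel_map = {task: "nli" for task in num_examples}

for task, prob in sampling_probs.items():
    print(f"{task}: p={prob:.2f} -> head {task_to_taskmodel_map[task]}")
```

Mapping each round to its own head instead would train separate classifiers and lose the benefit of pooling the NLI data.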

diff --git a/jiant/scripts/download_data/runscript.py b/jiant/scripts/download_data/runscript.py
@@ -16,7 +16,7 @@
 
 # DIRECT_DOWNLOAD_TASKS need to be directly downloaded because the nlp
 # implementation differs from the original dataset format
-NLP_DOWNLOADER_TASKS = GLUE_TASKS | SUPERGLUE_TASKS | OTHER_NLP_TASKS - DIRECT_DOWNLOAD_TASKS
+NLP_DOWNLOADER_TASKS = (GLUE_TASKS | SUPERGLUE_TASKS | OTHER_NLP_TASKS) - DIRECT_DOWNLOAD_TASKS
 SUPPORTED_TASKS = NLP_DOWNLOADER_TASKS | XTREME_TASKS | SQUAD_TASKS | DIRECT_DOWNLOAD_TASKS
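
Reviewer note on the set logic fix: in Python, binary `-` binds more tightly than `|`, so the old expression subtracted `DIRECT_DOWNLOAD_TASKS` from `OTHER_NLP_TASKS` only. A minimal sketch of the difference follows; the set contents are hypothetical stand-ins, not the real task lists from `runscript.py`.

```python
# Hypothetical stand-ins for the task-name sets in runscript.py.
GLUE_TASKS = {"mrpc", "mnli"}
SUPERGLUE_TASKS = {"multirc", "record"}
OTHER_NLP_TASKS = {"snli", "qamr"}
DIRECT_DOWNLOAD_TASKS = {"mnli", "record", "qamr"}

# Old expression: parsed as A | B | (C - D), because `-` has higher
# precedence than `|`.
before = GLUE_TASKS | SUPERGLUE_TASKS | OTHER_NLP_TASKS - DIRECT_DOWNLOAD_TASKS

# Fixed expression: the difference applies to the whole union.
after = (GLUE_TASKS | SUPERGLUE_TASKS | OTHER_NLP_TASKS) - DIRECT_DOWNLOAD_TASKS

# Tasks that wrongly stayed in the downloader set under the old expression.
leaked = before - after

print(sorted(before))  # ['mnli', 'mrpc', 'multirc', 'record', 'snli']
print(sorted(after))   # ['mrpc', 'multirc', 'snli']
print(sorted(leaked))  # ['mnli', 'record']
```

With the old precedence, any direct-download task that belonged to the GLUE or SuperGLUE sets would still have been routed through the `nlp` downloader.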