SaferDialogues parallel test and valid set with BAD dataset #4623

meganung · 2022-06-21T14:54:30Z

Adds a new teacher, SaferDialoguesBAD, whose valid and test sets are parallel with the BAD dataset. The train set is the same for both teachers.

Added a test for the new teacher; pytest passes.

jxmsML · 2022-07-12T17:58:06Z

parlai/tasks/saferdialogues/build.py


 RESOURCES = [
    DownloadableFile(
        f'http://parl.ai/downloads/saferdialogues/saferdialogues_{SAFERDIALOGUES_VERSION}.tar.gz',
        f'saferdialogues_{SAFERDIALOGUES_VERSION}.tar.gz',
-        '7ecf062fdf3e0ddfd16c52998341f543331e0dbaa1fcdb27026e8b0b80446fed',
+        '3d1bc731fb0c63d9f61a52b3d5f1aab0911cda1ca38f49822c79c18a2dc8e834',


Will the new teacher override the previous version (the one with the old valid and test splits)? I wonder if it would be nicer to keep both splits (one for reproducibility of your paper, and one here).

No, it's not overriding it. The new file includes all the data: the original data (for reproducibility) and the new data for the teacher that this PR introduces. The default task points to the original teacher and data from the paper. To use the new data, where the splits are parallel with the BAD dataset, you need to explicitly use the SaferDialoguesBAD teacher.
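To illustrate the default-versus-explicit selection described here, a toy sketch in plain Python follows. It does not import ParlAI: the registry, the `resolve_teacher` helper, and the split descriptions are hypothetical, and the `saferdialogues:SaferDialoguesBAD` task string is an assumption based on ParlAI's `task:teacher` naming convention rather than something confirmed in this thread.

```python
# Toy model of default vs. explicit teacher selection (not ParlAI code).
# All names below are hypothetical and exist only for illustration.

TEACHERS = {
    "Default": "original valid/test splits from the paper",
    "SaferDialoguesBAD": "valid/test splits parallel with the BAD dataset",
}


def resolve_teacher(task: str) -> str:
    """Return the teacher name for a 'task' or 'task:teacher' string."""
    _, _, teacher = task.partition(":")
    # With no explicit teacher, fall back to the default (original) teacher.
    name = teacher or "Default"
    if name not in TEACHERS:
        raise KeyError(f"unknown teacher: {name}")
    return name


for task in ("saferdialogues", "saferdialogues:SaferDialoguesBAD"):
    name = resolve_teacher(task)
    print(f"{task} -> {name}: {TEACHERS[name]}")
```

In this sketch, as in the PR description, only the valid and test splits differ between the two teachers; the train set is shared.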

klshuster

looks great just one q

klshuster · 2022-07-13T00:23:40Z

parlai/tasks/saferdialogues/agents.py

@@ -1,9 +1,10 @@
 #!/usr/bin/env python3

-# Copyright (c) Meta Platforms, Inc. and affiliates.
+# Copyright (c) Facebook, Inc. and its affiliates.


Do we need to update our copyrights, @stephenroller?

klshuster · 2022-07-13T00:26:38Z

parlai/tasks/saferdialogues/test/saferdialogues_test.yml

@@ -1,86 +1,66 @@
 acts:
 - - episode_done: true
    eval_labels:
-    - I agree with that, they're kids after all.
+    - i am also a fun maker


Why did the train/val/test values change in the original dataset?

Good catch. It was just shuffled, but it should be updated so that it's identical.
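One way to confirm that a split was only reordered, not changed, is an order-insensitive comparison. This is a minimal sketch, assuming each example can be reduced to a hashable (text, label) pair; the `same_examples` helper and the sample data are hypothetical and not taken from the dataset.

```python
from collections import Counter


def same_examples(old, new):
    """True if both splits hold the same examples, ignoring order.

    Counter compares multisets, so duplicated examples must also
    appear the same number of times in both splits.
    """
    return Counter(old) == Counter(new)


# Made-up (text, label) pairs standing in for real examples.
original = [("hello", "hi there"), ("bye", "see you"), ("hello", "hi there")]
shuffled = [("bye", "see you"), ("hello", "hi there"), ("hello", "hi there")]
changed = [("bye", "see you"), ("hello", "hi there")]

print(same_examples(original, shuffled))
print(same_examples(original, changed))
```

A check like this only shows that the contents match. A test fixture that records examples in order would still differ after a shuffle, which is why the data itself needed to be restored to the original order.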

stephenroller · 2022-07-15T04:47:53Z

Can you rebase/merge main in?

jxmsML · 2022-07-20T19:12:19Z

parlai/tasks/saferdialogues/build.py

@@ -10,13 +10,13 @@
 import parlai.core.build_data as build_data
 import os

-SAFERDIALOGUES_VERSION = "v0.1"
+SAFERDIALOGUES_VERSION = "v0.4"


LOL, what happened to versions 0.2 and 0.3?

jxmsML

LGTM!!! 🚀

meganung requested a review from jxmsML June 21, 2022 14:54

facebook-github-bot added the CLA Signed label Jun 21, 2022

meganung force-pushed the saferdialogues_bad_testset branch from 70875d8 to c458039 Compare June 27, 2022 19:18

meganung requested a review from klshuster July 7, 2022 17:52

jxmsML reviewed Jul 12, 2022

klshuster reviewed Jul 13, 2022

meganung added 9 commits July 15, 2022 10:57

add new test set parallel with bad dataset

da389ba

update docs

4161d65

valid and test set parallel to BAD and separate teacher for that

9d57468

add tests

a813995

remove unnecessary prints

77561f0

lint

ac88e01

lint

1576b31

original data is identical (was shuffled around)

0f47301

rebase and update

b8365cb

meganung force-pushed the saferdialogues_bad_testset branch from 77f513e to b8365cb Compare July 15, 2022 17:58

meganung requested review from stephenroller, jxmsML and klshuster July 15, 2022 19:46

jxmsML reviewed Jul 20, 2022

jxmsML approved these changes Jul 20, 2022

meganung merged commit bd8e01a into main Jul 26, 2022

meganung deleted the saferdialogues_bad_testset branch July 26, 2022 18:56


meganung commented Jun 21, 2022

jxmsML Jul 12, 2022

meganung Jul 12, 2022

klshuster left a comment

klshuster Jul 13, 2022

stephenroller Jul 15, 2022

klshuster Jul 13, 2022

meganung Jul 14, 2022

stephenroller commented Jul 15, 2022

jxmsML Jul 20, 2022

jxmsML left a comment
