This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

[TOD][Datasets][Easy] Taskmaster3 (TicketTalk) to TOD Structured format #4187

Merged 169 commits on Dec 23, 2021
e365e48
[TOD] Core conversation structure, serialization, const tokens
Nov 15, 2021
c939174
[Tod] Agents, teacher metrics, and tests for these
Nov 16, 2021
3bf655f
[TOD] Tod json structure to teacher task
Nov 16, 2021
6cb4b86
[TOD] Core conversation structure, serialization, const tokens
Nov 15, 2021
1480def
fix test by adding init folder
Nov 16, 2021
de84801
[Tod] Agents, teacher metrics, and tests for these
Nov 16, 2021
638eb28
[TOD] World, world metrics, script, tests
Nov 16, 2021
0e3f492
hmmm... hoping stacks don't bite me. (change that was kept in upper d…
Nov 16, 2021
0643a62
Merge branch 'simpler_tod_1_core_only' into simpler_tod_2_agents_teac…
Nov 16, 2021
37aced2
minor, remove commented out print
Nov 16, 2021
4f91279
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Nov 16, 2021
b05930f
comment
Nov 16, 2021
5086e85
more comment updates (not sure if it actually helps clarity..)
Nov 16, 2021
1e30035
Merge branch 'simpler_tod_3_world' into simpler_tod_4_tod_json
Nov 16, 2021
9a25fc5
[TOD][Dataset][Easy] Google SGD in TOD Conversations format
Nov 16, 2021
faa2356
[TOD][Dataset][Easyish] Google Simulation Splits
Nov 16, 2021
9426997
[TOD][Datasets][Easy] MetalWoz
Nov 16, 2021
9b2116c
[TOD][Datasets][Easy] MSR E2E into TOD Conversations format
Nov 16, 2021
61f7041
[TOD][Datasets][Easy] Multidogo -> TOD Conversations format
Nov 16, 2021
e8efc52
[TOD][Datasets][Easyish] MultiWoz V2.2 in Conversations Format
Nov 16, 2021
c17c3cc
lint
Nov 16, 2021
728c9bd
[TOD][Datasets][Easy] Taskmaster(1) in Conversations format
Nov 16, 2021
54890b4
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Nov 16, 2021
45e9259
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Nov 16, 2021
9284135
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Nov 16, 2021
5c226bd
[TOD][Datasets][Easy] Taskmaster2 to TOD Conversations format
Nov 16, 2021
50f1aaf
[TOD][Datasets][Easy] Taskmaster3 (TicketTalk) to TOD Structured format
Nov 16, 2021
51ed1a9
Merge branch 'main' into simpler_tod_1_core_only
Nov 16, 2021
a6508be
Merge branch 'simpler_tod_1_core_only' into simpler_tod_2_agents_teac…
Nov 16, 2021
eebc36b
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Nov 16, 2021
3675781
use same version of black as in the pre-commit hook
Nov 16, 2021
086c91c
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Nov 16, 2021
0bc961e
use same version of black as in the pre-commit hook
Nov 16, 2021
ed26407
Merge branch 'simpler_tod_3_world' into simpler_tod_4_tod_json
Nov 16, 2021
677df09
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Nov 16, 2021
24ee898
black with version from pre-commit hook
Nov 16, 2021
3ca7ae3
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Nov 16, 2021
3145e0e
Shouldn't worry about tod_json being in task_list
Nov 16, 2021
1b2a3fb
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Nov 16, 2021
f44b17b
add to task list; run lint with right version of black
Nov 16, 2021
43474c4
Merge branch 'simpler_tod_5a_google_sgd' into simpler_tod_5b_google_s…
Nov 16, 2021
d290ecd
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Nov 17, 2021
7c3ccf5
lint with right version
Nov 17, 2021
ab19cc2
lint
Nov 17, 2021
8c76a56
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Nov 17, 2021
724b255
add to task list
Nov 17, 2021
c79422c
lint right version
Nov 17, 2021
353fab1
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Nov 17, 2021
c86760c
add to task list
Nov 17, 2021
625e632
right lint
Nov 17, 2021
ebd09b4
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Nov 17, 2021
dcadadb
task_list
Nov 17, 2021
fff653f
right lint
Nov 17, 2021
406a84d
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Nov 17, 2021
bbc10ca
right lint
Nov 17, 2021
14e0f7d
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Nov 17, 2021
4b01553
right lint
Nov 17, 2021
66f2695
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Nov 17, 2021
f0b387f
task_list + remove unnecessary line
Nov 17, 2021
e8e366f
add init
Nov 17, 2021
e354c39
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Nov 17, 2021
f637a29
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Nov 17, 2021
ca898c5
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Nov 17, 2021
ded9298
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Nov 17, 2021
0354b94
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Nov 17, 2021
dfc4989
Merge branch 'main' into simpler_tod_2_agents_teachers
Nov 29, 2021
2f15448
address eric comments; add new readme + more documentation
Nov 30, 2021
abd1c7e
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Nov 30, 2021
5d0197d
minor wording change
Nov 30, 2021
39792a8
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Nov 30, 2021
76bfa89
add more documentation to world tests (following comment on teacher te…
Nov 30, 2021
73c5c7a
minor comment update
Nov 30, 2021
f6acccb
Merge branch 'simpler_tod_3_world' into simpler_tod_4_tod_json
Nov 30, 2021
dc4b70e
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Nov 30, 2021
1299b68
Merge branch 'simpler_tod_5a_google_sgd' into simpler_tod_5b_google_s…
Nov 30, 2021
58965d3
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5c_metalwoz
Nov 30, 2021
55aa3ca
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Nov 30, 2021
6517301
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Nov 30, 2021
c344740
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Nov 30, 2021
0b846d7
make build file less dumb; minor bug in agents.py
Nov 30, 2021
65d0a42
remove + rerun regression test data for multidogo
Dec 1, 2021
1ae8fc6
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 1, 2021
7f5c171
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 1, 2021
cefb41a
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 1, 2021
6e91da7
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 1, 2021
a1aba6a
see what happens if I bump up the build # (hoping tests work)
Dec 1, 2021
c9ef957
make the multidogo test not take forever
Dec 1, 2021
7ab9d70
update to respect actual count of episodes (I think this might have i…
Dec 1, 2021
c6c728d
Merge branch 'main' into simpler_tod_2_agents_teachers
Dec 1, 2021
b3283d0
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Dec 1, 2021
85ab0fd
Merge branch 'simpler_tod_3_world' into simpler_tod_4_tod_json
Dec 1, 2021
0969aa1
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Dec 1, 2021
1869cee
Merge branch 'simpler_tod_5a_google_sgd' into simpler_tod_5b_google_s…
Dec 1, 2021
609f930
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Dec 1, 2021
1da3858
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Dec 1, 2021
01f37d2
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 1, 2021
0e1251e
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 1, 2021
992980f
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 1, 2021
76ecf7a
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 1, 2021
a1d0451
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 1, 2021
0580ff0
Merge branch 'main' into simpler_tod_2_agents_teachers
Dec 2, 2021
e00accf
Merge branch 'simpler_tod_2_agents_teachers' into simpler_tod_3_world
Dec 2, 2021
701da8d
Merge branch 'simpler_tod_3_world' into simpler_tod_4_tod_json
Dec 2, 2021
d519dc2
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Dec 2, 2021
c7c1c64
Merge branch 'simpler_tod_5a_google_sgd' into simpler_tod_5b_google_s…
Dec 2, 2021
828f44f
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Dec 2, 2021
3c209ab
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Dec 2, 2021
564f5c8
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 2, 2021
00f2c5f
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 2, 2021
c746e70
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 2, 2021
81f2b51
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 2, 2021
82dcb1f
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 2, 2021
9466144
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
1392d99
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
77dccb7
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
98aa5f7
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
8291321
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
1d3d0c6
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
66958f2
regen after changing tod teacher logic to respect episode/examples le…
Dec 2, 2021
71b5af8
Merge branch 'simpler_tod_5a_google_sgd' into simpler_tod_5b_google_s…
Dec 2, 2021
9da65a6
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Dec 2, 2021
8956db7
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Dec 2, 2021
e946dff
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 2, 2021
47651f0
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 2, 2021
a3d75b6
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 2, 2021
346f686
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 2, 2021
a93523c
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 2, 2021
7b24acf
Merge branch 'main' into simpler_tod_3_world
Dec 18, 2021
e3fa063
Merge branch 'simpler_tod_3_world' into simpler_tod_4_tod_json
Dec 18, 2021
2384563
Merge branch 'simpler_tod_4_tod_json' into simpler_tod_5a_google_sgd
Dec 18, 2021
d9ba7e4
Merge branch 'main' into simpler_tod_5a_google_sgd
Dec 22, 2021
acd6ffe
not sure why this comment keeps not being merged correctly ugh...
Dec 22, 2021
a753a6d
Merge branch 'simpler_tod_5a_google_sgd' into simpler_tod_5b_google_s…
Dec 22, 2021
66d8bf8
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Dec 22, 2021
6900517
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Dec 22, 2021
235fbee
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 22, 2021
ef899c3
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 22, 2021
c9c8f3a
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 22, 2021
d8ca410
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 22, 2021
e42cee2
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 22, 2021
0f49cb5
noticed a difference in episode lengths between old version of this da…
Dec 22, 2021
0fb3ecb
Merge branch 'main' into simpler_tod_5b_google_sgd_sim_splits
Dec 22, 2021
c2c1fa6
Merge branch 'main' into simpler_tod_5g_taskmaster1
Dec 22, 2021
66e09ee
Merge branch 'simpler_tod_5b_google_sgd_sim_splits' into simpler_tod_…
Dec 22, 2021
f556628
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Dec 22, 2021
3e43ef9
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 22, 2021
a72fe98
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 22, 2021
f092c7f
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 22, 2021
0b6a8f7
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 22, 2021
feb93d6
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 22, 2021
adff949
regen after changing tod teacher logic to respect episode/examples le…
Dec 22, 2021
d724cd8
regen after changing tod teacher logic to respect episode/examples le…
Dec 22, 2021
b4e5c1f
regen after changing tod teacher logic to respect episode/examples le…
Dec 22, 2021
21a05d2
regen after changing tod teacher logic to respect episode/examples le…
Dec 22, 2021
e9ea6ac
regen after changing tod teacher logic to respect episode/examples le…
Dec 22, 2021
534b984
regen after changing tod teacher logic to respect episode/examples le…
Dec 22, 2021
89a58a3
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 22, 2021
b139888
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 22, 2021
f0ecc98
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 22, 2021
7c14c10
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 22, 2021
f6af363
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 22, 2021
00ae154
Merge branch 'main' into simpler_tod_5c_metalwoz
Dec 22, 2021
9816384
Merge branch 'simpler_tod_5c_metalwoz' into simpler_tod_5d_msr_e2e
Dec 22, 2021
1b4a0ca
Merge branch 'simpler_tod_5d_msr_e2e' into simpler_tod_5e_multidogo
Dec 22, 2021
69eeeff
Merge branch 'simpler_tod_5e_multidogo' into simpler_tod_5f_multiwoz_v22
Dec 22, 2021
7522f30
Merge branch 'simpler_tod_5f_multiwoz_v22' into simpler_tod_5g_taskma…
Dec 22, 2021
7d5c3d6
Merge branch 'simpler_tod_5g_taskmaster1' into simpler_tod_5h_taskmas…
Dec 22, 2021
71c4aac
Merge branch 'simpler_tod_5h_taskmaster2' into simpler_tod_5i_taskmas…
Dec 22, 2021
c863bec
Merge branch 'main' into simpler_tod_5i_taskmaster3
Dec 23, 2021
[TOD][Datasets][Easy] Taskmaster(1) in Conversations format
As the title says. I only include the System and UserSimulator teachers here, since that's all we need from this dataset right now.

Non-FB contributors made edits to the original version of Taskmaster, so the legacy teachers are kept around too.
---------------
Datasets added in this substack:
* Google SGD
* Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* **Taskmaster**
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Moya Chen committed Nov 16, 2021
commit 728c9bd7d012a920481337a7ca8b9bf5ac3aa93e
248 changes: 246 additions & 2 deletions parlai/tasks/taskmaster/agents.py
@@ -4,16 +4,260 @@
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

"""
Taskmaster-1 implementation for ParlAI.

Note that we have conversations structured both in the "TOD" format as well as those
from prior.
"""

from typing import Optional
from parlai.core.params import ParlaiParser
import os
import pandas as pd
from parlai.core.opt import Opt
from parlai.core.teachers import FixedDialogTeacher
import parlai.core.tod.tod_core as tod
from typing import Optional
from parlai.utils.data import DatatypeHelper
from parlai.utils.io import PathManager

import parlai.tasks.taskmaster.build as build_
import parlai.core.tod.tod_agents as tod_agents

# Following is for legacy format
from parlai.core.teachers import FixedDialogTeacher
from . import tm_utils
import json


################### TOD Conversation format

SILENCE_TOKEN = "__SILENCE__"

# Faster to copy/paste this than parse a json file
ONTOLOGY = {
    "uber": {
        "id": "uber_lyft",
        "vertical": "ride_booking",
        "required": ["location.from", "location.to", "type.ride", "num.people"],
        "optional": [
            "price.estimate",
            "duration.estimate",
            "time.pickup",
            "time.dropoff",
        ],
    },
    "movie": {
        "id": "movie_ticket",
        "vertical": "ticket_booking",
        "required": [
            "name.movie",
            "name.theater",
            "num.tickets",
            "time.start",
            "location.theater",
            "price.ticket",
        ],
        "optional": ["type.screening", "time.end", "time.duration"],
    },
    "restaurant": {
        "id": "restaurant_reservation",
        "vertical": "reservation",
        "required": [
            "name.restaurant",
            "name.reservation",
            "num.guests",
            "time.reservation",
        ],
        "optional": ["type.seating", "location.restaurant"],
    },
    "coffee": {
        "id": "coffee_ordering",
        "vertical": "coffee_order",
        "required": ["location.store", "name.drink", "size.drink"],
        "optional": ["num.drink", "type.milk", "preference"],
    },
    "pizza": {
        "id": "pizza_ordering",
        "vertical": "pizza_order",
        "required": ["name.store", "name.pizza", "size.pizza"],
        "optional": ["type.topping", "type.crust", "preference", "location.store"],
    },
    "auto": {
        "id": "auto_repair",
        "vertical": "appointment",
        "required": ["name.store", "name.customer", "date.appt", "time.appt"],
        "optional": ["reason.appt", "name.vehicle", "year.vehicle", "location.store"],
    },
}


class Taskmaster1Parser(tod_agents.TodStructuredDataParser):
    """
    Abstract data loader.
    """

    @classmethod
    def add_cmdline_args(
        cls, parser: ParlaiParser, partial_opt: Optional[Opt] = None
    ) -> ParlaiParser:
        parser = super().add_cmdline_args(parser, partial_opt)
        return parser

    def __init__(self, opt: Opt, shared=None):
        self.fold = DatatypeHelper.fold(opt["datatype"])
        opt["datafile"] = self.fold
        self.dpath = os.path.join(opt["datapath"], "taskmaster-1")
        if shared is None:
            build_.build(opt)
        super().__init__(opt, shared)

    def _load_data(self, fold):
        chunks = []
        with PathManager.open(os.path.join(self.dpath, f"self-dialogs.json")) as f:
            subset = pd.read_json(f)
            chunks.append(subset)
        with PathManager.open(os.path.join(self.dpath, f"woz-dialogs.json")) as f:
            subset = pd.read_json(f)
            chunks.append(subset)
        chunks = pd.concat(chunks, axis=0)
        # deterministic shuffle data for splits
        chunks = chunks.sample(frac=1.0, random_state=42)
        split_size = len(chunks) // 10
        if fold == "train":
            chunks = chunks[: split_size * 8]
        elif fold == "valid":
            chunks = chunks[split_size * 8 : split_size * 9]
        elif fold == "test":
            chunks = chunks[split_size * 9 :]
        return chunks, ONTOLOGY

    def _parse_segment_to_slots(self, segment_list):
        result = {}
        for segment in segment_list:
            slot_name = segment["annotations"][0]["name"]
            slot_value = segment["text"]
            prefix_split_idx = slot_name.find(".")
            api_name = slot_name[:prefix_split_idx]
            slot_name = slot_name[prefix_split_idx + 1 :]
            result[slot_name] = slot_value
            result[tod.STANDARD_API_NAME_SLOT] = api_name
        return result

    def _get_utterance_and_slots_for_speaker(self, speaker, utterances, idx):
        utts = []
        slots = {}
        while idx < len(utterances):
            here = utterances[idx]
            if here["speaker"] != speaker:
                break
            utts.append(here["text"])
            slots.update(self._parse_segment_to_slots(here.get("segments", [])))
            idx += 1
        return idx, "\n".join(utts), slots

    def _parse_to_api_schema(self, raw):
        """
        NOTE: Format of ontology in this is different from TM2 + TM3. Need to figure out which is relevant for the domain.
        """
        result = {}
        for key, val in raw.items():
            here = {}
            here[tod.STANDARD_API_NAME_SLOT] = val["id"]
            here[tod.STANDARD_REQUIRED_KEY] = val.get("required", [])
            here[tod.STANDARD_OPTIONAL_KEY] = val.get("optional", [])
            result[key] = here
        return result

    def _get_turns_from_parsed(self, user_utt, api_calls, api_resps, sys_utt):
        result = [
            tod.TodStructuredRound(
                user_utt=user_utt,
                api_call_machine=api_calls,
                api_resp_machine=api_resps,
                sys_utt=sys_utt,
            )
        ]
        return result

    def setup_episodes(self, fold):
        """
        Parses into TodStructuredEpisode.
        """
        chunks, api_schema_raw = self._load_data(fold)
        api_schemas_machine = self._parse_to_api_schema(api_schema_raw)
        episodes = []
        for _, row in chunks.iterrows():
            utterances = row["utterances"][:]
            if not all(
                [
                    x.get("speaker") == "ASSISTANT" or x.get("speaker") == "USER"
                    for x in utterances
                ]
            ):
                # there's an example or two that causes things to infinite loop. >.>
                continue
            idx = 0
            rounds = []
            goal_calls = []
            if len(utterances) > 0 and utterances[0]["speaker"] == "ASSISTANT":
                (idx, sys_utt, _,) = self._get_utterance_and_slots_for_speaker(
                    "ASSISTANT", utterances, idx
                )

                turns = self._get_turns_from_parsed(SILENCE_TOKEN, {}, {}, sys_utt)
                for t in turns:
                    rounds.append(t)

            while idx < len(utterances):
                (
                    idx,
                    user_utt,
                    user_slots,
                ) = self._get_utterance_and_slots_for_speaker("USER", utterances, idx)
                (
                    idx,
                    sys_utt,
                    system_slots,
                ) = self._get_utterance_and_slots_for_speaker(
                    "ASSISTANT", utterances, idx
                )
                # The annotations in this dataset don't make sense as api responses but... we'll just roll.
                turns = self._get_turns_from_parsed(
                    user_utt, user_slots, system_slots, sys_utt
                )
                for t in turns:
                    rounds.append(t)
            apis = []
            for candidate_api in api_schemas_machine:
                if candidate_api in row["instruction_id"]:
                    apis.append(api_schemas_machine[candidate_api])
            episode = tod.TodStructuredEpisode(
                api_schemas_machine=apis,
                goal_calls_machine=goal_calls,
                rounds=rounds,
                delex=self.opt.get("delex", False),
            )
            episodes.append(episode)
        return episodes

    def get_id_task_prefix(self):
        return "Taskmaster1"

    def _label_fold(self, chunks):
        return chunks.conversation_id.apply(self._h)
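
The slot parsing and speaker grouping in `Taskmaster1Parser` can be tried outside ParlAI with a small sketch. The version below is an illustration, not the code from this commit: the function names, the `API_NAME_SLOT` constant (a stand-in for `tod.STANDARD_API_NAME_SLOT`, which the fixtures in this PR render as `api_name`), and the sample utterances are all assumptions. It also uses `str.partition` where the diff uses `str.find`, so it behaves differently for annotation names without a `.`. The last call shows why `setup_episodes` skips dialogues with unexpected speakers: a speaker that is neither `USER` nor `ASSISTANT` advances `idx` in neither pass, so the `while` loop would never finish.

```python
from typing import Dict, List, Tuple

# Assumption: stand-in for tod.STANDARD_API_NAME_SLOT.
API_NAME_SLOT = "api_name"


def parse_segments_to_slots(segments: List[dict]) -> Dict[str, str]:
    """Split each annotation name at its first '.' into API name and slot name."""
    result: Dict[str, str] = {}
    for segment in segments:
        full_name = segment["annotations"][0]["name"]
        api_name, _, slot_name = full_name.partition(".")
        result[slot_name] = segment["text"]
        result[API_NAME_SLOT] = api_name
    return result


def collect_speaker_run(
    speaker: str, utterances: List[dict], idx: int
) -> Tuple[int, str, Dict[str, str]]:
    """Merge consecutive utterances by `speaker`, starting at `idx`."""
    texts: List[str] = []
    slots: Dict[str, str] = {}
    while idx < len(utterances) and utterances[idx]["speaker"] == speaker:
        here = utterances[idx]
        texts.append(here["text"])
        slots.update(parse_segments_to_slots(here.get("segments", [])))
        idx += 1
    return idx, "\n".join(texts), slots


# Hypothetical dialogue in the shape the parser reads.
sample_utterances = [
    {"speaker": "USER", "text": "Hi, I need a ride."},
    {
        "speaker": "USER",
        "text": "From the airport.",
        "segments": [
            {
                "text": "the airport",
                "annotations": [{"name": "uber_lyft.location.from.accept"}],
            }
        ],
    },
    {"speaker": "ASSISTANT", "text": "Where to?"},
]

next_idx, user_text, user_slots = collect_speaker_run("USER", sample_utterances, 0)

# An unexpected speaker label leaves the index where it was.
stalled = collect_speaker_run("USER", [{"speaker": "OTHER", "text": "x"}], 0)
```

Here `next_idx` points at the first `ASSISTANT` utterance, and `stalled` returns index `0` with empty text and slots.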


class SystemTeacher(Taskmaster1Parser, tod_agents.TodSystemTeacher):
    pass


class UserSimulatorTeacher(Taskmaster1Parser, tod_agents.TodUserSimulatorTeacher):
    pass
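
The 80/10/10 split that `_load_data` applies before these teachers see any data can be sketched without pandas. This is an assumption-laden illustration: `split_fold` is a hypothetical helper, and it shuffles with the standard library's `random.Random(seed)` rather than `DataFrame.sample(frac=1.0, random_state=42)`, so it will not reproduce the same ordering as the diff. It only shows the slicing arithmetic: `split_size` is a tenth of the data rounded down, and the test fold absorbs the remainder.

```python
import random
from typing import List, Sequence


def split_fold(items: Sequence, fold: str, seed: int = 42) -> List:
    """Deterministically shuffle, then slice into train/valid/test (8:1:1)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    split_size = len(shuffled) // 10
    if fold == "train":
        return shuffled[: split_size * 8]
    if fold == "valid":
        return shuffled[split_size * 8 : split_size * 9]
    if fold == "test":
        # Everything left over, including the remainder of the division.
        return shuffled[split_size * 9 :]
    # Any other fold name returns the full shuffled data, as in the diff.
    return shuffled


# Hypothetical conversation ids.
conversation_ids = list(range(103))
train_ids = split_fold(conversation_ids, "train")
valid_ids = split_fold(conversation_ids, "valid")
test_ids = split_fold(conversation_ids, "test")
```

Because every call reshuffles with the same seed, the three folds are disjoint and together cover all items.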


############ Legacy defined teachers


class SelfDialogueTeacher(FixedDialogTeacher):
    """
    Teacher for written two-person dialogues with labels being responses for the
8 changes: 8 additions & 0 deletions parlai/tasks/taskmaster/test.py
@@ -21,3 +21,11 @@ class TestWozDialogueTeacher(AutoTeacherTest):

class TestSelfDialogueSegmentTeacher(AutoTeacherTest):
    task = "taskmaster:self_dialogue_segment"


class TestSystemTeacher(AutoTeacherTest):
    task = "taskmaster:SystemTeacher"


class TestUserSimulatorTeacher(AutoTeacherTest):
    task = "taskmaster:UserSimulatorTeacher"
45 changes: 45 additions & 0 deletions parlai/tasks/taskmaster/test/taskmaster_SystemTeacher_test.yml
@@ -0,0 +1,45 @@
acts:
- - domain: ''
    episode_done: false
    eval_labels:
    - 'APIS: '
    id: Taskmaster1_SystemTeacher
    slots: {}
    text: 'APIS: '
    type: 'APIS: '
- - domain: ''
    episode_done: false
    eval_labels:
    - 'APICALL: '
    id: Taskmaster1_SystemTeacher
    slots: {}
    text: 'USER: __SILENCE__'
    type: 'APICALL: '
- - domain: ''
    episode_done: false
    eval_labels:
    - 'SYSTEM: hey there, how can i help you?'
    id: Taskmaster1_SystemTeacher
    slots: {}
    text: 'APIRESP: '
    type: 'SYSTEM: '
- - domain: ''
    episode_done: false
    eval_labels:
    - 'APICALL: '
    id: Taskmaster1_SystemTeacher
    slots: {}
    text: 'USER: Hi. I want to ride Uber in car.'
    type: 'APICALL: '
- - domain: ''
    episode_done: false
    eval_labels:
    - 'SYSTEM: sure, where are you heading to?

      where is your destination?'
    id: Taskmaster1_SystemTeacher
    slots: {}
    text: 'APIRESP: '
    type: 'SYSTEM: '
num_episodes: 1326
num_examples: 33198
49 changes: 49 additions & 0 deletions parlai/tasks/taskmaster/test/taskmaster_SystemTeacher_train.yml
@@ -0,0 +1,49 @@
acts:
- - domain: ''
    episode_done: false
    id: Taskmaster1_SystemTeacher
    labels:
    - 'APIS: '
    slots: {}
    text: 'APIS: '
    type: 'APIS: '
- - domain: ''
    episode_done: false
    id: Taskmaster1_SystemTeacher
    labels:
    - 'APICALL: '
    slots: {}
    text: 'USER: __SILENCE__'
    type: 'APICALL: '
- - domain: ''
    episode_done: false
    id: Taskmaster1_SystemTeacher
    labels:
    - 'SYSTEM: hi, how can i assist you?'
    slots: {}
    text: 'APIRESP: '
    type: 'SYSTEM: '
- - domain: ''
    episode_done: false
    id: Taskmaster1_SystemTeacher
    labels:
    - 'APICALL: api_name = uber_lyft ; location.from.accept = the Wichita Dwight D.
      Eisenhower National Airport ; location.to.accept = the Wichita State University
      campus'
    slots:
      api_name: uber_lyft
      location.from.accept: the Wichita Dwight D. Eisenhower National Airport
      location.to.accept: the Wichita State University campus
    text: 'USER: Hi, I need an Uber pickup from the Wichita Dwight D. Eisenhower National
      Airport to the Wichita State University campus.'
    type: 'APICALL: '
- - domain: ''
    episode_done: false
    id: Taskmaster1_SystemTeacher
    labels:
    - 'SYSTEM: what kind of ride would you like?'
    slots: {}
    text: 'APIRESP: '
    type: 'SYSTEM: '
num_episodes: 10555
num_examples: 262021
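
The `APICALL:` labels in these fixtures are flat `key = value ; key = value` strings that mirror the `slots` mapping of the same act. As a rough way to check that correspondence by hand, the sketch below turns such a label back into a dictionary. `parse_api_call` is a hypothetical helper written for this note; it is not the deserializer in `parlai.core.tod`, and it assumes the ` ; ` and ` = ` separators seen above never occur inside a slot value.

```python
from typing import Dict


def parse_api_call(label: str, prefix: str = "APICALL: ") -> Dict[str, str]:
    """Parse 'APICALL: k1 = v1 ; k2 = v2' into {'k1': 'v1', 'k2': 'v2'}."""
    if not label.startswith(prefix):
        raise ValueError(f"expected label to start with {prefix!r}")
    body = label[len(prefix):].strip()
    if not body:
        # An empty call, as in the first turns of the fixtures.
        return {}
    slots: Dict[str, str] = {}
    for pair in body.split(" ; "):
        key, _, value = pair.partition(" = ")
        slots[key.strip()] = value.strip()
    return slots


# Label copied from the train fixture, with the YAML line folds joined by spaces.
train_label = (
    "APICALL: api_name = uber_lyft ; "
    "location.from.accept = the Wichita Dwight D. Eisenhower National Airport ; "
    "location.to.accept = the Wichita State University campus"
)
parsed_slots = parse_api_call(train_label)
empty_slots = parse_api_call("APICALL: ")
```

For the train fixture label, `parsed_slots` matches the `slots` block of that act key for key.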