This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

[TOD][Datasets][Easy] Taskmaster3 (TicketTalk) to TOD Structured format #4187

Merged
merged 169 commits into from
Dec 23, 2021

Conversation

moyapchen
Contributor

Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

Sooo many datasets.


Datasets added in this substack:

  • Google SGD
    • Google SGD Simulation Splits (In-domain, Out-domain)
  • MetalWoz
  • MSR_E2E
  • Multidogo
  • MultiWoz V2.2
  • Taskmaster
  • Taskmaster2
  • Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
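
For anyone new to the stack, here is a rough sketch of how the two teacher views mentioned above differ. This is not the ParlAI implementation: the turn layout, the function names (`system_examples`, `user_simulator_examples`), and the goal string are all made up for illustration. The only idea it shows is that a System-style teacher learns system replies from user utterances, while a UserSimulator-style teacher learns user utterances from the goal or the previous system reply.

```python
from typing import Dict, List, Tuple

# Hypothetical conversation shape: one dict per round, holding a user
# utterance and the system reply to it. Not the real TOD format.
Round = Dict[str, str]
Example = Tuple[str, str]  # (input text, target label)


def system_examples(rounds: List[Round]) -> List[Example]:
    """System view: the input is the user utterance, the label is the system reply."""
    return [(r["user"], r["system"]) for r in rounds]


def user_simulator_examples(rounds: List[Round], goal: str) -> List[Example]:
    """User-simulator view: the input is the goal (first round) or the
    previous system reply, and the label is the user utterance."""
    examples: List[Example] = []
    previous = goal
    for r in rounds:
        examples.append((previous, r["user"]))
        previous = r["system"]
    return examples


conversation = [
    {"user": "I want two tickets for tonight.", "system": "Which movie?"},
    {"user": "The new sci-fi one.", "system": "Booked two tickets."},
]

for text, label in system_examples(conversation):
    print("system teacher:", repr(text), "->", repr(label))
for text, label in user_simulator_examples(conversation, "GOAL: book movie tickets"):
    print("user simulator teacher:", repr(text), "->", repr(label))
```

Both views come from the same conversation, which is why only these two teachers are needed per dataset here.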


Moya Chen added 27 commits November 15, 2021 20:15
See documentation block in `tod_agents.py`
As noted in the README, this agent takes data generated from `tod_world_script.py` and dumps it out to a teacher.

(Note: I tried setting up a regression test for this teacher, but I couldn't get it to save to an output directory that didn't include my local homedir name.)
See documentation block in `tod_agents.py`

(I'm not 100% sure `conftest.py` is the right file to change, though I did notice that `pytest.ini` was necessary to get pytest to run.)
See documentation in `tod_world_script.py` for usage.
Refactor Google SGD away from old format into TOD Conversations format.

Datasets added in this substack:
* *Google SGD*
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Code for processing Google SGD into In-domain and Out-domain data via `build.py`, for use via agents.

Datasets added in this substack:
* Google SGD
   * **Google SGD Simulation Splits (In-domain, Out-domain)**
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Code for processing MetalWoz into System + User Simulator teachers.

Getting it to be in the Conversations format is a pain, so I don't even try here. (It's documented this way in the paper as well)

----------------------------
Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* **MetalWoz**
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* **MSR_E2E**
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* **Multidogo**
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

There are so many versions of MultiWoz, but this one is closest to our simulator.

---------------------------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* **MultiWoz V2.2**
* Taskmaster
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

Non-FB people made edits to the original version of Taskmaster, so I'm keeping those teachers around too.

---------------
Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* **Taskmaster**
* Taskmaster2
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

There's a legacy implementation of Taskmaster2 here, but no one seems to be using it, so I'm clobbering it.

---------------------------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* **Taskmaster2**
* Taskmaster3 (TicketTalk)

Test plan:
Regression test, `parlai dd` of dataset
Title. I only include the System + UserSimulator Teachers here, since that's all we need from this dataset right now.

Sooo many datasets.

---------------------------------

Datasets added in this substack:
* Google SGD
   * Google SGD Simulation Splits (In-domain, Out-domain)
* MetalWoz
* MSR_E2E
* Multidogo
* MultiWoz V2.2
* Taskmaster
* Taskmaster2
* **Taskmaster3 (TicketTalk)**

Test plan:
Regression test, `parlai dd` of dataset
Moya Chen added 26 commits December 22, 2021 13:20
Base automatically changed from simpler_tod_5h_taskmaster2 to main December 23, 2021 02:46
@moyapchen moyapchen merged commit 177e97f into main Dec 23, 2021
@moyapchen moyapchen deleted the simpler_tod_5i_taskmaster3 branch December 23, 2021 02:49
3 participants