QA Data Collection Task #268

vaibhavad · 2020-09-26T10:07:44Z

Hi,

I want to port the QA data collection task from ParlAI to Mephisto.

Here a single MTurk worker writes question-answer pairs relevant to a given paragraph. Will this require different Mephisto blueprints?
In ParlAI there were tasks for loading data such as SQuAD and Wikipedia. Are there any guidelines on how to include the same functionality in Mephisto?

JackUrb · 2020-09-26T11:13:00Z

Hi @vaibhavad - you should certainly be able to do this using the existing blueprint. The example's README has some details about what this looks like.

In short, you'll be copying over the old worlds.py file and adding the required meta functions that Mephisto uses to build the world (including the step of loading the QA agent from shared in make_world). Then you'll make a run file and a config file that set the relevant parameters (specifying the task description, loading the model, passing the shared params to world_opt to be used in make_world, etc.).

All of this should likely live in a folder in ParlAI/parlai/crowdsourcing. In there should be a worlds file, a run file, and a configuration .yaml file in a conf folder.
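To make the shape of the worlds file concrete, here is a minimal, self-contained sketch. The `make_world` and `get_world_params` names follow the pattern of the parlai_chat_task_demo example as I understand it; `QADataCollectionWorld`, `StubAgent`, and the `"qa_agent"` key in `opt` are hypothetical stand-ins so the snippet runs without ParlAI or Mephisto installed. A real task would subclass the appropriate ParlAI world class and receive real agents instead.

```python
from typing import Any, Dict, List


class StubAgent:
    """Hypothetical stand-in for a ParlAI/Mephisto agent (not a real class)."""

    def __init__(self, name: str, replies: List[str]):
        self.name = name
        self._replies = list(replies)
        self.observed: List[Dict[str, Any]] = []

    def observe(self, msg: Dict[str, Any]) -> None:
        self.observed.append(msg)

    def act(self) -> Dict[str, Any]:
        text = self._replies.pop(0) if self._replies else ""
        return {"id": self.name, "text": text}


class QADataCollectionWorld:
    """One worker reads a paragraph, then writes a question and its answer."""

    def __init__(self, opt: Dict[str, Any], agents: List[StubAgent]):
        self.worker = agents[0]
        # Assumption: the shared QA agent is handed over through opt.
        self.qa_agent = opt["qa_agent"]
        self.collected: Dict[str, str] = {}
        self.turn = 0
        self._done = False

    def parley(self) -> None:
        if self.turn == 0:
            # Show the paragraph, then ask for a question.
            passage = self.qa_agent.act()
            self.collected["paragraph"] = passage["text"]
            self.worker.observe(passage)
            self.worker.observe({"id": "QACollector", "text": "Write a question."})
            self.collected["question"] = self.worker.act()["text"]
        else:
            # Ask for the answer and finish the episode.
            self.worker.observe({"id": "QACollector", "text": "Write the answer."})
            self.collected["answer"] = self.worker.act()["text"]
            self._done = True
        self.turn += 1

    def episode_done(self) -> bool:
        return self._done


def make_world(opt: Dict[str, Any], agents: List[StubAgent]) -> QADataCollectionWorld:
    """Meta function: build the world from options and connected agents."""
    return QADataCollectionWorld(opt, agents)


def get_world_params() -> Dict[str, int]:
    """Meta function: this task needs a single human worker."""
    return {"agent_count": 1}
```

The world drives the whole exchange from `parley`, so the run file only has to supply `opt` and the worker; the stub agents exist purely so the flow can be exercised locally.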

vaibhavad · 2020-09-26T13:16:43Z

Hi @JackUrb - Thanks a lot for the quick reply!

Following your advice, I have set up the example in my fork. I copied everything from the parlai_chat_task_demo example and changed conf/ and demo_worlds.py to fit the QA data collection task.

For some reason the first message from QACollector (the agent that provides the paragraph and prompts to the turker) is displayed twice. I have verified that the world's parley function is called only once. Any idea where this might be coming from?

JackUrb · 2020-09-26T13:19:38Z

Ah I thought we had worked this bug out already! I'll address this Monday but it's just a minor display bug on our end. If you refresh you'll likely see just one message.

vaibhavad · 2020-09-26T13:31:58Z

Thanks a lot!

vaibhavad · 2020-09-28T16:39:18Z

Strangely the issue went away on its own! :D

I have implemented the QA Collection Task (ignore the README.md) by defining a new dataloader configuration and using it as an agent in the worlds.py file. I have defined configurations for Wikipedia and SQuAD.
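Roughly, the idea of a dataloader configuration that doubles as an agent looks like the sketch below. This is not the code from my fork: `DataLoaderConfig`, `PassageLoaderAgent`, and the in-memory `PASSAGES` table are hypothetical names made up for illustration, and the two passages are placeholders standing in for data that the real SQuAD or Wikipedia loaders would supply.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Placeholder passages; a real task would pull these from the dataset loaders.
PASSAGES: Dict[str, List[str]] = {
    "squad": ["Placeholder SQuAD paragraph."],
    "wikipedia": ["Placeholder Wikipedia paragraph."],
}


@dataclass
class DataLoaderConfig:
    """Hypothetical config selecting which dataset feeds the task."""

    dataset: str = "squad"
    num_passages: int = 1


class PassageLoaderAgent:
    """Hypothetical agent that emits one passage per act() call."""

    def __init__(self, config: DataLoaderConfig):
        if config.dataset not in PASSAGES:
            raise ValueError(f"Unknown dataset: {config.dataset}")
        self.agent_id = "QACollector"
        self._passages = iter(PASSAGES[config.dataset][: config.num_passages])

    def act(self) -> Dict[str, Any]:
        # Return the next passage, or flag that the data is exhausted.
        text = next(self._passages, None)
        return {
            "id": self.agent_id,
            "text": text if text is not None else "",
            "episode_done": text is None,
        }
```

Switching datasets is then only a config change, for example `DataLoaderConfig(dataset="wikipedia")`, while the world keeps calling `act()` on whatever agent it was given.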

Let me know if you want to include this as one of the examples. I'll then document the code and create a README.

JackUrb · 2020-09-28T17:02:11Z

The double-message issue is intermittent, though I haven't been able to track down why it happens sometimes (even though I have a good idea on how to prevent it).

But besides that, this implementation is really mostly good to go! There's some code that I can offer some suggestions for improving on, though I think the most appropriate place to submit this is as part of the ParlAI repo (as it will eventually replace the version of the task currently there). If you open a PR adding this as a new task to parlai/crowdsourcing/tasks in the ParlAI repo I'd be happy to leave comments (and note some things that can be trimmed) over there!

JackUrb · 2020-11-10T16:34:38Z

Hi @vaibhavad - are you still interested in merging this in?

vaibhavad · 2020-11-10T18:46:50Z

Hi @JackUrb,

Thank you for all your help. I'll open a PR on parlai/crowdsourcing/tasks sometime later, as you suggested. Has the double-message issue been fixed? If not, can you outline how to prevent it?

JackUrb · 2020-11-10T19:04:47Z

Indeed - it was a regression, currently cleaning up a fix on the resolve-double-message branch. The fix will prevent the double send from the Mephisto backend side, so you shouldn't need to change anything.

vaibhavad · 2020-11-10T19:32:27Z

Thanks a lot, I'll close this!

EricMichaelSmith · 2020-12-17T20:14:18Z

Hi @vaibhavad - do you still have your code for this around somewhere? We're working to finish porting this task over to Mephisto now, and it'd be super useful to have your contribution :)

vaibhavad · 2020-12-18T09:52:46Z

Hi @EricMichaelSmith,

Yes, I'll be very happy to contribute. Will this go in parlai/crowdsourcing/tasks or mephisto/examples? Currently, I have opened a pull request (#344) for Mephisto. Let me know if I should open a PR in parlai instead.

EricMichaelSmith · 2020-12-18T13:48:23Z

Hi @vaibhavad - that's great to hear. I think it'd make more sense for it to go in parlai/crowdsourcing/tasks, yeah - we're trying to consolidate all of our ParlAI-related tasks there. If you open your PR in that repo, I'd be happy to review it!

vaibhavad · 2020-12-20T20:09:53Z

Opened a PR here

EricMichaelSmith · 2020-12-21T14:04:17Z

Great, thanks! Will take a look there

vaibhavad closed this as completed Nov 10, 2020
