
Updating to Mephisto 1.0 #4426

Merged: 28 commits into main from mephisto-1.0 on Mar 30, 2022

Conversation

@JackUrb (Contributor) commented Mar 16, 2022

Patch description
First set of steps to get the crowdsourcing tests running (no longer breaking the newer Mephisto conventions), but still not passing. More work is to be done there; leaving this open as a starting point for others to make comments and take over.

(@EricMichaelSmith : all crowdsourcing tests are passing now as of March 30th)

Testing steps

pytest tests/crowdsourcing

Review thread on these lines in the diff:

    max_num_tries = 6
    mock_worker_registration_name = f"MOCK_WORKER_{idx:d}"
    mock_worker_name = f"{mock_worker_registration_name}_sandbox"
    max_num_tries = 3
@JackUrb (Contributor, Author):

As an aside, retries should no longer be necessary with assert_sandbox_worker_created and await_channel_requests, which run the async loop until pending things are processed.
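
As a rough, standalone sketch of the idea behind those helpers (toy code under assumed names, not the actual ParlAI/Mephisto implementations): instead of polling an assertion inside a max_num_tries/sleep loop, drive the event loop until the pending registration work has been processed, then assert once.

    import asyncio

    registered_workers = set()

    async def register_worker(name: str) -> None:
        # Stand-in for an asynchronous registration request on the channel.
        await asyncio.sleep(0)
        registered_workers.add(name)

    async def main() -> None:
        # Kick off the registration, then await it instead of sleeping and retrying.
        await asyncio.gather(register_worker("MOCK_WORKER_0_sandbox"))
        # A single assertion, with no retry loop needed.
        assert "MOCK_WORKER_0_sandbox" in registered_workers

    asyncio.run(main())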

Reply (Contributor):

Just removed this retry loop without breaking the tests

@EricMichaelSmith (Contributor) commented:

@JackUrb Most of the crowdsourcing tests don't seem to be running currently due to ImportErrors - disabling the try/except blocks around them to see what's going on

@EricMichaelSmith (Contributor) left a comment:

Thanks for this PR and for doing these refactors - yeah, it looks like it's useful to have a few pieces of boilerplate code abstracted away, and not needing to do retries would help. I'm trying to get all 43 crowdsourcing tests to run to get a better sense of what needs to be done with this.

Resolved review threads (now outdated): parlai/crowdsourcing/utils/tests.py, tests/crowdsourcing/tasks/test_chat_demo.py
@EricMichaelSmith (Contributor) commented:

Okay, I got all 43 crowdsourcing checks to run so that we can debug them. @JackUrb 27 of them, the Fast-Acute ones, are currently failing with a "No live runs present" error due to no LiveTaskRuns being found when calling Operator.get_running_task_runs() within ParlAI's AbstractCrowdsourcingTest._get_live_run(). Do you know what might cause no task runs to be found?
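
For context, a minimal sketch of what a helper like AbstractCrowdsourcingTest._get_live_run() presumably does (the shape here is assumed, not copied from the ParlAI code; in particular, get_running_task_runs() is assumed to return a mapping of run ids to live runs):

    def _get_live_run(operator):
        # Ask Mephisto's Operator which task runs it currently considers live.
        live_runs = operator.get_running_task_runs()
        if len(live_runs) == 0:
            # If the blueprint failed to launch, nothing is running and we end up here.
            raise Exception('No live runs present')
        # The tests launch a single task, so return the one live run.
        return list(live_runs.values())[0]

So a blueprint that errors out at launch time would leave that mapping empty, which matches the "No live runs present" failures on the Fast-Acute tests.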

@JackUrb (Contributor, Author) commented Mar 17, 2022

Hadn't noticed that the blueprint was broken under the hood, leading to a blueprint launch error (and thus no running tasks); I can make a quick change for this. (Running locally identified the issue in the logs.)

@EricMichaelSmith (Contributor) commented Mar 17, 2022

> Hadn't noticed that the blueprint was broken under the hood, leading to a blueprint launch error (and thus no running tasks); I can make a quick change for this. (Running locally identified the issue in the logs.)

Great, thanks! Hmm, now I'm seeing an AgentTimeoutError in the CI check logs when cleaning up the test unit: https://app.circleci.com/pipelines/github/facebookresearch/ParlAI/11125/workflows/8f87deb8-70aa-47d3-86a7-dd9b474a900b/jobs/91214?invite=true#step-111-5068
The .wait() method of the agent here seems to be preventing the Fast-Acute checks from completing after erroring out.

@JackUrb (Contributor, Author) commented Mar 17, 2022

That would imply to me that the unit was still running when it was shut down, and thus the shutdown waited for the timeout. You may need to examine this locally to see what was and wasn't coming through the agent, as I'm unclear why this happened from the given info.

@EricMichaelSmith (Contributor) commented:

> That would imply to me that the unit was still running when it was shut down, and thus the shutdown waited for the timeout. You may need to examine this locally to see what was and wasn't coming through the agent, as I'm unclear why this happened from the given info.

Hmm, on my devfair, it looks like this issue with the unit being left hanging came from a ValueError due to the test unit not being registered in the data browser correctly:

self = <parlai.crowdsourcing.tasks.acute_eval.analysis.AcuteAnalyzer object at 0x7fc491f6e670>

    def _extract_to_dataframe(self) -> pd.DataFrame:
        """
        Extract the data from the run to a pandas dataframe.
        """
        units = self.mephisto_data_browser.get_units_for_task_name(self.run_id)
        responses: List[Dict[str, Any]] = []
        for unit in units:
            unit_details = self._parse_unit(unit)
            if unit_details is None:
                continue
            for idx in range(len(unit_details['data'])):
                response = self._extract_response_by_index(unit_details, idx)
                if response is not None:
                    responses.append(response)

        if len(responses) == 0:
>           raise ValueError('No valid results found!')
E           ValueError: No valid results found!

parlai/crowdsourcing/tasks/acute_eval/analysis.py:251: ValueError

So I suppose the question now is (1) whether the unit got saved in the data browser correctly, and if so, (2) why it's not being loaded back in

@JackUrb (Contributor, Author) commented Mar 17, 2022

> So I suppose the question now is (1) whether the unit got saved in the data browser correctly, and if so, (2) why it's not being loaded back in

My bet is that the unit has still not been marked as completed, which would happen in another thread (and I imagine that if this script is launched before the unit is completed, you won't get result data). I expect this is more likely (1) than (2). You'd likely want to dig into the TaskRunner to be sure that your TaskRunner's run_unit function completes.

Actually, this is likely it: we've changed the semantics so that live acts are now distinct from task submission. See the new StaticTaskRunner:

    def run_unit(self, unit: "Unit", agent: "Agent") -> None:
        """
        Static runners will get the task data, send it to the user, then
        wait for the agent to act (the data to be completed)
        """
        agent.await_submit(self.assignment_duration_in_seconds)
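
To make the implication for the tests concrete, here is a self-contained toy illustration of those await-submit semantics (toy classes and names, not Mephisto's real Agent API): if nothing ever submits, the runner blocks until the assignment duration elapses, the unit is never marked complete, and the analyzer later finds no valid results.

    import threading

    class ToyAgent:
        # Toy stand-in for an agent with submit-style semantics.
        def __init__(self):
            self._submitted = threading.Event()
            self.final_data = None

        def handle_submit(self, data):
            # Called when the (mock) worker submits the finished task.
            self.final_data = data
            self._submitted.set()

        def await_submit(self, timeout_s):
            # Blocks until a submission arrives or the assignment duration runs out.
            return self._submitted.wait(timeout=timeout_s)

    agent = ToyAgent()
    # In a test, something has to drive the submission; otherwise await_submit()
    # just waits out the timeout and the unit is left incomplete.
    threading.Timer(0.1, agent.handle_submit, args=({'choice': 'model_a'},)).start()
    assert agent.await_submit(timeout_s=2.0)
    assert agent.final_data is not None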

Resolved review thread (now outdated): .circleci/config.yml
@EricMichaelSmith marked this pull request as ready for review on March 30, 2022 at 13:43
@EricMichaelSmith (Contributor) left a comment:

All crowdsourcing tests seem to be passing now. No remaining issues that I can see.

@JackUrb merged commit c946fb3 into main on Mar 30, 2022
@JackUrb deleted the mephisto-1.0 branch on March 30, 2022 at 14:06