Create Per-Turn Evaluation Folder in ParlAI #4323
Yeah looks great, thanks for adding this @Rebecca-Qian! So I know I have a ton of comments - most of these are little peculiarities of the code that you would have had no way of knowing. And, before merging, it'll be useful to test out that compile_results.py script on dummy data so we know that it works correctly =) And don't worry about the unittests_osx check failing - if you look at https://github.com/facebookresearch/ParlAI/commits/main you'll see that that test has been failing for a week now =/
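One way to approach the dummy-data check suggested above, as a rough sketch only: build a handful of fake per-turn pairwise records with a known outcome, compute win rates, and compare against the values worked out by hand. The record fields (model_a, model_b, winner) and the win_rates helper are hypothetical and are not taken from compile_results.py; the real script's input format and entry point would need to be substituted in.

```python
from collections import Counter

# Hypothetical dummy records: each row is one turn in which a worker picked
# the response of one of two models. Field names are illustrative only and
# are NOT taken from compile_results.py.
DUMMY_TURNS = [
    {"model_a": "model_1", "model_b": "model_2", "winner": "model_1"},
    {"model_a": "model_1", "model_b": "model_2", "winner": "model_2"},
    {"model_a": "model_1", "model_b": "model_2", "winner": "model_1"},
    {"model_a": "model_1", "model_b": "model_2", "winner": "model_1"},
]


def win_rates(turns):
    """Return the fraction of turns won by each model (hypothetical helper)."""
    if not turns:
        return {}
    wins = Counter(turn["winner"] for turn in turns)
    # Collect every model that appears, so a model with zero wins is still reported.
    models = {t["model_a"] for t in turns} | {t["model_b"] for t in turns}
    total = len(turns)
    return {model: wins[model] / total for model in sorted(models)}


rates = win_rates(DUMMY_TURNS)
print(rates)
```

Because the dummy data is small enough to tally by hand, any mismatch between the printed rates and the expected split points at the aggregation logic rather than at the data.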
Resolved review threads (outdated) on:
.../crowdsourcing/tasks/pairwise_per_turn_eval/hydra_configs/conf/example_model_comparison.yaml (3 threads)
parlai/crowdsourcing/tasks/pairwise_per_turn_eval/task_config/model_opts.yaml
parlai/crowdsourcing/tasks/pairwise_per_turn_eval/task_config/onboard_task_data.json
parlai/crowdsourcing/tasks/pairwise_per_turn_eval/task_config/onboard_task_data__humanness.json
parlai/crowdsourcing/tasks/pairwise_per_turn_eval/task_config/task_description.html
@@ -0,0 +1,29 @@
# Per-turn Evaluation Crowdsourcing Task |
Oh ha, almost forgot - would be good to link to the paper itself at the top of this README :P I'll be doing that too with my SM-Turn README
Include a bibtex too please
Patch description
This PR creates the initial open-sourcing of the Per-Turn Evaluation project from the parlai-internal repo. The PR creates a new pairwise_per_turn_eval directory in parlai/crowdsourcing/tasks. Refactors and further cleanup will come in a separate PR.
File structure
analysis/: Analysis code for compiling per-turn evaluation experiment results.
frontend/: All task UX components used to render the chat task, including onboarding and error panes.
hydra_configs/conf/: Path to Hydra configs used to specify individual experiment run parameters. Initialized with an example example_model_comparison.yaml file. Some task-specific parameters have been removed or stubbed out.
task_config/: Task data and configs, e.g. onboarding data, JSON configs.
README.md: Shortened README giving an overview of the per-turn eval project.
bot_agent.py
impl.py
per_turn_eval_blueprint.py
run.py
utils.py
worlds.py
Testing steps
Ran HITs end-to-end in MTurk sandbox to verify correctness.
Onboarding:
Chat task:
Analysis script run: