Skip to content
This repository was archived by the owner on Nov 3, 2023. It is now read-only.

Update wizard_of_wikipedia agents.py #3585

Merged
merged 5 commits into from
Apr 17, 2021
Merged

Update wizard_of_wikipedia agents.py #3585

merged 5 commits into from
Apr 17, 2021

Conversation

Reveyer
Copy link
Contributor

@Reveyer Reveyer commented Apr 13, 2021

Patch description

When wizard_first is True, the len_episode is incorrectly determined, resulting in some data not being used.

Testing steps

I tested it using command

parlai display_data --task wizard_of_wikipedia --num-examples 100000

Before the change

loaded 18430 episodes with a total of 74092 examples

After the change

loaded 18430 episodes with a total of 83247 examples

Other information

When wizard_first, the len_episode is incorrectly determined, resulting in some data not being used.
@facebook-github-bot
Copy link

Hi @Reveyer!

Thank you for your pull request and welcome to our community.

Action Required

In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.

Process

In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (eg your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.

Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged with CLA signed. The tagging process may take up to 1 hour after signing. Please give it that time before contacting us about it.

If you have received this in error or have any questions, please contact us at cla@fb.com. Thanks!

@Reveyer Reveyer changed the title Update agents.py Update wizard_of_wikipedia agents.py Apr 13, 2021
@facebook-github-bot
Copy link

Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks!

@klshuster
Copy link
Contributor

Hi @Reveyer, thank you for flagging this bug! In the interest of reproducibility, could you please make some minor modifications to the PR:

  1. Create a subclass of WizardDialogKnowledgeTeacher, perhaps named LongTeacher, that overrides len_episode to fix the bug.
  2. Somewhere in the __init__ for the WizardDialogKnowledgeTeacher, please loudly warn that it is advised to use -t wizard_of_wikipedia:long to include the missing examples from the default teacher.

Thanks again!

@klshuster
Copy link
Contributor

Additionally, it would be great if we could limit this change to --datatype train (so, valid and test sets remain the same)

Add flag `--add-missing-turns`  for the previous work that can be rewired, and the ability to add missing data for the training set or the entire set.
@Reveyer
Copy link
Contributor Author

Reveyer commented Apr 14, 2021

Additionally, it would be great if we could limit this change to --datatype train (so, valid and test sets remain the same)

I think adding an argument would better achieve what you said.

Copy link
Contributor

@klshuster klshuster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you! yes, an argument works great here

Copy link
Contributor

@stephenroller stephenroller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greatly appreciated

Co-authored-by: Stephen Roller <roller@fb.com>
@stephenroller stephenroller merged commit 36bc4ae into facebookresearch:master Apr 17, 2021
@Reveyer Reveyer deleted the patch-2 branch April 17, 2021 05:12
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants