Fix some bugs in test case prompts/ground truths #608
5ad4947 to ccd28be
Hi @aw632,
Thanks for the PR!
Regarding simple_5: the question did not specify which kind of roots it is looking for, so the default value "real" should be used. Thus, the possible answer should be ["", "real"].
The same applies to multiple_88.
The rest looks good to me.
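To illustrate the point about ["", "real"], here is a minimal sketch of how a checker might treat such a list of possible answers. It assumes that an empty string in the list means the parameter may be omitted, so the function's default applies. The helper `param_matches` and the parameter name `root_type` are hypothetical and are not taken from the project's actual evaluation code.

```python
def param_matches(call_args: dict, name: str, possible: list) -> bool:
    """Return True if the model's value for `name` is an accepted answer."""
    if name not in call_args:
        # Assumption: "" in the accepted list means "parameter may be omitted".
        return "" in possible
    return call_args[name] in possible


# Hypothetical accepted answers for the root-kind parameter of simple_5.
accepted_root_type = ["", "real"]

# Parameter omitted, so the default "real" applies: accepted.
omitted_ok = param_matches({"a": 1, "b": -3, "c": 2}, "root_type", accepted_root_type)

# Parameter given explicitly as the default value: accepted.
explicit_ok = param_matches({"a": 1, "b": -3, "c": 2, "root_type": "real"}, "root_type", accepted_root_type)

# Parameter given as some other value: rejected.
other_ok = param_matches({"a": 1, "b": -3, "c": 2, "root_type": "all"}, "root_type", accepted_root_type)
```

Under this reading, a model that leaves the parameter out and a model that passes "real" explicitly are both scored as correct, which is why both entries belong in the list when the prompt does not specify the kind of roots.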
Hi, thanks for your comment. W.r.t
Fair point. I will update the prompt to eliminate ambiguity.
- Total number of entries affected: 21
- Simple: 16 entries
simple_5, simple_13, simple_96, simple_122, simple_156, simple_183, simple_235, simple_238, simple_267, simple_308, simple_309, simple_316, simple_375, simple_379, simple_389, simple_398
- Multiple: 5 entries
multiple_74, multiple_88, multiple_119, multiple_153, multiple_183
Thank you for the PR @aw632 and welcome!
Summary of the changes:
This PR does change the leaderboard values; the leaderboard will be updated in a separate PR.