Vs/fix crows pairs duplicate UUID #789

Closed
wants to merge 22 commits into from

VictorSanh
Member

Fixing a duplicate UUID in crows pairs.
cc @oskarvanderwal #781
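
For context, a minimal sketch of how a duplicated template ID could be spotted before it lands. It assumes each template in a `templates.yaml` file carries its identifier on a line of the form `id: <uuid>`; the helper names `find_duplicate_ids` and `fresh_id` are hypothetical and not part of promptsource.

```python
import re
import uuid
from collections import defaultdict

# Assumption: template identifiers appear on their own line as "id: <value>".
ID_LINE = re.compile(r"^\s*id:\s*(\S+)\s*$")


def find_duplicate_ids(yaml_text):
    """Return {id: [line numbers]} for every id value that occurs more than once."""
    seen = defaultdict(list)
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        match = ID_LINE.match(line)
        if match:
            seen[match.group(1)].append(lineno)
    return {value: lines for value, lines in seen.items() if len(lines) > 1}


def fresh_id():
    """Generate a new random UUID string to replace a duplicated one."""
    return str(uuid.uuid4())
```

Running `find_duplicate_ids` over the text of each templates file and replacing one of the reported occurrences with `fresh_id()` would be one way to resolve a clash of this kind.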

RosenZhang and others added 22 commits April 27, 2022 12:07
* Add GEM/xsum prompts

* uncommit this hack

* Add GEM in INCLUDED_USERS

Co-authored-by: Albert Webson <awebson@cs.brown.edu>
* First 2 prompts for blimp

* Fixed some template issues that led to conflicting details

* Updated templates to avoid packing errors

* Added 2 new prompts with a different style of asking question

* Add Najoung's templates

* Added prompts for complex_NP_island

* Updated the templates after the requested fixes

* Added templates for complex_NP_island via copy scripts

* Updated templates for all subsets

* Switched to randomized prompts.

* Minor changes to prompt names

* Fix answer_choices for some templates + better template names

* Fixed extra spaces.

* Added null prompts (true null and single quotation mark versions)

* Minor cleanup

* Prompt cleanups: true null prompt, choice randomization + dropping order swapped prompts.

Co-authored-by: cookie <nk>
Co-authored-by: najoungkim <najoungk@gmail.com>
* adding GEM/webnlg prompts and Russian prompts, so can use challenge sets.


Adding Russian versions of Prompts.
…740)

* Adding GEM Simplification prompts. To include challenge sets, and additional prompts for Turk & ASSET.

Adding Multiple Prompts + PALM , commenting out non-original tasks.
* add prompts for lince sa task

* add period, fix answer choices, add more prompts with diverse phrasing
* Add prompts for BioASQ task b

Following [BLURB](https://microsoft.github.io/BLURB/) benchmark we use
task 7b and only yes/no question in that specifically to be comparable
to other models.

We use prompts similar to BoolQ prompts

* Fix cache inconsistency and expanduser by default
* add username to INCLUDED_USERS list

* add 5 prompts

* change prompt name

* add choices_in_prompt and fix minor issues

* add labels and fix a prompt

* title case answer choices

* fix grammatical error

* add other metrics

* remove file

* fix issues

* Update templates.py

* Update templates.py

* set choices_in_prompt to false

* add answer_choices in the prompts

* fix minor issues

* make prompts more natural
* templates for tweets_hate_speech

* Update templates.yaml

* Create templates.yaml

* deleting tweets template

* adding crd3 prompts

* Update templates.py

* putting instructions at the top

Co-authored-by: Shanya Sharma - s0s0cr3 <Shanya.Sharma@walmartlabs.com>
Co-authored-by: Stephen Bach <stephenhbach@gmail.com>
* added first templates for DiaBLa dataset rbawden/DiaBLa

* declared previous_ref at beginning of templates

* declared more variables at beginning of templates

* declared more variables at beginning of templates

* declared more variables at beginning of templates

* declared more variables at beginning of templates

* corrected ref for mt in one template

* corrected ref for mt in one template

* moved condition to just target side rather than around entire prompt

* allow template even when no context (beginning of dialogue)

* corrected templates that use past history

* corrected multiple choice answer field

* updated templates - simplified some targets and added translation completion tasks (simpler?)

* corrected duplicate name

* updated duplicate definition

* corrected error of only two pipes in answer choices

* corrected -2 index to -1 - duplicate definitions

* merged with eval-hackathon updates

* corrected discriminate mt ref

* removed directional templates and only keep both directions (analysis to be done separately later). Also some slight modifications such as removing excess words and quotes

* simplified names, changed random to choice

Co-authored-by: Rachel Bawden <rachel.bawden@inria.fr@users.noreply.github.com>
Co-authored-by: Stephen Bach <stephenhbach@gmail.com>
* add original prompts

* update metrics
* Add templates for schema_guided_dstc8 response generation

* Remove extra newlines at the end of targets for schema_guided_dstc8

* Accelerate `get_infos` by caching the `DataseInfoDict`s (#778)

* accelerate `get_infos` by caching the `DataseInfoDict`s

* quality

* consistency

* Revert changes to app.py.

* Update promptsource/app.py

Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: Stephen Bach <stephenhbach@gmail.com>
* Added prompts for English crows_pairs_multilingual

* Added prompts for English crows_pairs_multilingual minor change

* Added prompts for English crows_pairs_multilingual minor change

* Added prompts for English crows_pairs_multilingual change target label

* Added prompts for English crows_pairs_multilingual fix target

* Added prompts for English crows_pairs_multilingual added A. prompts

* Added prompts for French crows_pairs_multilingual added A. prompts

* Change crows_pairs_multilingual metric to Accuracy

* Added randomness to CrowsPairsMultilingual prompts choice order+integrated other suggestions

* Fixed removed newlines from prompts

* Adding extra prompts for CrowS-Pairs French

* Update templates.py

* Indicate which prompts are reflecting the original task

* Moved CrowS-Pairs-Multilingual to Bias WG organisation

* Accelerate `get_infos` by caching the `DataseInfoDict`s (#778)

* accelerate `get_infos` by caching the `DataseInfoDict`s

* quality

* consistency

Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: J Forde <jzf2101@users.noreply.github.com>
* Added prompts for English crows_pairs_multilingual

* Added prompts for English crows_pairs_multilingual minor change

* Added prompts for English crows_pairs_multilingual minor change

* Added prompts for English crows_pairs_multilingual change target label

* Added prompts for English crows_pairs_multilingual fix target

* Added prompts for English crows_pairs_multilingual added A. prompts

* Added prompts for French crows_pairs_multilingual added A. prompts

* Change crows_pairs_multilingual metric to Accuracy

* Added randomness to CrowsPairsMultilingual prompts choice order+integrated other suggestions

* Fixed removed newlines from prompts

* Adding extra prompts for CrowS-Pairs French

* Update templates.py

* Indicate which prompts are reflecting the original task

* Moved CrowS-Pairs-Multilingual to Bias WG organisation

* Accelerate `get_infos` by caching the `DataseInfoDict`s (#778)

* accelerate `get_infos` by caching the `DataseInfoDict`s

* quality

* consistency

* Make targets one-token answers

* Make targets one-token answers for FR

Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: J Forde <jzf2101@users.noreply.github.com>
* add arabic prompts

* add one more original task prompt

* fix minor issue

* fix minor issues

* remove unnecessary space

* update
* add english prompts for GEM/wikilingua (18 languages)

* (typo) add missing target to 'summarize_above_ar'

* (fix) update target formatting

* make language more explicit in prompts

* fix typo

Co-authored-by: Stephen Bach <stephenhbach@gmail.com>
* handle blank results

* unify how we detect blank results
* fixing templates.py for line length issue

* make style

Co-authored-by: Victor Sanh <victorsanh@gmail.com>
@VictorSanh
Member Author

mistake - wrong branch - deleting this PR

@VictorSanh VictorSanh closed this Jun 25, 2022
@VictorSanh VictorSanh deleted the vs/fix_crows_pairs_duplicate_uuid branch June 25, 2022 21:14