Vs/fix crows pairs duplicate UUID #789

VictorSanh · 2022-06-25T19:13:05Z

Fixing a duplicate UUID in crows pairs.
cc @oskarvanderwal #781
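
For context, a duplicate id of this kind can be caught with a small check before merging. The snippet below is only a minimal sketch, not promptsource's own code: the `find_duplicate_ids` helper, the `(name, id)` pair layout, and the sample names and ids are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_duplicate_ids(templates: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return {id: [template names]} for every id shared by more than one template.

    `templates` is a hypothetical flat list of (template_name, template_id) pairs.
    """
    names_by_id: Dict[str, List[str]] = defaultdict(list)
    for name, template_id in templates:
        names_by_id[template_id].append(name)
    # Keep only the ids that occur more than once.
    return {tid: names for tid, names in names_by_id.items() if len(names) > 1}


# Hypothetical template names and ids, for illustration only.
example = [
    ("prompt_a", "id-1"),
    ("prompt_b", "id-2"),
    ("prompt_c", "id-1"),
]
duplicates = find_duplicate_ids(example)
print(duplicates)
```

Any id reported here would then need a fresh value (for example one generated with Python's `uuid.uuid4()`) on all but one of the templates that share it.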

* Add GEM/xsum prompts
* uncommit this hack
* Add GEM in INCLUDED_USERS

Co-authored-by: Albert Webson <awebson@cs.brown.edu>

* First 2 prompts for blimp
* Fixed some template issues that led to conflicting details
* Updated templates to avoid packing errors
* Added 2 new prompts with a different style of asking question
* Add Najoung's templates
* Added prompts for complex_NP_island
* Updated the templates after the requested fixes
* Added templates for complex_NP_island via copy scripts
* Updated templates for all subsets
* Switched to randomized prompts.
* Minor changes to prompt names
* Fix answer_choices for some templates + better template names
* Fixed extra spaces.
* Added null prompts (true null and single quotation mark versions)
* Minor cleanup
* Prompt cleanups: true null prompt, choice randomization + dropping order swapped prompts.

Co-authored-by: cookie <nk>
Co-authored-by: najoungkim <najoungk@gmail.com>

* adding GEM/webnlg prompts and Russian prompts, so can use challenge sets. Adding Russian versions of Prompts.

…740) * Adding GEM Simplification prompts. To include challenge sets, and additional prompts for Turk & ASSET. Adding Multiple Prompts + PALM , commenting out non-original tasks.

* add prompts for lince sa task
* add period, fix answer choices, add more prompts with diverse phrasing

* Add prompts for BioASQ task b

  Following the [BLURB](https://microsoft.github.io/BLURB/) benchmark, we use task 7b and only its yes/no questions, so that results are comparable to other models. We use prompts similar to the BoolQ prompts.
* Fix cache inconsistency and expanduser by default

* add username to INCLUDED_USERS list
* add 5 prompts
* change prompt name
* add choices_in_prompt and fix minor issues
* add labels and fix a prompt
* title case answer choices
* fix grammatical error
* add other metrics
* remove file
* fix issues
* Update templates.py
* Update templates.py
* set choices_in_prompt to false
* add answer_choices in the prompts
* fix minor issues
* make prompts more natural

* templates for tweets_hate_speech
* Update templates.yaml
* Create templates.yaml
* deleting tweets template
* adding crd3 prompts
* Update templates.py
* putting instructions at the top

Co-authored-by: Shanya Sharma - s0s0cr3 <Shanya.Sharma@walmartlabs.com>
Co-authored-by: Stephen Bach <stephenhbach@gmail.com>

* added first templates for DiaBLa dataset rbawden/DiaBLa
* declared previous_ref at beginning of templates
* declared more variables at beginning of templates
* declared more variables at beginning of templates
* declared more variables at beginning of templates
* declared more variables at beginning of templates
* corrected ref for mt in one template
* corrected ref for mt in one template
* moved condition to just target side rather than around entire prompt
* allow template even when no context (beginning of dialogue)
* corrected templates that use past history
* corrected multiple choice answer field
* updated templates - simplified some targets and added translation completion tasks (simpler?)
* corrected duplicate name
* updated duplicate definition
* corrected error of only two pipes in answer choices
* corrected -2 index to -1 - duplicate definitions
* merged with eval-hackathon updates
* corrected discriminate mt ref
* removed directional templates and only keep both directions (analysis to be done separately later). Also some slight modifications, such as removing excess words and quotes
* simplified names, changed random to choice

Co-authored-by: Rachel Bawden <rachel.bawden@inria.fr@users.noreply.github.com>
Co-authored-by: Stephen Bach <stephenhbach@gmail.com>

* add original prompts
* update metrics

* Add templates for schema_guided_dstc8 response generation
* Remove extra newlines at the end of targets for schema_guided_dstc8
* Accelerate `get_infos` by caching the `DataseInfoDict`s (#778)
* accelerate `get_infos` by caching the `DataseInfoDict`s
* quality
* consistency
* Revert changes to app.py.
* Update promptsource/app.py

Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: Stephen Bach <stephenhbach@gmail.com>

* Added prompts for English crows_pairs_multilingual
* Added prompts for English crows_pairs_multilingual minor change
* Added prompts for English crows_pairs_multilingual minor change
* Added prompts for English crows_pairs_multilingual change target label
* Added prompts for English crows_pairs_multilingual fix target
* Added prompts for English crows_pairs_multilingual added A. prompts
* Added prompts for French crows_pairs_multilingual added A. prompts
* Change crows_pairs_multilingual metric to Accuracy
* Added randomness to CrowsPairsMultilingual prompts choice order+integrated other suggestions
* Fixed removed newlines from prompts
* Adding extra prompts for CrowS-Pairs French
* Update templates.py
* Indicate which prompts are reflecting the original task
* Moved CrowS-Pairs-Multilingual to Bias WG organisation
* Accelerate `get_infos` by caching the `DataseInfoDict`s (#778)
* accelerate `get_infos` by caching the `DataseInfoDict`s
* quality
* consistency

Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: J Forde <jzf2101@users.noreply.github.com>

* Added prompts for English crows_pairs_multilingual
* Added prompts for English crows_pairs_multilingual minor change
* Added prompts for English crows_pairs_multilingual minor change
* Added prompts for English crows_pairs_multilingual change target label
* Added prompts for English crows_pairs_multilingual fix target
* Added prompts for English crows_pairs_multilingual added A. prompts
* Added prompts for French crows_pairs_multilingual added A. prompts
* Change crows_pairs_multilingual metric to Accuracy
* Added randomness to CrowsPairsMultilingual prompts choice order+integrated other suggestions
* Fixed removed newlines from prompts
* Adding extra prompts for CrowS-Pairs French
* Update templates.py
* Indicate which prompts are reflecting the original task
* Moved CrowS-Pairs-Multilingual to Bias WG organisation
* Accelerate `get_infos` by caching the `DataseInfoDict`s (#778)
* accelerate `get_infos` by caching the `DataseInfoDict`s
* quality
* consistency
* Make targets one-token answers
* Make targets one-token answers for FR

Co-authored-by: Victor SANH <victorsanh@gmail.com>
Co-authored-by: J Forde <jzf2101@users.noreply.github.com>

* add arabic prompts
* add one more original task prompt
* fix minor issue
* fix minor issues
* remove unnecessary space
* update

* add english prompts for GEM/wikilingua (18 languages)
* (typo) add missing target to 'summarize_above_ar'
* (fix) update target formatting
* make language more explicit in prompts
* fix typo

Co-authored-by: Stephen Bach <stephenhbach@gmail.com>

* handle blank results
* unify how we detect blank results

* fixing templates.py for line length issue
* make style

Co-authored-by: Victor Sanh <victorsanh@gmail.com>

VictorSanh · 2022-06-25T19:13:47Z

mistake - wrong branch - deleting this PR

RosenZhang and others added 22 commits April 27, 2022 12:07

update actions to run tests for eval-hackathon branch (#751)

c25d5c1

Add GEM/xsum prompts (#745)

06bd60d

* Add GEM/xsum prompts
* uncommit this hack
* Add GEM in INCLUDED_USERS

Co-authored-by: Albert Webson <awebson@cs.brown.edu>

Remove English-only filter. (#755)

5dea218

Initial support for multiple targets. (#747)

f6a0e21

adding GEM/webnlg prompts (#743)

4927189

* adding GEM/webnlg prompts and Russian prompts, so can use challenge sets. Adding Russian versions of Prompts.

Adding GEM Simplification prompts. Includes challenge sets, and add… (#…

dfbb18c

…740) * Adding GEM Simplification prompts. To include challenge sets, and additional prompts for Turk & ASSET. Adding Multiple Prompts + PALM , commenting out non-original tasks.

Add LinCE sentiment analysis prompts (#757)

dcff8f6

* add prompts for lince sa task
* add period, fix answer choices, add more prompts with diverse phrasing

Handle changed apply method. (#773)

9dcd89e

Add PIAF (#774)

f96566a

* add original prompts
* update metrics

Add XQuAD [ more prompts + Arabic prompts ] (#770)

4695489

* add arabic prompts
* add one more original task prompt
* fix minor issue
* fix minor issues
* remove unnecessary space
* update

Fix blank results (#788)

1dc66cf

* handle blank results
* unify how we detect blank results

fixing templates.py for line length issue (#782)

c7399d5

* fixing templates.py for line length issue
* make style

Co-authored-by: Victor Sanh <victorsanh@gmail.com>

fix crows pairs duplicate uuid

cfae53e

VictorSanh closed this Jun 25, 2022

VictorSanh deleted the vs/fix_crows_pairs_duplicate_uuid branch June 25, 2022 21:14
