Adds Global MLMU #426

hynky1999 · 2024-12-06T13:27:43Z

Supports 3 versions: ca (culturally agnostic), cs (culturally sensitive), and all (everything).
Example of how to run:
lighteval vllm pretrained="Qwen/Qwen2.5-7B,dtype=bfloat16,max_model_length=4096,pairwise_tokenization=True,tensor_parallel_size=1,gpu_memory_utilisation=0.6" --custom-tasks lighteval.tasks.multilingual.tasks --max-samples 100 --output-dir ./results "lighteval|global_mmlu_all_ces_cf|3|1"
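
For readers unfamiliar with the task string at the end of the command, here is a minimal sketch of how it breaks down. It assumes the usual `suite|task|num_fewshot|truncate_flag` layout and the `global_mmlu_<variant>_<lang>_<formulation>[:subject]` naming visible in the sample output of this PR. `TaskSpec`, `parse_task_spec`, and `global_mmlu_task` are hypothetical helpers written for illustration, not part of lighteval.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskSpec:
    suite: str               # e.g. "lighteval"
    task: str                # e.g. "global_mmlu_all_ces_cf"
    num_fewshot: int         # number of few-shot examples requested
    truncate_fewshot: bool   # assumed meaning of the trailing 0/1 flag


def parse_task_spec(spec: str) -> TaskSpec:
    """Split a 'suite|task|fewshot|flag' string into its four fields."""
    suite, task, fewshot, flag = spec.split("|")
    return TaskSpec(suite, task, int(fewshot), flag == "1")


def global_mmlu_task(variant: str, lang: str, formulation: str = "cf",
                     subject: Optional[str] = None) -> str:
    """Build a task name following the pattern seen in this PR (hypothetical helper)."""
    name = f"global_mmlu_{variant}_{lang}_{formulation}"
    return f"{name}:{subject}" if subject else name


spec = parse_task_spec("lighteval|global_mmlu_all_ces_cf|3|1")
print(spec)

# One task string per variant described above, for Czech ("ces").
for variant in ("ca", "cs", "all"):
    print(f"lighteval|{global_mmlu_task(variant, 'ces')}|3|1")
```

Swapping `all` for `ca` or `cs` in the task name is, under this assumption, all that is needed to select another variant.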

Results for Qwen2.5-7B:

|                                 Task                                 |Version|    Metric    |Value |   |Stderr|
|----------------------------------------------------------------------|------:|--------------|-----:|---|-----:|
|all                                                                   |       |acc_norm_token|0.3775|±  |0.0479|
|                                                                      |       |acc_norm      |0.3811|±  |0.0480|
|                                                                      |       |acc_norm_pmi  |0.3905|±  |0.0483|
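
To put the three numbers in context, a rough sketch that turns each value and standard error from the table into an approximate 95% interval. It assumes a normal approximation (z = 1.96), which is my assumption rather than anything lighteval reports; the values themselves are copied from the table above.

```python
# (value, stderr) pairs copied from the results table above.
results = {
    "acc_norm_token": (0.3775, 0.0479),
    "acc_norm": (0.3811, 0.0480),
    "acc_norm_pmi": (0.3905, 0.0483),
}

Z95 = 1.96  # normal-approximation multiplier for a 95% interval (assumption)


def interval(value: float, stderr: float, z: float = Z95) -> tuple:
    """Return the (low, high) bounds of value +/- z * stderr."""
    return (value - z * stderr, value + z * stderr)


def overlap(a: tuple, b: tuple) -> bool:
    """True if two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


intervals = {name: interval(v, s) for name, (v, s) in results.items()}
for name, (low, high) in intervals.items():
    print(f"{name}: [{low:.3f}, {high:.3f}]")

# With --max-samples 100 the intervals are wide and all overlap one another.
all_overlap = all(overlap(a, b) for a in intervals.values() for b in intervals.values())
print("all intervals overlap:", all_overlap)
```

Under this approximation the differences between the three metrics are well within noise at this sample size.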

Examples of prompts (ces):

-- sample 7 --
{ 'choices': [' 0', ' 10 %', ' 25 %', ' 50 %'],
  'ctx': '',
  'fewshot_sorting_class': None,
  'gold_index': [2],
  'instruction': '',
  'num_asked_few_shots': -1,
  'num_effective_few_shots': -1,
  'original_query': '',
  'query': 'Otázka: Uvažujme segment o délce 10. Body A a B jsou vybrány '
           'náhodně tak, že A a B rozdělují segment na tři menší segmenty. '
           'Jaká je pravděpodobnost, že tři menší segmenty mohou tvořit strany '
           'trojúhelníku?\n'
           'Odpověď:',
  'specific': None,
  'task_name': 'lighteval|global_mmlu_all_ces_cf:college_mathematics',
  'unconditioned_query': 'Odpověď:'}
-- sample 8 --
{ 'choices': [' Žádný', ' Jen já', ' Pouze II', ' Pouze III'],
  'ctx': '',
  'fewshot_sorting_class': None,
  'gold_index': [2],
  'instruction': '',
  'num_asked_few_shots': -1,
  'num_effective_few_shots': -1,
  'original_query': '',
  'query': 'Otázka: Nechť V je konečnorozměrný reálný vektorový prostor a '
           'nechť P je lineární transformace V tak, že P^2 = P. Která z '
           'následujících musí platit?\n'
           'I. P je invertibilní.\n'
           'II. P je diagonální.\n'
           'III. P je buď transformace identity nebo nulová transformace.\n'
           'Odpověď:',
  'specific': None,
  'task_name': 'lighteval|global_mmlu_all_ces_cf:college_mathematics',
  'unconditioned_query': 'Odpověď:'}
-- sample 9 --
{ 'choices': [ ' Každý kompaktní prostor je kompletní',
               ' Každý kompletní prostor je kompaktní',
               ' Ani (a) ani (b).',
               ' Oba (a) i (b).'],
  'ctx': '',
  'fewshot_sorting_class': None,
  'gold_index': [0],
  'instruction': '',
  'num_asked_few_shots': -1,
  'num_effective_few_shots': -1,
  'original_query': '',
  'query': 'Otázka: Která z následujících skutečností je pravdivá?\nOdpověď:',
  'specific': None,
  'task_name': 'lighteval|global_mmlu_all_ces_cf:college_mathematics',
  'unconditioned_query': 'Odpověď:'}
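
The `unconditioned_query` field in the samples above is, as far as I understand it, what a PMI-normalized metric such as `acc_norm_pmi` relies on: each choice is scored by its log-probability given the full query minus its log-probability given only `'Odpověď:'`. A minimal sketch of that idea follows; the log-probabilities are made up for illustration and `pmi_scores` is a hypothetical helper, not lighteval's implementation.

```python
from typing import List


def pmi_scores(conditioned: List[float], unconditioned: List[float]) -> List[float]:
    """Score each choice as log P(choice | query) - log P(choice | unconditioned query)."""
    return [c - u for c, u in zip(conditioned, unconditioned)]


def argmax(values: List[float]) -> int:
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical log-probabilities for four answer choices (not real model output).
conditioned = [-2.0, -4.5, -3.0, -3.5]    # given the full question
unconditioned = [-1.0, -5.0, -5.5, -5.0]  # given only "Odpověď:"

raw_pick = argmax(conditioned)            # choice favoured by raw log-probability
pmi = pmi_scores(conditioned, unconditioned)
pmi_pick = argmax(pmi)                    # choice favoured after PMI normalization

print("raw pick:", raw_pick)
print("pmi scores:", pmi)
print("pmi pick:", pmi_pick)
```

In this toy case a choice that is likely regardless of the question (index 0) wins on raw log-probability but loses once the unconditioned score is subtracted.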

cc @clefourrier, as you had an opinion on the subsets

HuggingFaceDocBuilderDev · 2024-12-06T13:29:35Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

clefourrier · 2024-12-09T07:56:18Z

Hi! It could also make sense to support "Unannotated" (which is all - CS - CA)

hynky1999 · 2024-12-09T13:20:53Z

I don't think that's particularly interesting for anyone.
If you don't care about labels: use all
If you only want CA: use CA
If you only want CS: use CS

I don't see a reason for supporting unannotated.

clefourrier · 2024-12-09T13:22:18Z

Comparative analysis, to compare un-annotated to CS or CA and therefore get a better idea of which of the subsets it's closer to

clefourrier · 2024-12-09T13:23:26Z

It's a very minor change to add, no?

hynky1999 · 2024-12-09T14:39:21Z

> Comparative analysis, to compare un-annotated to CS or CA and therefore get a better idea of which of the subsets it's closer to

Fair enough, added "UNK" subset.
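
The relationship discussed in this thread (unannotated = all - CS - CA) can be sketched as a filter on a per-row label. This is an illustration on toy rows only: the field name `cultural_sensitivity_label`, the label values, and the `select` helper are assumptions for the example, not the code added in this PR.

```python
from typing import Dict, List

# Which labels each variant keeps (assumed label values: "CA", "CS", "UNK").
LABELS = {
    "ca": {"CA"},
    "cs": {"CS"},
    "unk": {"UNK"},
    "all": {"CA", "CS", "UNK"},
}


def select(rows: List[Dict], variant: str,
           label_field: str = "cultural_sensitivity_label") -> List[Dict]:
    """Keep only the rows whose label belongs to the requested variant."""
    wanted = LABELS[variant]
    return [row for row in rows if row[label_field] in wanted]


# Toy rows standing in for dataset samples.
rows = [
    {"id": 1, "cultural_sensitivity_label": "CA"},
    {"id": 2, "cultural_sensitivity_label": "CS"},
    {"id": 3, "cultural_sensitivity_label": "UNK"},
    {"id": 4, "cultural_sensitivity_label": "CA"},
    {"id": 5, "cultural_sensitivity_label": "UNK"},
]

ids = {variant: {row["id"] for row in select(rows, variant)} for variant in LABELS}
print(ids)

# "unk" is exactly what remains of "all" once CS and CA are removed.
unk_matches_difference = ids["unk"] == ids["all"] - ids["cs"] - ids["ca"]
print("unk == all - cs - ca:", unk_matches_difference)
```

Having the unannotated rows as their own subset makes the comparison against CS and CA possible without recomputing the difference by hand.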

src/lighteval/tasks/multilingual/tasks.py

clefourrier · 2024-12-09T16:12:43Z

Just one nit in the doc; if it works for you, merge it and then you can squash and merge.

Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>

hynky1999 added 3 commits December 6, 2024 14:20

add global mmlu + zulu

88f793e

add global mmlu + zulu

ebb0f7b

fix translatin literals

58a9062

hynky1999 requested a review from clefourrier December 6, 2024 13:29

clefourrier approved these changes Dec 9, 2024

Merge branch 'main' into global_mmlu

255465c

add unk for global mmlu

b1a7c5e

clefourrier reviewed Dec 9, 2024

src/lighteval/tasks/multilingual/tasks.py

Update src/lighteval/tasks/multilingual/tasks.py

f59084f

Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>

hynky1999 merged commit 412ccfc into main Dec 9, 2024
3 checks passed
