
1810 move gemma evaluation #1819

Merged
merged 4 commits into from
Oct 13, 2024

Conversation

@malinjawi (Contributor) commented Oct 12, 2024

Context

tracker: #1810
What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • clean up

Please link to any issues this PR addresses.

Changelog

What are the changes made in this PR?

  • Copied evaluation.yaml to gemma/ directory
  • Updated evaluation.yaml to point to gemma 2b model instantiations
  • Updated the recipe registry to pick up the new config
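
For context, the resolved config printed in the test run below implies the relocated gemma/evaluation.yaml looks roughly like this (a sketch reconstructed from the log output; the checkpoint paths are shortened placeholders, and `device` reflects the review discussion rather than the macOS test run, which used `cpu`):

```yaml
# gemma/evaluation.yaml (sketch, not the actual file)

model:
  _component_: torchtune.models.gemma.gemma_2b

checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/gemma-2b/          # placeholder path
  checkpoint_files:
    - model-00001-of-00002.safetensors
    - model-00002-of-00002.safetensors
  model_type: GEMMA
  output_dir: ./

tokenizer:
  _component_: torchtune.models.gemma.gemma_tokenizer
  path: /tmp/gemma-2b/tokenizer.model     # placeholder path

# EleutherAI eval harness settings
tasks: ["truthfulqa_mc2"]
limit: null
batch_size: 8
max_seq_length: 4096
enable_kv_cache: true

# Environment
device: cuda                              # the test run below overrode this to cpu
dtype: bf16
seed: 1234
quantizer: null
```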

Test plan

Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.

  • run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
  • add unit tests for any new functionality
  • update docstrings for any new or updated methods or classes
  • run unit tests via pytest tests
  • run recipe tests via pytest tests -m integration_test
  • manually run any new or modified recipes with sufficient proof of correctness
  • include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

Gemma Eleuther eval recipe output:

(torchtune_rosetta) linjaboy@Mohammads-MacBook-Pro 9cf48e52b224239de00d483ec8eb84fb8d0f3a3a % tune run eleuther_eval --config gemma/evaluation
W1012 01:59:27.011000 8088854208 torch/distributed/elastic/multiprocessing/redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
INFO:torchtune.utils._logging:Running EleutherEvalRecipe with resolved config:

batch_size: 8
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/gemma-2b/models--google--gemma-2b/snapshots/9cf48e52b224239de00d483ec8eb84fb8d0f3a3a
  checkpoint_files:
  - model-00001-of-00002.safetensors
  - model-00002-of-00002.safetensors
  model_type: GEMMA
  output_dir: ./
device: cpu
dtype: bf16
enable_kv_cache: true
limit: null
max_seq_length: 4096
model:
  _component_: torchtune.models.gemma.gemma_2b
quantizer: null
seed: 1234
tasks:
- truthfulqa_mc2
tokenizer:
  _component_: torchtune.models.gemma.gemma_tokenizer
  path: /tmp/gemma-2b/models--google--gemma-2b/snapshots/9cf48e52b224239de00d483ec8eb84fb8d0f3a3a/tokenizer.model

INFO:torchtune.utils._logging:Model is initialized with precision torch.bfloat16.
config.json: 100%|███████████████████████████████████████████████████████████████████████████| 665/665 [00:00<00:00, 1.06MB/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████| 26.0/26.0 [00:00<00:00, 119kB/s]
vocab.json: 100%|████████████████████████████████████████████████████████████████████████| 1.04M/1.04M [00:00<00:00, 1.14MB/s]
merges.txt: 100%|██████████████████████████████████████████████████████████████████████████| 456k/456k [00:00<00:00, 1.57MB/s]
tokenizer.json: 100%|████████████████████████████████████████████████████████████████████| 1.36M/1.36M [00:00<00:00, 2.14MB/s]
model.safetensors: 100%|███████████████████████████████████████████████████████████████████| 548M/548M [01:32<00:00, 5.95MB/s]
generation_config.json: 100%|█████████████████████████████████████████████████████████████████| 124/124 [00:00<00:00, 712kB/s]
README.md: 100%|█████████████████████████████████████████████████████████████████████████| 9.59k/9.59k [00:00<00:00, 5.65MB/s]
validation-00000-of-00001.parquet: 100%|███████████████████████████████████████████████████| 271k/271k [00:00<00:00, 2.79MB/s]
Generating validation split: 100%|████████████████████████████████████████████████| 817/817 [00:00<00:00, 26416.89 examples/s]
INFO:torchtune.utils._logging:Running evaluation on the following tasks: ['truthfulqa_mc2']
INFO:lm-eval:Building contexts for truthfulqa_mc2 on rank 0...
100%|█████████████████████████████████████████████████████████████████████████████████████| 817/817 [00:00<00:00, 2121.83it/s]
INFO:lm-eval:Running loglikelihood requests
Running loglikelihood requests: 100%|███████████████████████████████████████████████████| 5882/5882 [8:26:40<00:00,  5.17s/it]
INFO:torchtune.utils._logging:Eval completed in 30404.48 seconds.
INFO:torchtune.utils._logging:Max memory allocated: 0.00 GB
INFO:torchtune.utils._logging:

|    Tasks     |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|--------------|------:|------|-----:|------|---|-----:|---|-----:|
|truthfulqa_mc2|      2|none  |     0|acc   ||0.3995|±  |0.0152|



pytorch-bot (bot) commented Oct 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1819

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d1d1233 with merge base 7744608 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 12, 2024
@malinjawi changed the title from "1810 move gemma eval and generate" to "1810 move gemma evaluation" Oct 12, 2024
@joecummings (Contributor) left a comment

A couple of very small changes, but otherwise looks great!

path: /tmp/gemma-2b//tokenizer.model

# Environment
device: gpu
Contributor

I believe this should default to cuda, not gpu.
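
The suggested fix amounts to a one-line change in the config fragment quoted above (`cuda` is a valid torch device string; `gpu` is not):

```yaml
# Environment
device: cuda   # was: gpu
```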

@malinjawi (Contributor, Author) Oct 12, 2024

Should be fixed now.

model-00001-of-00002.safetensors,
model-00002-of-00002.safetensors,
]
#recipe_checkpoint: null
Contributor

You can delete this line.

Contributor Author

Deleted it now

# Tokenizer
tokenizer:
_component_: torchtune.models.gemma.gemma_tokenizer
path: /tmp/gemma-2b//tokenizer.model
Contributor

Extra forward-slash here.

Contributor Author

Nice catch, will delete.

@malinjawi (Contributor, Author)

A couple of very small changes, but otherwise looks great!

Hey @joecummings, thanks for the review. I have addressed the comments now.

@joecummings (Contributor) left a comment

THANKS!

@RdoubleA RdoubleA merged commit c70ad29 into pytorch:main Oct 13, 2024
17 checks passed
mori360 pushed a commit to mori360/torchtune that referenced this pull request Oct 14, 2024
4 participants