
1810 move gemma evaluation #1819

Merged
merged 4 commits into from
Oct 13, 2024

Conversation

@malinjawi (Contributor) commented Oct 12, 2024

Context

tracker: #1810
What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • clean up

Please link to any issues this PR addresses.

Changelog

What are the changes made in this PR?

  • Copied evaluation.yaml to gemma/ directory
  • Updated evaluation.yaml to point to gemma 2b model instantiations
  • Updated the recipe registry to pick up the new config
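
For context, the resolved config printed in the test run below implies the relocated gemma/evaluation.yaml looks roughly like this (a sketch reconstructed from the log output; the checkpoint paths are shortened placeholders, and `device` reflects the review discussion rather than the macOS test run, which used `cpu`):

```yaml
# gemma/evaluation.yaml (sketch, not the actual file)

model:
  _component_: torchtune.models.gemma.gemma_2b

checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/gemma-2b/          # placeholder path
  checkpoint_files:
    - model-00001-of-00002.safetensors
    - model-00002-of-00002.safetensors
  model_type: GEMMA
  output_dir: ./

tokenizer:
  _component_: torchtune.models.gemma.gemma_tokenizer
  path: /tmp/gemma-2b/tokenizer.model     # placeholder path

# EleutherAI eval harness settings
tasks: ["truthfulqa_mc2"]
limit: null
batch_size: 8
max_seq_length: 4096
enable_kv_cache: true

# Environment
device: cuda                              # the test run below overrode this to cpu
dtype: bf16
seed: 1234
quantizer: null
```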

Test plan

Please make sure to do each of the following if applicable to your PR. If you're unsure about any one of these just ask and we will happily help. We also have a contributing page for some guidance on contributing.

  • run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
  • add unit tests for any new functionality
  • update docstrings for any new or updated methods or classes
  • run unit tests via pytest tests
  • run recipe tests via pytest tests -m integration_test
  • manually run any new or modified recipes with sufficient proof of correctness
  • include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

Gemma Eleuther eval recipe output:

(torchtune_rosetta) linjaboy@Mohammads-MacBook-Pro 9cf48e52b224239de00d483ec8eb84fb8d0f3a3a % tune run eleuther_eval --config gemma/evaluation
W1012 01:59:27.011000 8088854208 torch/distributed/elastic/multiprocessing/redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
INFO:torchtune.utils._logging:Running EleutherEvalRecipe with resolved config:

batch_size: 8
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/gemma-2b/models--google--gemma-2b/snapshots/9cf48e52b224239de00d483ec8eb84fb8d0f3a3a
  checkpoint_files:
  - model-00001-of-00002.safetensors
  - model-00002-of-00002.safetensors
  model_type: GEMMA
  output_dir: ./
device: cpu
dtype: bf16
enable_kv_cache: true
limit: null
max_seq_length: 4096
model:
  _component_: torchtune.models.gemma.gemma_2b
quantizer: null
seed: 1234
tasks:
- truthfulqa_mc2
tokenizer:
  _component_: torchtune.models.gemma.gemma_tokenizer
  path: /tmp/gemma-2b/models--google--gemma-2b/snapshots/9cf48e52b224239de00d483ec8eb84fb8d0f3a3a/tokenizer.model

INFO:torchtune.utils._logging:Model is initialized with precision torch.bfloat16.
config.json: 100%|███████████████████████████████████████████████████████████████████████████| 665/665 [00:00<00:00, 1.06MB/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████| 26.0/26.0 [00:00<00:00, 119kB/s]
vocab.json: 100%|████████████████████████████████████████████████████████████████████████| 1.04M/1.04M [00:00<00:00, 1.14MB/s]
merges.txt: 100%|██████████████████████████████████████████████████████████████████████████| 456k/456k [00:00<00:00, 1.57MB/s]
tokenizer.json: 100%|████████████████████████████████████████████████████████████████████| 1.36M/1.36M [00:00<00:00, 2.14MB/s]
model.safetensors: 100%|███████████████████████████████████████████████████████████████████| 548M/548M [01:32<00:00, 5.95MB/s]
generation_config.json: 100%|█████████████████████████████████████████████████████████████████| 124/124 [00:00<00:00, 712kB/s]
README.md: 100%|█████████████████████████████████████████████████████████████████████████| 9.59k/9.59k [00:00<00:00, 5.65MB/s]
validation-00000-of-00001.parquet: 100%|███████████████████████████████████████████████████| 271k/271k [00:00<00:00, 2.79MB/s]
Generating validation split: 100%|████████████████████████████████████████████████| 817/817 [00:00<00:00, 26416.89 examples/s]
INFO:torchtune.utils._logging:Running evaluation on the following tasks: ['truthfulqa_mc2']
INFO:lm-eval:Building contexts for truthfulqa_mc2 on rank 0...
100%|█████████████████████████████████████████████████████████████████████████████████████| 817/817 [00:00<00:00, 2121.83it/s]
INFO:lm-eval:Running loglikelihood requests
Running loglikelihood requests: 100%|███████████████████████████████████████████████████| 5882/5882 [8:26:40<00:00,  5.17s/it]
INFO:torchtune.utils._logging:Eval completed in 30404.48 seconds.
INFO:torchtune.utils._logging:Max memory allocated: 0.00 GB
INFO:torchtune.utils._logging:

|    Tasks     |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|--------------|------:|------|-----:|------|---|-----:|---|-----:|
|truthfulqa_mc2|      2|none  |     0|acc   ||0.3995|±  |0.0152|



pytorch-bot (bot) commented Oct 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1819

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit d1d1233 with merge base 7744608 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Oct 12, 2024
@malinjawi changed the title from "1810 move gemma eval and generate" to "1810 move gemma evaluation" Oct 12, 2024
@joecummings (Contributor) left a comment

A couple of very small changes, but otherwise looks great!

path: /tmp/gemma-2b//tokenizer.model

# Environment
device: gpu
Contributor

I believe this should default to cuda, not gpu.
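
The suggested fix amounts to a one-line change in the config fragment quoted above (`cuda` is a valid torch device string; `gpu` is not):

```yaml
# Environment
device: cuda   # was: gpu
```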

@malinjawi (Contributor, Author) Oct 12, 2024

Should be fixed now.

model-00001-of-00002.safetensors,
model-00002-of-00002.safetensors,
]
#recipe_checkpoint: null
Contributor

You can delete this line.

Contributor Author

Deleted it now

# Tokenizer
tokenizer:
_component_: torchtune.models.gemma.gemma_tokenizer
path: /tmp/gemma-2b//tokenizer.model
Contributor

Extra forward-slash here.

Contributor Author

Nice catch, will delete.

@malinjawi (Contributor, Author)

A couple of very small changes, but otherwise looks great!

Hey @joecummings, thanks for the review. I have addressed the comments now.

@joecummings (Contributor) left a comment

THANKS!

@RdoubleA RdoubleA merged commit c70ad29 into pytorch:main Oct 13, 2024
17 checks passed
mori360 pushed a commit to mori360/torchtune that referenced this pull request Oct 14, 2024
4 participants