
1810 Add evaluation configs under phi3 dir #1822

Merged

Conversation

@Harthi7 (Contributor) commented Oct 12, 2024

Context

tracker: #1810
What is the purpose of this PR? Is it to

  • add a new feature
  • fix a bug
  • update tests and/or documentation
  • clean up

Please link to any issues this PR addresses.

Changelog

What are the changes made in this PR?

  • Copied evaluation.yaml to the phi3/ directory
  • Updated evaluation.yaml to point to the Phi-3 Mini model instantiations
  • Updated the recipe registry to pick up the new config
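The registry change in the last bullet can be sketched roughly as follows. This is a simplified, hypothetical mock-up, not torchtune's actual registry code; the dataclass names and file paths here are assumptions made for illustration:

```python
# Hypothetical sketch of registering the new config with the
# eleuther_eval recipe; class and field names are assumptions,
# not torchtune's actual registry code.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Config:
    name: str       # name passed to `tune run ... --config <name>`
    file_path: str  # YAML path relative to the recipe configs directory


@dataclass
class Recipe:
    name: str
    file_path: str
    configs: List[Config] = field(default_factory=list)


eleuther_eval = Recipe(
    name="eleuther_eval",
    file_path="eleuther_eval.py",
    configs=[
        # New entry added by this PR, so that
        # `tune run eleuther_eval --config phi3/evaluation` resolves:
        Config(name="phi3/evaluation", file_path="phi3/evaluation.yaml"),
    ],
)
```

With an entry like this in place, the CLI can map the `phi3/evaluation` config name used in the test run below to the copied YAML file.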

Test plan

Please make sure to do each of the following if applicable to your PR. If you're unsure about any of these, just ask and we will happily help. We also have a contributing page with guidance on contributing.

  • run pre-commit hooks and linters (make sure you've first installed via pre-commit install)
  • add unit tests for any new functionality
  • update docstrings for any new or updated methods or classes
  • run unit tests via pytest tests
  • run recipe tests via pytest tests -m integration_test
  • manually run any new or modified recipes with sufficient proof of correctness
  • include relevant commands and any other artifacts in this summary (pastes of loss curves, eval results, etc.)

Phi3 Eleuther eval recipe output:

(torchtune) Abdullahs-MacBook-Pro:Phi-3-mini-4k-instruct abdullah$ tune run eleuther_eval --config phi3/evaluation
W1012 00:32:22.842000 8517586752 torch/distributed/elastic/multiprocessing/redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
INFO:torchtune.utils._logging:Running EleutherEvalRecipe with resolved config:

batch_size: 8
checkpointer:
  _component_: torchtune.training.FullModelHFCheckpointer
  checkpoint_dir: /tmp/Phi-3-mini-4k-instruct/models--microsoft--Phi-3-mini-4k-instruct/snapshots/0a67737cc96d2554230f90338b163bc6380a2a85
  checkpoint_files:
  - model-00001-of-00002.safetensors
  - model-00002-of-00002.safetensors
  model_type: PHI3_MINI
  output_dir: /tmp/Phi-3-mini-4k-instruct/models--microsoft--Phi-3-mini-4k-instruct/snapshots/0a67737cc96d2554230f90338b163bc6380a2a85
  recipe_checkpoint: null
device: cpu
dtype: bf16
enable_kv_cache: true
limit: null
max_seq_length: 4096
model:
  _component_: torchtune.models.phi3.phi3_mini
quantizer: null
resume_from_checkpoint: false
seed: 1234
tasks:
- truthfulqa_mc2
tokenizer:
  _component_: torchtune.models.phi3.phi3_mini_tokenizer
  max_seq_len: null
  path: /tmp/Phi-3-mini-4k-instruct/tokenizer.model

INFO:torchtune.utils._logging:Converting Phi-3 Mini weights from HF format. Note that conversion of adapter weights into PEFT format is not supported.
INFO:torchtune.utils._logging:Model is initialized with precision torch.bfloat16.
config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 665/665 [00:00<00:00, 769kB/s]
tokenizer_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 26.0/26.0 [00:00<00:00, 106kB/s]
vocab.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.04M/1.04M [00:00<00:00, 1.16MB/s]
merges.txt: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 456k/456k [00:00<00:00, 1.26MB/s]
tokenizer.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.36M/1.36M [00:00<00:00, 1.90MB/s]
model.safetensors: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 548M/548M [03:43<00:00, 2.45MB/s]
generation_config.json: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 124/124 [00:00<00:00, 2.14MB/s]
README.md: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9.59k/9.59k [00:00<00:00, 16.1MB/s]
validation-00000-of-00001.parquet: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████| 271k/271k [00:00<00:00, 1.36MB/s]
Generating validation split: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 817/817 [00:00<00:00, 45492.21 examples/s]
INFO:torchtune.utils._logging:Running evaluation on the following tasks: ['truthfulqa_mc2']
INFO:lm-eval:Building contexts for truthfulqa_mc2 on rank 0...
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 817/817 [00:00<00:00, 2446.11it/s]
INFO:lm-eval:Running loglikelihood requests
Running loglikelihood requests: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 5882/5882 [22:42:29<00:00, 13.90s/it]
INFO:torchtune.utils._logging:Eval completed in 81751.63 seconds.
INFO:torchtune.utils._logging:Max memory allocated: 0.00 GB
INFO:torchtune.utils._logging:

|    Tasks     |Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|--------------|------:|------|-----:|------|---|-----:|---|-----:|
|truthfulqa_mc2|      2|none  |     0|acc   |↑  |0.5456|±  |0.0151|
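For context on how the resolved config above drives the run: each `_component_` dotted path in the YAML is resolved to a Python callable and invoked with the sibling keys as kwargs. The following is a minimal sketch of that general pattern under that assumption, not torchtune's actual implementation:

```python
import importlib


def instantiate(node: dict):
    """Resolve a `_component_` dotted path to a callable and invoke it
    with the remaining keys as kwargs (simplified illustration)."""
    node = dict(node)  # don't mutate the caller's config
    module_name, attr = node.pop("_component_").rsplit(".", 1)
    component = getattr(importlib.import_module(module_name), attr)
    return component(**node)


# With the config above, the model node would resolve roughly like:
#   instantiate({"_component_": "torchtune.models.phi3.phi3_mini"})
# i.e. import torchtune.models.phi3 and call phi3_mini().
```

The same pattern covers the checkpointer and tokenizer nodes, which is why pointing the copied YAML at the `torchtune.models.phi3.*` components is all the config change needs.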


abdullah-ibm and others added 2 commits October 13, 2024 00:15


pytorch-bot bot commented Oct 12, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchtune/1822

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 604d2e6 with merge base 78ceee6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label (managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) on Oct 12, 2024
@RdoubleA (Contributor) left a comment

Excellent, just need to fix lint with pre-commit run --all-files

@joecummings

@Harthi7 Would you mind merging the main branch and running the linter? Then we can go ahead and get this merged :)

@spzala left a comment

@Harthi7 nice, thank you! Please see linting instructions here to take care of lint related failure, https://github.com/pytorch/pytorch/blob/main/CONTRIBUTING.md#local-linting

@SalmanMohammadi (Collaborator) replied:

> @Harthi7 nice, thank you! Please see linting instructions here to take care of lint related failure, https://github.com/pytorch/pytorch/blob/main/CONTRIBUTING.md#local-linting

We actually use separate linting tools from pytorch core : ) see here https://github.com/pytorch/torchtune/blob/main/CONTRIBUTING.md#coding-style

@spzala commented Oct 14, 2024

> > @Harthi7 nice, thank you! Please see linting instructions here to take care of lint related failure, https://github.com/pytorch/pytorch/blob/main/CONTRIBUTING.md#local-linting
>
> We actually use separate linting tools from pytorch core : ) see here https://github.com/pytorch/torchtune/blob/main/CONTRIBUTING.md#coding-style

@SalmanMohammadi aha, nice to know :) Thank you so much for sharing!

abdullah-ibm added 2 commits October 14, 2024 19:34

@Harthi7 (Contributor, Author) commented Oct 14, 2024

Hello @joecummings and @RdoubleA, I merged with main and ran the lint command. Please review the changes and let me know if there is anything I have missed.

@joecummings merged commit 918c053 into pytorch:main on Oct 14, 2024
17 checks passed
@Harthi7 changed the title from "Add evaluation configs under phi3 dir" to "1810 Add evaluation configs under phi3 dir" on Oct 15, 2024

6 participants