MM Eval tests #1887

Closed · wants to merge 25 commits

Commits (25):
5cb9140  mm eval tests (SalmanMohammadi, Oct 23, 2024)
63ba175  mm eval tests (SalmanMohammadi, Oct 23, 2024)
0331778  Merge branch 'main' into mm_tests (SalmanMohammadi, Nov 7, 2024)
578aa48  adding test values (SalmanMohammadi, Nov 8, 2024)
f0a94d7  reverting changes (SalmanMohammadi, Nov 8, 2024)
df3402c  Merge branch 'main' into mm_tests (SalmanMohammadi, Nov 8, 2024)
60bccc6  whoops (SalmanMohammadi, Nov 8, 2024)
6681749  whoops 2 (SalmanMohammadi, Nov 8, 2024)
d214f52  tidy tidy tidy tidy fresh clean (SalmanMohammadi, Nov 8, 2024)
e3155a1  what is this rounding nonesense? (SalmanMohammadi, Nov 8, 2024)
7add9af  fixing values (SalmanMohammadi, Nov 9, 2024)
c3246c0  fixing parameterize (SalmanMohammadi, Nov 9, 2024)
e3f8178  just put it on teh gpu? (SalmanMohammadi, Nov 11, 2024)
acd6763  Merge branch 'mm_tests' of github.com:SalmanMohammadi/torchtune into … (SalmanMohammadi, Nov 11, 2024)
ed3f02e  what a silly billy I am oh boy (SalmanMohammadi, Nov 12, 2024)
8de3350  is it a python version thing? (SalmanMohammadi, Nov 12, 2024)
3424c32  it is NOT. BACK TO THE CPU (SalmanMohammadi, Nov 12, 2024)
abca4d1  back to gpu.. it's a max_seq_len thing?? (SalmanMohammadi, Nov 12, 2024)
5ab8f83  that didn't work... (SalmanMohammadi, Nov 12, 2024)
19c029e  this is a terrible experience for me (SalmanMohammadi, Nov 12, 2024)
a691a08  stg if this doesn't work (SalmanMohammadi, Nov 12, 2024)
e7018fa  Merge branch 'main' into mm_tests (SalmanMohammadi, Nov 12, 2024)
3bb57fa  I don't even know at this point (SalmanMohammadi, Nov 12, 2024)
76ff0fd  OKAY this should work right? (SalmanMohammadi, Nov 12, 2024)
24e24b5  ???? (SalmanMohammadi, Nov 12, 2024)
1 change: 0 additions & 1 deletion README.md
@@ -44,7 +44,6 @@ torchtune currently supports the following models.
 | [Code-Llama2](https://ai.meta.com/blog/code-llama-large-language-model-coding/) | 7B, 13B, 70B [[models](torchtune/models/code_llama2/_model_builders.py), [configs](recipes/configs/code_llama2/)] |
 | [Mistral](https://huggingface.co/mistralai) | 7B [[models](torchtune/models/mistral/_model_builders.py), [configs](recipes/configs/mistral/)] |
 | [Gemma](https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b) | 2B, 7B [[models](torchtune/models/gemma/_model_builders.py), [configs](recipes/configs/gemma/)] |
-| [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2) | 2B, 9B, 27B [[models](torchtune/models/gemma2/_model_builders.py), [configs](recipes/configs/gemma2/)] |
 | [Microsoft Phi3](https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3) | Mini [[models](torchtune/models/phi3/), [configs](recipes/configs/phi3/)]
 | [Qwen2](https://qwenlm.github.io/blog/qwen2/) | 0.5B, 1.5B, 7B [[models](torchtune/models/qwen2/), [configs](recipes/configs/qwen2/)]
31 changes: 0 additions & 31 deletions docs/source/api_ref_models.rst
@@ -361,37 +361,6 @@ To download the Gemma 7B model:
     gemma.gemma_tokenizer
 
 
-gemma2
-------
-
-Models of size 2B, 9B, 27B from the `Gemma family <https://blog.google/technology/developers/gemma-open-models/>`_.
-
-Important: You need to request access on `Hugging Face <https://huggingface.co/google/gemma-2-2b>`__ to use this model.
-
-To download the Gemma2 2B, 9B, 27B models:
-
-.. code-block:: bash
-
-    tune download google/gemma-2-<MODEL_SIZE>b --ignore-patterns "gemma-2-<MODEL_SIZE>b.gguf" --hf-token <HF_TOKEN>
-
-
-.. autosummary::
-    :toctree: generated/
-    :nosignatures:
-
-    gemma2.gemma2
-    gemma2.lora_gemma2
-    gemma2.gemma2_2b
-    gemma2.lora_gemma2_2b
-    gemma2.qlora_gemma2_2b
-    gemma2.gemma2_9b
-    gemma2.lora_gemma2_9b
-    gemma2.qlora_gemma2_9b
-    gemma2.gemma2_27b
-    gemma2.lora_gemma2_27b
-    gemma2.qlora_gemma2_27b
-    gemma.gemma_tokenizer
-
 clip
 ----
4 changes: 2 additions & 2 deletions docs/source/tutorials/memory_optimizations.rst
@@ -167,7 +167,7 @@ In addition to :ref:`reducing model and optimizer precision <glossary_precision>
 All of our recipes support lower-precision optimizers from the `torchao <https://github.com/pytorch/ao/tree/main/torchao/prototype/low_bit_optim>`_ library.
 For single device recipes, we also support `bitsandbytes <https://huggingface.co/docs/bitsandbytes/main/en/index>`_.
 
-A good place to start might be the :class:`torchao.prototype.low_bit_optim.AdamW8bit` and :class:`bitsandbytes.optim.PagedAdamW8bit` optimizers.
+A good place to start might be the :class:`torchao.prototype.low_bit_optim.torchao.AdamW8bit` and :class:`bitsandbytes.optim.PagedAdamW8bit` optimizers.
 Both reduce memory by quantizing the optimizer state dict. Paged optimizers will also offload to CPU if there isn't enough GPU memory available. In practice,
 you can expect higher memory savings from bnb's PagedAdamW8bit but higher training speed from torchao's AdamW8bit.

@@ -180,7 +180,7 @@ a low precision optimizer using the :ref:`cli_label`:
 .. code-block:: bash
 
     tune run <RECIPE> --config <CONFIG> \
-    optimizer=torchao.prototype.low_bit_optim.AdamW8bit
+    optimizer=torchao.prototype.low_bit_optim.torchao.AdamW8bit
 
 .. code-block:: bash
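The memory saving discussed in the diff above comes from storing optimizer state (e.g. Adam moments) as 8-bit codes plus a per-block scale instead of fp32. A toy, stdlib-only sketch of the blockwise absmax quantization idea follows; the helper names are hypothetical, and real implementations in torchao and bitsandbytes use more sophisticated schemes (dynamic code books, paging, CUDA kernels):

```python
# Toy sketch of 8-bit optimizer-state quantization (hypothetical helpers,
# not the torchao/bitsandbytes API). Each block of fp32 values is stored
# as int8 codes plus one fp32 absmax scale, roughly a 4x smaller state.

def quantize_block(values):
    """Map a block of floats to int8 codes in [-127, 127] with one scale."""
    scale = max(abs(v) for v in values) or 1.0
    codes = [round(v / scale * 127) for v in values]
    return codes, scale

def dequantize_block(codes, scale):
    """Recover approximate floats from int8 codes and the block scale."""
    return [c / 127 * scale for c in codes]

state = [0.5, -0.25, 0.125, -1.0]   # pretend this is an Adam moment buffer
codes, scale = quantize_block(state)
recovered = dequantize_block(codes, scale)
# codes fit in one byte each vs four bytes per fp32 value
print(codes, scale, recovered)
```

The round-trip error is bounded by half a quantization step (scale / 254 per element), which is why 8-bit optimizer states usually train comparably to fp32 ones.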
Deleted files:
- recipes/configs/gemma2/27B_full.yaml (74 deletions)
- recipes/configs/gemma2/27B_lora.yaml (86 deletions)
- recipes/configs/gemma2/27B_lora_single_device.yaml (112 deletions)