
Commit 1c3b4ef

Merge commit (2 parents: ec6449f + ac14e96)

262 files changed (+13682 additions, -3100 deletions)


.github/workflows/build_linux_wheels.yaml (2 additions, 31 deletions)

@@ -27,37 +27,8 @@ jobs:
       with-cuda: enable
       with-rocm: enable
       build-python-only: enable
-  # TODO: Remove `filter-python-version` after PyArrow releases v18
-  filter-python-versions:
-    needs: generate-matrix
-    runs-on: ubuntu-latest
-    outputs:
-      matrix: ${{ steps.set-matrix.outputs.matrix }}
-    steps:
-      - name: Filter matrix to exclude Python 3.13
-        id: set-matrix
-        shell: python
-        env:
-          input-matrix: ${{ needs.generate-matrix.outputs.matrix }}
-        run: |
-          import os
-          import json
-
-          # Grab environment variables
-          input_matrix = json.loads(os.environ["input-matrix"])
-          github_output_file = os.environ["GITHUB_OUTPUT"]
-
-          # Filter out any builds for 3.13
-          filtered_matrix = {"include": []}
-          for build in input_matrix["include"]:
-              if build["python_version"] != "3.13":
-                  filtered_matrix["include"].append(build)
-
-          # Write the new matrix to the default outputs file
-          with open(github_output_file, "w") as handle:
-              handle.write(f"matrix={json.dumps(filtered_matrix)}")
   build:
-    needs: filter-python-versions
+    needs: generate-matrix
     name: ${{ matrix.repository }}
     uses: pytorch/test-infra/.github/workflows/build_wheels_linux.yml@main
     strategy:

@@ -66,7 +37,7 @@ jobs:
       repository: pytorch/torchtune
       ref: ""
       package-name: torchtune
-      build-matrix: ${{ needs.filter-python-versions.outputs.matrix }}
+      build-matrix: ${{ needs.generate-matrix.outputs.matrix }}
       pre-script: .github/scripts/pre_build_script.sh
       trigger-event: ${{ github.event_name }}
       build-platform: 'python-build-package'
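The deleted step's core logic — filtering Python 3.13 builds out of a generated matrix — can be run standalone. A minimal sketch; the matrix entries below are illustrative placeholders, not the real workflow's output:

```python
import json

# A build matrix in the {"include": [...]} shape GitHub Actions uses.
# These entries are hypothetical examples.
input_matrix = {
    "include": [
        {"python_version": "3.11", "gpu_arch_type": "cuda"},
        {"python_version": "3.12", "gpu_arch_type": "cuda"},
        {"python_version": "3.13", "gpu_arch_type": "cuda"},
    ]
}

# Keep every build except Python 3.13 -- what the removed step did.
filtered_matrix = {
    "include": [
        build for build in input_matrix["include"]
        if build["python_version"] != "3.13"
    ]
}

# The workflow wrote this string to $GITHUB_OUTPUT; here we just print it.
output_line = f"matrix={json.dumps(filtered_matrix)}"
print(output_line)
```

With PyArrow now shipping 3.13 wheels, the commit drops this step entirely and `build` consumes `generate-matrix` output directly.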

.github/workflows/regression_test.yaml (2 additions, 0 deletions)

@@ -26,6 +26,8 @@ jobs:
         python-version: ['3.11']
         torch-version: ["stable", "nightly"]
       fail-fast: false
+    env:
+      PYTORCH_CUDA_ALLOC_CONF: expandable_segments:True
     steps:
       - name: Check out repo
         uses: actions/checkout@v3
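The job-level `env:` block above makes the allocator setting visible to every step. A minimal sketch of the equivalent in-process setup — the variable must be set before PyTorch initializes CUDA for `expandable_segments` to take effect:

```python
import os

# Mirror the workflow's env block: configure the CUDA caching allocator
# to use expandable segments, which can reduce fragmentation-driven OOMs.
# Must happen before `import torch` triggers CUDA initialization.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])
```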

README.md (9 additions, 2 deletions)

@@ -9,8 +9,11 @@

 [**Introduction**](#introduction) | [**Installation**](#installation) | [**Get Started**](#get-started) | [**Documentation**](https://pytorch.org/torchtune/main/index.html) | [**Community**](#community) | [**License**](#license) | [**Citing torchtune**](#citing-torchtune)

-> [!IMPORTANT]
-> Update September 25, 2024: torchtune has support for **Llama 3.2 11B Vision**, **Llama 3.2 3B**, and **Llama 3.2 1B** models! Try them out by following our installation instructions [here](#Installation), then run any of the text configs [here](recipes/configs/llama3_2) or vision configs [here](recipes/configs/llama3_2_vision).
+### 📣 Recent updates 📣
+* *November 2024*: torchtune has released [v0.4.0](https://github.com/pytorch/torchtune/releases/tag/v0.4.0) which includes stable support for exciting features like activation offloading and multimodal QLoRA
+* *November 2024*: torchtune has added [Gemma2](recipes/configs/gemma2) to its models!
+* *October 2024*: torchtune added support for Qwen2.5 models - find the recipes [here](recipes/configs/qwen2_5/)
+* *September 2024*: torchtune has support for **Llama 3.2 11B Vision**, **Llama 3.2 3B**, and **Llama 3.2 1B** models! Try them out by following our installation instructions [here](#Installation), then run any of the text configs [here](recipes/configs/llama3_2) or vision configs [here](recipes/configs/llama3_2_vision).

@@ -44,8 +47,10 @@ torchtune currently supports the following models.
 | [Code-Llama2](https://ai.meta.com/blog/code-llama-large-language-model-coding/) | 7B, 13B, 70B [[models](torchtune/models/code_llama2/_model_builders.py), [configs](recipes/configs/code_llama2/)] |
 | [Mistral](https://huggingface.co/mistralai) | 7B [[models](torchtune/models/mistral/_model_builders.py), [configs](recipes/configs/mistral/)] |
 | [Gemma](https://huggingface.co/collections/google/gemma-release-65d5efbccdbb8c4202ec078b) | 2B, 7B [[models](torchtune/models/gemma/_model_builders.py), [configs](recipes/configs/gemma/)] |
+| [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2) | 2B, 9B, 27B [[models](torchtune/models/gemma2/_model_builders.py), [configs](recipes/configs/gemma2/)] |
 | [Microsoft Phi3](https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3) | Mini [[models](torchtune/models/phi3/), [configs](recipes/configs/phi3/)]
 | [Qwen2](https://qwenlm.github.io/blog/qwen2/) | 0.5B, 1.5B, 7B [[models](torchtune/models/qwen2/), [configs](recipes/configs/qwen2/)]
+| [Qwen2.5](https://qwenlm.github.io/blog/qwen2.5/) | 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B [[models](torchtune/models/qwen2_5/), [configs](recipes/configs/qwen2_5/)]

 We're always adding new models, but feel free to [file an issue](https://github.com/pytorch/torchtune/issues/new) if there's a new one you would like to see in torchtune.

@@ -162,6 +167,7 @@ To download Llama3.1, you can run:
 ```bash
 tune download meta-llama/Meta-Llama-3.1-8B-Instruct \
 --output-dir /tmp/Meta-Llama-3.1-8B-Instruct \
+--ignore-patterns "original/consolidated.00.pth" \
 --hf-token <HF_TOKEN> \
 ```

@@ -258,6 +264,7 @@ We really value our community and the contributions made by our wonderful users.
 - [@fyabc](https://github.com/fyabc) for adding Qwen2 models, tokenizer, and recipe integration to torchtune
 - [@solitude-alive](https://github.com/solitude-alive) for adding the [Gemma 2B model](torchtune/models/gemma/) to torchtune, including recipe changes, numeric validations of the models and recipe correctness
 - [@yechenzhi](https://github.com/yechenzhi) for adding [Direct Preference Optimization (DPO)](recipes/lora_dpo_single_device.py) to torchtune, including the recipe and config along with correctness checks
+- [@Optimox](https://github.com/Optimox) for adding all the [Gemma2 variants](torchtune/models/gemma2) to torchtune!

 &nbsp;

docs/source/_templates/layout.html (7 additions, 1 deletion)

@@ -15,7 +15,13 @@
   var collapsedSections = ['Introduction', 'Getting Started', 'Tutorials']
   </script> -->

-  <script script type="text/javascript">
+  <script type="text/javascript">
     var collapsedSections = []
   </script>
+  {{ super() }}
+  <script type="text/javascript">
+    $(document).ready(function() {
+      $(".main-menu-item a[href='https://github.com/pytorch/pytorch']").attr("href", "https://github.com/pytorch/torchtune");
+    });
+  </script>
 {% endblock %}

docs/source/api_ref_data.rst (0 additions, 16 deletions)

@@ -6,8 +6,6 @@ torchtune.data

 .. currentmodule:: torchtune.data

-.. _chat_formats:
-
 Text templates
 --------------

@@ -18,14 +16,12 @@ and models.
    :toctree: generated/
    :nosignatures:

-   InstructTemplate
    GrammarErrorCorrectionTemplate
    SummarizeTemplate
    QuestionAnswerTemplate
    PromptTemplate
    PromptTemplateInterface
    ChatMLTemplate
-   ChatFormat

 Types
 -----

@@ -37,18 +33,6 @@ Types
    Message
    Role

-Converters
-----------
-
-Converts data from common JSON formats into a torchtune :class:`Message`.
-
-.. autosummary::
-   :toctree: generated/
-   :nosignatures:
-
-   get_sharegpt_messages
-   get_openai_messages
-
 .. _message_transforms_ref:

 Message transforms

docs/source/api_ref_datasets.rst (3 additions, 2 deletions)

@@ -6,11 +6,11 @@ torchtune.datasets

 .. currentmodule:: torchtune.datasets

-For a detailed general usage guide, please see our :ref:`datasets tutorial <dataset_tutorial_label>`.
+For a detailed general usage guide, please see :ref:`datasets_overview`.

 Text datasets
-------------------
+-------------

 torchtune supports several widely used text-only datasets to help quickly bootstrap your fine-tuning.

@@ -37,6 +37,7 @@ Image + Text datasets

    multimodal.llava_instruct_dataset
    multimodal.the_cauldron_dataset
+   multimodal.vqa_dataset

 .. _dataset_builders:

docs/source/api_ref_models.rst (76 additions, 4 deletions)

@@ -208,6 +208,47 @@ To download the CodeLlama-7B model:
    code_llama2.lora_code_llama2_70b
    code_llama2.qlora_code_llama2_70b

+qwen-2.5
+--------
+
+Models of size 0.5B, 1.5B, 3B, 7B, 14B, 32B, 72B from the `Qwen2.5 family <https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e>`_.
+
+To download the Qwen2.5 1.5B model, for example:
+
+.. code-block:: bash
+
+    tune download Qwen/Qwen2.5-1.5B-Instruct --output-dir /tmp/Qwen2_5-1_5B-Instruct --ignore-patterns None
+
+.. autosummary::
+    :toctree: generated/
+    :nosignatures:
+
+    qwen2_5.qwen2_5_0_5b
+    qwen2_5.lora_qwen2_5_0_5b
+    qwen2_5.qwen2_5_1_5b_base
+    qwen2_5.qwen2_5_1_5b_instruct
+    qwen2_5.lora_qwen2_5_1_5b_base
+    qwen2_5.lora_qwen2_5_1_5b_instruct
+    qwen2_5.qwen2_5_3b
+    qwen2_5.lora_qwen2_5_3b
+    qwen2_5.qwen2_5_7b_base
+    qwen2_5.qwen2_5_7b_instruct
+    qwen2_5.lora_qwen2_5_7b_base
+    qwen2_5.lora_qwen2_5_7b_instruct
+    qwen2_5.qwen2_5_14b_base
+    qwen2_5.qwen2_5_14b_instruct
+    qwen2_5.lora_qwen2_5_14b_base
+    qwen2_5.lora_qwen2_5_14b_instruct
+    qwen2_5.qwen2_5_32b_base
+    qwen2_5.qwen2_5_32b_instruct
+    qwen2_5.lora_qwen2_5_32b_base
+    qwen2_5.lora_qwen2_5_32b_instruct
+    qwen2_5.qwen2_5_72b_base
+    qwen2_5.qwen2_5_72b_instruct
+    qwen2_5.lora_qwen2_5_72b_base
+    qwen2_5.lora_qwen2_5_72b_instruct
+    qwen2_5.qwen2_5_tokenizer
+
 qwen-2
 ------

@@ -225,12 +266,12 @@ To download the Qwen2 1.5B model, for example:

    qwen2.qwen2
    qwen2.lora_qwen2
-   qwen2.qwen2_7b
    qwen2.qwen2_0_5b
-   qwen2.qwen2_1_5b
-   qwen2.lora_qwen2_7b
    qwen2.lora_qwen2_0_5b
+   qwen2.qwen2_1_5b
    qwen2.lora_qwen2_1_5b
+   qwen2.qwen2_7b
+   qwen2.lora_qwen2_7b
    qwen2.qwen2_tokenizer

 phi-3

@@ -320,8 +361,39 @@ To download the Gemma 7B model:
    gemma.gemma_tokenizer


+gemma2 :
+--------
+
+Models of size 2B, 9B, 27B from the `Gemma family <https://blog.google/technology/developers/gemma-open-models/>`_.
+
+Important: You need to request access on `Hugging Face <https://huggingface.co/google/gemma-2-2b>`__ to use this model.
+
+To download the Gemma2 2B, 9B, 27B models :
+
+.. code-block:: bash
+
+    tune download google/gemma-2-<MODEL_SIZE>b --ignore-patterns "gemma-2-<MODEL_SIZE>b.gguf" --hf-token <HF_TOKEN>
+
+
+.. autosummary::
+    :toctree: generated/
+    :nosignatures:
+
+    gemma2.gemma2
+    gemma2.lora_gemma2
+    gemma2.gemma2_2b
+    gemma2.lora_gemma2_2b
+    gemma2.qlora_gemma2_2b
+    gemma2.gemma2_9b
+    gemma2.lora_gemma2_9b
+    gemma2.qlora_gemma2_9b
+    gemma2.gemma2_27b
+    gemma2.lora_gemma2_27b
+    gemma2.qlora_gemma2_27b
+    gemma.gemma_tokenizer
+
 clip
------
+----

 Vision components to support multimodality using `CLIP encoder <https://arxiv.org/abs/2103.00020>`_.
docs/source/api_ref_modules.rst (2 additions, 0 deletions)

@@ -71,9 +71,11 @@ PEFT Components
    :nosignatures:

    peft.LoRALinear
+   peft.DoRALinear
    peft.AdapterModule
    peft.get_adapter_params
    peft.set_trainable_params
+   peft.get_adapter_state_dict
    peft.validate_missing_and_unexpected_for_lora
    peft.disable_adapter

docs/source/api_ref_training.rst (1 addition, 0 deletions)

@@ -53,6 +53,7 @@ Utilities for enabling and working with distributed training.

    init_distributed
    is_distributed
    get_world_size_and_rank
+   gather_cpu_state_dict

 .. _ac_label:
