updating tutorials

SalmanMohammadi committed Aug 24, 2024
1 parent cdbe0d9 commit 3d06179
Showing 3 changed files with 43 additions and 138 deletions.
98 changes: 0 additions & 98 deletions docs/source/_templates/_recipe_template.rst

This file was deleted.

55 changes: 31 additions & 24 deletions docs/source/recipes/lora_finetune_single_device.rst
@@ -4,25 +4,13 @@
LoRA Single Device Finetuning
=============================

This recipe supports finetuning on next-token prediction tasks using `LoRA <https://arxiv.org/abs/2106.09685>`_,
a technique to significantly reduce memory consumption during training whilst still maintaining competitive performance.
This recipe supports finetuning on next-token prediction tasks using parameter-efficient fine-tuning (PEFT) techniques
such as `LoRA <https://arxiv.org/abs/2106.09685>`_ and `QLoRA <https://arxiv.org/abs/2305.14314>`_. These techniques
significantly reduce memory consumption during training whilst still maintaining competitive performance.

Interested in using this recipe? Check out some of our awesome tutorials to show off how it can be used:

* :ref:`Finetuning Llama2 with LoRA<lora_finetune_label>`
* :ref:`End-to-End Workflow with torchtune<dataset_tutorial_label>`
* :ref:`Fine-tuning Llama3 with Chat Data<chat_tutorial_label>`
* :ref:`Meta Llama3 in torchtune<llama3_label>`
* :ref:`Fine-Tune Your First LLM<finetune_llama_label>`

The best way to get started with our recipes is through the :ref:`cli_label`, which allows you to
list all our recipes and configs, run recipes, copy configs and recipes, and validate configs
without touching a line of code!

For example, if you're interested in using this recipe with the latest `Llama models <https://llama.meta.com/>`_, you can fine-tune
We provide pre-tested out-of-the-box configs so you can get up and running with the latest `Llama models <https://llama.meta.com/>`_
in just two steps:


.. note::

You may need to be granted access to the Llama model you're interested in. See
@@ -38,17 +26,28 @@ in just two steps:
tune run lora_finetune_single_device \
--config llama3_1/8B_lora_single_device
You can quickly customize this recipe through the :ref:`cli_label`. For example, when fine-tuning with LoRA, you can adjust the layers to which LoRA is applied
and the scale of LoRA's impact during training:

.. code-block:: bash

  tune run lora_finetune_single_device \
  --config llama3_1/8B_lora_single_device \
  --model.lora_attn_modules=["q_proj", "k_proj", "v_proj"] \
  --model.apply_lora_to_mlp=True \
  --model.lora_rank=64 \
  --model.lora_alpha=128
Most of you will want to twist, pull, and bop all the different levers, buttons, and knobs we expose in our recipes. Check out our
:ref:`configs tutorial <config_tutorial_label>` to learn how to customize recipes to suit your needs.
This particular configuration results in an aggressive LoRA policy which
trades off increased memory usage and slower training for higher training accuracy.
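If you'd rather copy and edit a config than override values on the command line, the equivalent settings would look
roughly like the sketch below. The builder name here is an assumption; check your copied
``llama3_1/8B_lora_single_device`` config for the exact ``_component_``:

.. code-block:: yaml

  # Rough sketch only -- the component path below is an assumption, not
  # taken from the shipped config.
  model:
    _component_: torchtune.models.llama3_1.lora_llama3_1_8b
    lora_attn_modules: ['q_proj', 'k_proj', 'v_proj']
    apply_lora_to_mlp: True
    lora_rank: 64
    lora_alpha: 128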

This recipe is an example of parameter efficient fine-tuning (PEFT). To understand the different
levers you can pull, see our documentation for the different PEFT training paradigms we support:
For a deeper understanding of the different levers you can pull when using this recipe,
see our documentation for the different PEFT training paradigms we support:

* :ref:`glossary_lora`.
* :ref:`glossary_qlora`.
* :ref:`glossary_lora`
* :ref:`glossary_qlora`
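For example, switching from LoRA to QLoRA is typically just a matter of swapping the model builder in your config.
A rough sketch, where the QLoRA builder name is an assumption rather than something confirmed here:

.. code-block:: yaml

  # Hypothetical: swap the LoRA builder for its QLoRA counterpart so the
  # frozen base weights are also quantized. Check the shipped QLoRA
  # configs for the exact component path.
  model:
    _component_: torchtune.models.llama3_1.qlora_llama3_1_8b
    lora_attn_modules: ['q_proj', 'v_proj']
    apply_lora_to_mlp: False
    lora_rank: 8
    lora_alpha: 16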

As with all of our recipes, you can also:
Many of our other memory optimization features can be used in this recipe, too:

* Adjust :ref:`model precision <glossary_precision>`.
* Use :ref:`activation checkpointing <glossary_act_ckpt>`.
@@ -57,4 +56,12 @@ As with all of our recipes, you can also:
significantly reduces memory usage due to gradient state, you will likely not need this
feature.
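Several of these features can be combined in a single config. The sketch below is illustrative only; treat the key
names as assumptions based on typical torchtune single-device configs, and check your copied config for the exact
fields it exposes:

.. code-block:: yaml

  # Illustrative sketch -- key names are assumptions, not taken from a
  # shipped config.
  dtype: bf16                            # model precision
  enable_activation_checkpointing: True  # activation checkpointing
  gradient_accumulation_steps: 8         # gradient accumulation

  # Lower precision optimizer, e.g. a paged 8-bit AdamW from bitsandbytes
  optimizer:
    _component_: bitsandbytes.optim.PagedAdamW8bit
    lr: 2e-5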

If you're interested in an overview of our memory optimization features, check out our :ref:`memory optimization overview<memory_optimization_overview_label>`!
You can learn more about all of our memory optimization features in our :ref:`memory optimization overview<memory_optimization_overview_label>`.

Interested in seeing this recipe in action? Check out some of our tutorials which show how it can be used:

* :ref:`Finetuning Llama2 with LoRA<lora_finetune_label>`
* :ref:`End-to-End Workflow with torchtune<dataset_tutorial_label>`
* :ref:`Fine-tuning Llama3 with Chat Data<chat_tutorial_label>`
* :ref:`Meta Llama3 in torchtune<llama3_label>`
* :ref:`Fine-Tune Your First LLM<finetune_llama_label>`
28 changes: 12 additions & 16 deletions docs/source/recipes/qat_distributed.rst
@@ -5,7 +5,7 @@ Distributed Quantization-Aware Training (QAT)
=============================================

QAT allows for taking advantage of memory-saving optimizations from quantization at inference time, without significantly
degrading model performance. In torchtune, we use `torchao <https://github.com/pytorch/ao>`_ to implement QAT during training.
degrading model performance. In torchtune, we use `torchao <https://github.com/pytorch/ao>`_ to implement QAT.
This works by :ref:`simulating quantization numerics during fine-tuning <what_is_qat_label>`. While this may introduce memory and
compute overheads during training, our tests found that QAT significantly reduced performance degradation in evaluations of
the quantized model, without compromising on model size reduction gains.
@@ -15,23 +15,14 @@ quantized model, without compromising on model size reduction gains.
The `PyTorch blogpost <https://pytorch.org/blog/quantization-aware-training/>`_ on QAT provides further insight into how QAT works.


Interested in using this recipe? Check out some of our tutorials which show how it is used:

* :ref:`qat_finetune_label`

The best way to get started with our recipes is through the :ref:`cli_label`, which allows you to
list all our recipes and configs, run recipes, copy configs and recipes, and validate configs
without touching a line of code!

For example, if you're interested in using this recipe with the latest `Llama models <https://llama.meta.com/>`_, you can fine-tune
We provide pre-tested out-of-the-box configs so you can get up and running with the latest `Llama models <https://llama.meta.com/>`_
in just two steps:

.. note::

You may need to be granted access to the Llama model you're interested in. See
:ref:`here <download_llama_label>` for details on accessing gated repositories.


.. code-block:: bash

  tune download meta-llama/Meta-Llama-3-8B-Instruct \
@@ -46,9 +37,6 @@ in just two steps:
This workload requires at least 6 GPUs, each with VRAM of at least 80GB.


Most of you will want to twist, pull, and bop all the different levers, buttons, and knobs we expose in our recipes. Check out our
:ref:`configs tutorial <config_tutorial_label>` to learn how to customize recipes to suit your needs.

Currently, the main lever you can pull for QAT is by using *delayed fake quantization*.
Delayed fake quantization allows for control over the step after which fake quantization occurs.
Empirically, allowing the model to finetune without fake quantization initially allows the
@@ -81,12 +69,20 @@ strategy. Generally, the pipeline for training, quantizing, and evaluating a mod
_component_: torchtune.utils.quantization.Int8DynActInt4WeightQuantizer
groupsize: 256
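Going back to the *delayed fake quantization* lever above: it applies during training rather than at the
post-training quantization step, and might be configured roughly as in the sketch below. Both the component path and
the ``fake_quant_after_n_steps`` field are assumptions here, so check the shipped QAT configs for the exact names:

.. code-block:: yaml

  # Hypothetical sketch of delayed fake quantization in a QAT training
  # config -- the component path and field name are assumptions.
  quantizer:
    _component_: torchtune.utils.quantization.Int8DynActInt4WeightQATQuantizer
    groupsize: 256

  # Skip fake quantization for the first 1000 training steps
  fake_quant_after_n_steps: 1000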
As with all of our recipes, you can also:
.. note::

We're using config files to show how to customize the recipe in these examples. Check out the
:ref:`configs tutorial <config_tutorial_label>` to learn more.

Many of our other memory optimization features can be used in this recipe, too:

* Adjust :ref:`model precision <glossary_precision>`.
* Use :ref:`activation checkpointing <glossary_act_ckpt>`.
* Enable :ref:`gradient accumulation <glossary_grad_accm>`.
* Use :ref:`lower precision optimizers <glossary_low_precision_opt>`.

You can learn more about all of our memory optimization features in our :ref:`memory optimization overview<memory_optimization_overview_label>`.

If you're interested in an overview of our memory optimization features, check out our :ref:`memory optimization overview<memory_optimization_overview_label>`!
Interested in seeing this recipe in action? Check out some of our tutorials which show how it can be used:

* :ref:`qat_finetune_label`
