Commit b01ff58 · RdoubleA committed Oct 29, 2024 · 1 parent b2b01d8
Showing 2 changed files with 15 additions and 10 deletions.

docs/source/basics/custom_components.rst
Custom Components and Recipes
=============================

torchtune lets you launch fine-tuning jobs directly from the command-line using both built-in and custom components,
such as datasets, models, recipes, and configs. This is done with the ``tune run`` command (see :ref:`cli_label`),
which can also be used from your project folder.

Setting up your torchtune project
---------------------------------
First, ensure that you have torchtune installed; see :ref:`install_label`. This will install the ``tune`` command
in your environment, so you can launch ``tune run`` from any directory. Let's create a new project directory and ensure
we can launch a built-in library recipe with a library config from that folder.

.. code-block:: bash

    mkdir ~/my_project
    cd ~/my_project
    # for example, a built-in LoRA recipe with a built-in config
    tune run lora_finetune_single_device --config llama3_2/1B_lora_single_device

Next, we can define our custom model and dataset builders in our project directory:
.. code-block:: python

    # models/custom_decoder.py
    def custom_model(num_layers: int, classification_head: bool = False):
        # Create your custom model architecture here
        ...
        # Return the module you want to train
        return CustomTransformerDecoder(...)

This allows us to expose our custom model in a config-friendly manner: rather than having to define every argument needed to
construct our custom model in our config, we only expose the arguments which we care about modifying. This is how we implement
our models in torchtune; see :func:`~torchtune.models.llama3_2_vision.llama3_2_vision_11b` as an example.

.. code-block:: python

    # datasets/custom_dataset.py
    def tiny_codes(tokenizer, packed: bool = False):
        # The tokenizer (or model transform) is passed in automatically
        # as the first positional argument
        ...

If you are using a default torchtune recipe with a custom dataset, you must define the first
positional argument to be the tokenizer or model transform. These are automatically passed into
the dataset during instantiation and are defined separately in the config, not under the dataset field.
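To make this concrete, here is a minimal, hypothetical sketch of how a recipe can inject the tokenizer as the first positional argument when instantiating the dataset builder. The helper names below are illustrative only, not torchtune's actual API; real recipes use the utilities in ``torchtune.config``.

```python
# Hypothetical sketch of tokenizer injection; the helper names here are
# illustrative, not torchtune's real implementation.
def instantiate_dataset(dataset_cfg, tokenizer):
    kwargs = dict(dataset_cfg)
    # here the builder is already a callable; in a real config it would be
    # resolved from the _component_ dotted path
    builder = kwargs.pop("_component_")
    # the tokenizer is passed positionally; everything else comes from
    # the fields under the ``dataset`` config entry
    return builder(tokenizer, **kwargs)

def tiny_codes(tokenizer, packed=False):
    # a stand-in dataset builder: just record what it received
    return {"tokenizer": tokenizer, "packed": packed}

ds = instantiate_dataset({"_component_": tiny_codes, "packed": True}, tokenizer="my_tokenizer")
print(ds)  # {'tokenizer': 'my_tokenizer', 'packed': True}
```

Because the tokenizer arrives positionally, the config only needs to declare the remaining keyword arguments, such as ``packed``.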

You can define the custom model and custom dataset in the config using the relative import path from where
you are launching with ``tune run``. It is best to define the path relative to your project root directory
and launch from there.

.. code-block:: yaml

    model:
      _component_: models.custom_decoder.custom_model
      num_layers: 32
      # this is an optional param, so you can also omit this from the config
      classification_head: False

    dataset:
      _component_: datasets.custom_dataset.tiny_codes
      # we don't need to define a tokenizer here as it's automatically passed in
      packed: True
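The ``_component_`` field is a dotted import path. As a rough illustration of how such a path can be resolved into a callable and invoked with the remaining config fields (torchtune's actual implementation lives in ``torchtune.config`` and differs in detail), consider this sketch, which uses stdlib callables in place of custom builders:

```python
import importlib

def resolve_component(dotted_path):
    # split "pkg.module.attr" into a module path and an attribute name
    module_path, _, attr_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)

def instantiate(cfg):
    # build the object named by _component_ using the remaining
    # config fields as keyword arguments
    kwargs = dict(cfg)
    component = resolve_component(kwargs.pop("_component_"))
    return component(**kwargs)

# stdlib callables standing in for custom model/dataset builders
sqrt = resolve_component("math.sqrt")
print(sqrt(9.0))  # 3.0

frac = instantiate({"_component_": "fractions.Fraction", "numerator": 3, "denominator": 4})
print(frac)  # 3/4
```

This is why the relative import path matters: the dotted path in the config must be importable from the directory where you launch ``tune run``.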
.. code-block:: bash

    # launch the recipe with your custom config (file name is illustrative)
    tune run lora_finetune_single_device --config custom_config.yaml
2 changes: 1 addition & 1 deletion docs/source/index.rst
basics/message_transforms
basics/tokenizers
basics/prompt_templates
basics/packing
basics/custom_components

.. toctree::
   :glob:
