Update PEFT Doc (NVIDIA#8501)
* update peft doc

Signed-off-by: Chen Cui <chcui@nvidia.com>

* remove old prompt learning doc and notebook

Signed-off-by: Chen Cui <chcui@nvidia.com>

* fix table

Signed-off-by: Chen Cui <chcui@nvidia.com>

* fix table

Signed-off-by: Chen Cui <chcui@nvidia.com>

* fix table

Signed-off-by: Chen Cui <chcui@nvidia.com>

* revert accidental commit

Signed-off-by: Chen Cui <chcui@nvidia.com>

* revert accidental commit

Signed-off-by: Chen Cui <chcui@nvidia.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: Zeeshan Patel <zeeshanp@berkeley.edu>
cuichenx authored and zpx01 committed Mar 8, 2024
1 parent 7f236b3 commit 2b8e54e
Showing 5 changed files with 18 additions and 1,192 deletions.
12 changes: 6 additions & 6 deletions README.rst
@@ -57,19 +57,19 @@ such as FSDP, Mixture-of-Experts, and RLHF with TensorRT-LLM to provide speedups
Introduction
------------

NVIDIA NeMo Framework is a generative AI framework built for researchers and PyTorch developers
working on large language models (LLMs), multimodal models (MM), automatic speech recognition (ASR),
and text-to-speech synthesis (TTS).
The primary objective of NeMo is to provide a scalable framework for researchers and developers from industry and academia
to more easily implement and design new generative AI models by being able to leverage existing code and pretrained models.

For technical documentation, please see the `NeMo Framework User Guide <https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/index.html>`_.

All NeMo models are trained with `Lightning <https://github.com/Lightning-AI/lightning>`_ and
training is automatically scalable to thousands of GPUs.

When applicable, NeMo models take advantage of the latest possible distributed training techniques,
including parallelism strategies such as

* data parallelism
* tensor parallelism
@@ -84,7 +84,7 @@ and mixed precision training recipes with bfloat16 and FP8 training.
NeMo's Transformer-based LLM and Multimodal models leverage `NVIDIA Transformer Engine <https://github.com/NVIDIA/TransformerEngine>`_ for FP8 training on NVIDIA Hopper GPUs
and `NVIDIA Megatron Core <https://github.com/NVIDIA/Megatron-LM/tree/main/megatron/core>`_ for scaling transformer model training.

NeMo LLMs can be aligned with state-of-the-art methods such as SteerLM, DPO, and Reinforcement Learning from Human Feedback (RLHF);
see `NVIDIA NeMo Aligner <https://github.com/NVIDIA/NeMo-Aligner>`_ for more details.

NeMo LLM and Multimodal models can be deployed and optimized with `NVIDIA Inference Microservices (Early Access) <https://developer.nvidia.com/nemo-microservices-early-access>`_.
@@ -93,7 +93,7 @@ NeMo ASR and TTS models can be optimized for inference and deployed for production

For scaling NeMo LLM and Multimodal training on Slurm clusters or public clouds, please see the `NVIDIA Framework Launcher <https://github.com/NVIDIA/NeMo-Megatron-Launcher>`_.
The NeMo Framework Launcher has extensive recipes, scripts, utilities, and documentation for training NeMo LLMs and Multimodal models, and also has an `Autoconfigurator <https://github.com/NVIDIA/NeMo-Megatron-Launcher#53-using-autoconfigurator-to-find-the-optimal-configuration>`_
which can be used to find the optimal model parallel configuration for training on a specific cluster.
To get started quickly with the NeMo Framework Launcher, please see the `NeMo Framework Playbooks <https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/index.html>`_.
The NeMo Framework Launcher does not currently support ASR and TTS training but will soon.

16 changes: 8 additions & 8 deletions docs/source/nlp/nemo_megatron/peft/landing_page.rst
@@ -12,14 +12,14 @@ fraction of the computational and storage costs.
NeMo supports four PEFT methods which can be used with various
transformer-based models.

==================== ===== ===== ========= ==
\                    GPT 3 NvGPT LLaMa 1/2 T5
==================== ===== ===== ========= ==
Adapters (Canonical) ✅    ✅    ✅        ✅
LoRA                 ✅    ✅    ✅        ✅
IA3                  ✅    ✅    ✅        ✅
P-Tuning             ✅    ✅    ✅        ✅
==================== ===== ===== ========= ==

==================== ===== ======== ========= ====== ==
\                    GPT 3 Nemotron LLaMa 1/2 Falcon T5
==================== ===== ======== ========= ====== ==
LoRA                 ✅    ✅       ✅        ✅     ✅
P-Tuning             ✅    ✅       ✅        ✅     ✅
Adapters (Canonical) ✅    ✅       ✅               ✅
IA3                  ✅    ✅       ✅               ✅
==================== ===== ======== ========= ====== ==
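
Each method in the updated table maps to a PEFT config object that gets attached to the model (see the quick-start snippet further down this page). The sketch below is illustrative only: ``LoraPEFTConfig`` is the name used in this commit's quick-start example, while the other class names and the import path are assumptions about the NeMo API and may differ between releases.

.. code-block:: python

    # Rough sketch (not part of this commit): selecting a PEFT config class per method.
    # Only LoraPEFTConfig is confirmed by the quick-start example below; the other class
    # names and the import path are assumptions and may vary across NeMo versions.
    from nemo.collections.nlp.parts.peft_config import (
        CanonicalAdaptersPEFTConfig,  # Adapters (Canonical): assumed name
        IA3PEFTConfig,                # IA3: assumed name
        LoraPEFTConfig,               # LoRA: used in the quick start below
        PtuningPEFTConfig,            # P-Tuning: assumed name
    )

    PEFT_CONFIGS = {
        "adapter": CanonicalAdaptersPEFTConfig,
        "ia3": IA3PEFTConfig,
        "lora": LoraPEFTConfig,
        "ptuning": PtuningPEFTConfig,
    }

    def build_peft_cfg(method: str, model_cfg):
        """Instantiate the PEFT config for the chosen method, e.g. build_peft_cfg("lora", model_cfg)."""
        return PEFT_CONFIGS[method](model_cfg)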

Learn more about PEFT in NeMo with the :ref:`peftquickstart`, which provides an overview of how PEFT works
in NeMo. Read about the supported PEFT methods
6 changes: 4 additions & 2 deletions docs/source/nlp/nemo_megatron/peft/quick_start.rst
@@ -62,7 +62,7 @@ Base model classes
PEFT in NeMo is built with a mix-in class that does not belong to any
model in particular. This means that the same interface is available to
different NeMo models. Currently, NeMo supports PEFT for GPT-style
models such as GPT 3, Nemotron, LLaMa 1/2 (``MegatronGPTSFTModel``), as
well as T5 (``MegatronT5SFTModel``).

Full finetuning vs PEFT
@@ -78,11 +78,13 @@ PEFT.
trainer = MegatronTrainerBuilder(config).create_trainer()
model_cfg = MegatronGPTSFTModel.merge_cfg_with(config.model.restore_from_path, config)
### Training API ###
model = MegatronGPTSFTModel.restore_from(restore_path, model_cfg, trainer) # restore from pretrained ckpt
+ peft_cfg = LoraPEFTConfig(model_cfg)
+ model.add_adapter(peft_cfg)
trainer.fit(model) # saves adapter weights only
### Inference API ###
# Restore from base then load adapter API
model = MegatronGPTSFTModel.restore_from(restore_path, trainer, model_cfg)
+ model.load_adapters(adapter_save_path, peft_cfg)
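
Putting the training and inference steps together, a minimal end-to-end sketch might look as follows. The calls ``merge_cfg_with``, ``restore_from``, ``add_adapter``, ``load_adapters``, and ``trainer.fit`` come straight from the example above; the import paths, the keyword-argument form of ``restore_from``, and the Hydra-style ``config`` and path arguments are assumptions, so treat this as illustrative rather than a drop-in script.

.. code-block:: python

    # Illustrative only: consolidates the quick-start steps shown above.
    # Import paths and keyword names are assumptions about the NeMo package layout.
    from nemo.collections.nlp.models.language_modeling.megatron_gpt_sft_model import MegatronGPTSFTModel
    from nemo.collections.nlp.parts.megatron_trainer_builder import MegatronTrainerBuilder
    from nemo.collections.nlp.parts.peft_config import LoraPEFTConfig

    def run_lora(config, restore_path, adapter_save_path):
        trainer = MegatronTrainerBuilder(config).create_trainer()
        model_cfg = MegatronGPTSFTModel.merge_cfg_with(config.model.restore_from_path, config)

        # Training: restore the frozen base model, attach LoRA adapters, train adapters only.
        model = MegatronGPTSFTModel.restore_from(restore_path, override_config_path=model_cfg, trainer=trainer)
        peft_cfg = LoraPEFTConfig(model_cfg)  # swap in another PEFT config class for a different method
        model.add_adapter(peft_cfg)
        trainer.fit(model)  # checkpoints contain adapter weights only

        # Inference: restore the base model again, then load the trained adapter weights.
        model = MegatronGPTSFTModel.restore_from(restore_path, override_config_path=model_cfg, trainer=trainer)
        model.load_adapters(adapter_save_path, peft_cfg)
        return model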