Add tutorial for Llama-3-8B LoRA training and deployment #9359
Conversation
shashank3959 commented on Jun 3, 2024
- Adds a notebook for Llama-3-8B LoRA PEFT with NeMo FW
- Adds a notebook for sending multi-LoRA inference requests to NVIDIA NIM
- Adds a README with context and setup instructions
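For reviewers who want to try the notebooks, the working environment boils down to the NeMo Framework container (for the LoRA PEFT notebook) plus the NIM container covered later in the README. A minimal launch sketch; the container tag here is an illustrative assumption, not the one pinned by this PR::

    # Launch the NeMo Framework container and mount the working directory
    # (the :24.05 tag is an assumed example release).
    docker run --rm -it --gpus all \
        -v ${PWD}:/workspace -w /workspace \
        -p 8888:8888 \
        nvcr.io/nvidia/nemo:24.05 bash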
Force-pushed from 052b5b4 to 3376844.
@chrisalexiuk-nvidia @vinhngx could you please review?
Adding @nealvaidya for any review comments covering the NIM part.
Few things, mostly in readme. NIM notebook looks good
tutorials/llm/llama-3/README.rst
Outdated
`2. Multi-LoRA inference with NVIDIA NIM <./llama3-lora-deploy-nim.ipynb>`__
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is a demonstration of deploying multiple LoRA adapters with NVIDIA
This procedure demonstrates how to deploy multiple LoRA adapters with NVIDIA . . .
tutorials/llm/llama-3/README.rst
Outdated
Hugging Face model formats. We will deploy the PubMedQA LoRA adapter
from the first notebook, alongside two other already trained LoRA adapters
(`GSM8K <https://github.com/openai/grade-school-math>`__,
`SQuAD <https://rajpurkar.github.io/SQuAD-explorer/>`__) that are
available on NVIDIA NGC as examples.
You will deploy the PubMedQA LoRA adapter
from the first notebook, alongside two previously trained LoRA adapters
(`GSM8K <https://github.com/openai/grade-school-math>`__,
`SQuAD <https://rajpurkar.github.io/SQuAD-explorer/>`__) that are
available on NVIDIA NGC as examples.
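For context on what multi-LoRA inference looks like once the adapters are loaded: NIM exposes an OpenAI-compatible API on port 8000, and the adapter is selected through the ``model`` field of the request. A minimal sketch, with the adapter name as a placeholder for whatever the model store directory is called::

    # Send a completion request to the PubMedQA adapter served by NIM
    # (adapter name is illustrative; it must match an entry in the model store).
    curl -s http://localhost:8000/v1/completions \
        -H "Content-Type: application/json" \
        -d '{
              "model": "llama3-8b-pubmedqa-lora",
              "prompt": "Does aspirin reduce cardiovascular risk? Answer yes, no, or maybe:",
              "max_tokens": 32
            }'
    # Querying the GSM8K or SQuAD adapters is the same call with a different "model" value.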
tutorials/llm/llama-3/README.rst
Outdated
``NOTE``: While it’s not necessary to complete the LoRA training and
obtain the adapter from the previous notebook (“Creating a LoRA adapter
with NeMo Framework”) to follow along with this one, it is recommended
if possible. You can still learn about LoRA deployment with NIM using
the other adapters downloaded from NGC.
.. note::
   Although it’s not necessary that you complete the LoRA training and secure the adapter from the preceding notebook (“Creating a LoRA adapter with NeMo Framework”) to proceed with this one, it is advisable. Regardless, you can continue to learn about LoRA deployment with NIM using other adapters that you’ve downloaded from NVIDIA NGC.
tutorials/llm/llama-3/README.rst
Outdated
1. Download example LoRA adapters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Download the example LoRA adapters.
Comment on edits: Changes this step to a sentence and removes the Heading 3 tag ``^^^^^^^^^^^^^^^^^^``.
tutorials/llm/llama-3/README.rst
Outdated
1. Download example LoRA adapters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The following steps assume that you have authenticated with NGC and downloaded the CLI tool, as mentioned in pre-requisites.
The following steps assume that you have authenticated with NGC and downloaded the CLI tool, as listed in the Requirements section.
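For reference, that download step reduces to a couple of NGC CLI calls of this shape; the registry path and version below are placeholders, and the README lists the actual ones::

    # One-time CLI setup: supply the NGC API key and org when prompted.
    ngc config set

    # Download an example LoRA adapter (registry path and version are placeholders).
    ngc registry model download-version "org/team/llama3-8b-lora-example:1.0"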
tutorials/llm/llama-3/README.rst
Outdated
2. Prepare the LoRA model store
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Prepare the LoRA model store.
Comment on edits: Changes this step to a sentence and removes the Heading 3 tag ``^^^^^^^^^^^^^^^^^^``.
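As a sketch of what preparing the LoRA model store means in practice: NIM discovers adapters from a directory containing one subdirectory per adapter, and the subdirectory name becomes the servable adapter name. The layout below reflects my reading of the NIM multi-LoRA docs and may differ in detail from the tutorial::

    # One subdirectory per adapter; the directory name is what you pass
    # in the "model" field of inference requests.
    export LOCAL_PEFT_DIRECTORY=${PWD}/loras
    mkdir -p ${LOCAL_PEFT_DIRECTORY}/llama3-8b-pubmedqa-lora
    cp llama3_8b_pubmedqa_lora.nemo ${LOCAL_PEFT_DIRECTORY}/llama3-8b-pubmedqa-lora/
    # Repeat for the GSM8K and SQuAD adapters downloaded from NGC.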
tutorials/llm/llama-3/README.rst
Outdated
3. Set-up NIM
^^^^^^^^^^^^^
- Set Up NIM.
Comment on edits: Changes this step to a sentence and removes the Heading 3 tag ``^^^^^^^^^^^^^^^^^^``.
tutorials/llm/llama-3/README.rst
Outdated
-p 8000:8000 \
nvcr.io/nim/meta/llama3-8b-instruct:1.0.0

The first time you run the command, it will download the model and cache it in ``$NIM_CACHE_PATH`` so subsequent deployments are even faster. There are several options to configure NIM other than the ones listed above, and you can find a full list in `NIM configuration <https://docs.nvidia.com/nim/large-language-models/latest/configuration.html>`__ documentation.
There are several options to configure NIM other than the ones listed above. You can find a full list in NIM configuration documentation.
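For completeness, the full launch command has roughly this shape; everything beyond the image name and port mapping quoted above (cache mount, PEFT-source variable, in-container paths) is an assumption drawn from the NIM docs, and the README remains the authoritative version::

    # Start NIM with a local cache and the LoRA model store mounted in.
    # NGC_API_KEY must already be exported; NIM_PEFT_SOURCE points at the
    # in-container copy of the store prepared in the previous step.
    export NIM_CACHE_PATH=${PWD}/nim-cache
    mkdir -p ${NIM_CACHE_PATH}
    docker run --rm --gpus all \
        -e NGC_API_KEY \
        -e NIM_PEFT_SOURCE=/home/nvs/loras \
        -v ${LOCAL_PEFT_DIRECTORY}:/home/nvs/loras \
        -v ${NIM_CACHE_PATH}:/opt/nim/.cache \
        -p 8000:8000 \
        nvcr.io/nim/meta/llama3-8b-instruct:1.0.0

Once the server reports ready, ``curl http://localhost:8000/v1/models`` should list the base model along with each adapter found in the store.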
tutorials/llm/llama-3/README.rst
Outdated
4. Start the notebook
^^^^^^^^^^^^^^^^^^^^^
- Start the notebook.
Comment on edits: Changes this step to a sentence and removes the Heading 3 tag ``^^^^^^^^^^^^^^^^^^``.
tutorials/llm/llama-3/README.rst
Outdated
From another terminal, follow the same instructions as the previous
notebook to launch Jupyter Lab, and navigate to `this notebook <./llama3-lora-deploy-nim.ipynb>`__.

You may use the same NeMo
You can use the same NeMo
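For the notebook step, launching Jupyter Lab inside the container typically looks like the following; the port and flags are conventional choices rather than anything mandated by the README::

    # Start Jupyter Lab inside the container, then open
    # llama3-lora-deploy-nim.ipynb from the browser.
    jupyter lab --ip 0.0.0.0 --port 8888 --allow-root --no-browser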
Completed review of README.rst
tutorials/llm/llama-3/README.rst
Outdated
@@ -1,63 +1,47 @@
Llama 3 LoRA fine-tuning and deployment with NeMo Framework and NVIDIA NIM
Llama 3 LoRA Fine-Tuning and Deployment with NeMo Framework and NVIDIA NIM
Comment on edits: Missed this change. Title should be capitalized.
Updated
I missed one change. Title should be capitalized: Llama 3 LoRA Fine-Tuning and Deployment with NeMo Framework and NVIDIA NIM
Approved comments
Approving based on other reviews.
* Add tutorial for Llama-3-8B LoRA training and deployment

  * Adds a notebook for Llama-3-8B LoRA PEFT with NeMo FW
  * Adds a notebook for sending multi-LoRA inference requests to NIM
  * Adds a README with context and setup instructions

* Add inference for other LoRAs in deployment notebook
* Fix typo in path in LoRA training notebook
* Fix typos and add end-2-end diagram
* Fix minor issue in architecture diagram
* Convert README from .md to .rst
* Minor updates to README
* Fix typo in deployment notebook
* Incorporate review suggestions
* Minor updates to README
* Remove access token (invalidate and remove the HF access token)
* Fix broken link to NIM docs
* Fix minor typo in README parameter name
* Fix grammar and inconsistencies in style and formatting
* Capitalize title

Signed-off-by: Shashank Verma <shashank3959@gmail.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>