diff --git a/demo2.ipynb b/demo2.ipynb
new file mode 100644
index 0000000000..009164652d
--- /dev/null
+++ b/demo2.ipynb
@@ -0,0 +1,2258 @@
+{
+ "cells": [
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# How to Fine-Tune LLMs with TRL\n",
+    "\n",
+    "_Authored by Philipp Schmid_\n",
+    "Original post: [How to Fine-Tune LLMs in 2024 with Hugging Face](https://www.philschmid.de/fine-tune-llms-in-2024-with-trl)\n",
+    "\n",
+    "_Adapted for by Quentin Gallouédec_\n",
+    "\n",
+    "Large Language Models or LLMs have seen a lot of progress in the last year. We went from now ChatGPT competitor to a whole zoo of LLMs, including Meta AI's [Llama 3](https://huggingface.co/blog/llama31), Mistrals [Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) & [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) models, TII [Falcon](https://huggingface.co/tiiuae/falcon-40b), and many more. \n",
+    "Those LLMs can be used for a variety of tasks, including chatbots, question answering, summarization without any additional training. However, if you want to customize a model for your application. You may need to fine-tune the model on your data to achieve higher quality results than prompting or saving cost by training smaller models more efficient model.\n",
+    "\n",
+    " \n",
+    "This blog post walks you thorugh how to fine-tune open LLMs using Hugging Face [TRL](https://huggingface.co/docs/trl/index), [Transformers](https://huggingface.co/docs/transformers/index) & [Datasets](https://huggingface.co/docs/datasets/index). In the blog, we are going to:\n",
+    "\n",
+    "1. Define our use case \n",
+    "2. Setup development environment\n",
+    "3. Create and prepare the dataset\n",
+    "4. Fine-tune LLM using `trl` and the `SFTTrainer` \n",
+    "5. Test and evaluate the LLM\n",
+    "6. Deploy the LLM for Production\n",
+    "\n",
+    "_Note: This blog was created to run on consumer size GPUs (24GB), e.g. NVIDIA A10G or RTX 4090/3090, but can be easily adapted to run on bigger GPUs._\n",
+    "\n",
+    "\n",
+    "## 1. Define our use case \n",
+    "\n",
+    "When fine-tuning LLMs, it is important you know your use case and the task you want to solve. This will help you to choose the right model or help you to create a dataset to fine-tune your model. If you haven't defined your use case yet. You might want to go back to the drawing board.\n",
+    "I want to mention that not all use cases require fine-tuning and it is always recommended to evaluate and try out already fine-tuned models or API-based models before fine-tuning your own model. \n",
+    "\n",
+    "As an example, we are going to use the following use case:\n",
+    "\n",
+    "> We want to fine-tune a model, which can generate SQL queries based on a natural language instruction, which can then be integrated into our BI tool. The goal is to reduce the time it takes to create a SQL query and make it easier for non-technical users to create SQL queries.\n",
+    "\n",
+    "Text to SQL can be a good use case for fine-tuning LLMs, as it is a complex task that requires a lot of (internal) knowledge about the data and the SQL language. \n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2. Setup development environment\n",
+    "\n",
+    "Our first step is to install Hugging Face Libraries and Pyroch, including trl, transformers and datasets. If you haven't heard of trl yet, don't worry. It is a new library on top of transformers and datasets, which makes it easier to fine-tune, rlhf, align open LLMs. \n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 53,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...\n",
+      "To disable this warning, you can either:\n",
+      "\t- Avoid using `tokenizers` before the fork if possible\n",
+      "\t- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Requirement already satisfied: trl in ./nb/lib/python3.11/site-packages (0.12.1)\n",
+      "Requirement already satisfied: hf_transfer in ./nb/lib/python3.11/site-packages (0.1.8)\n",
+      "Requirement already satisfied: accelerate>=0.34.0 in ./nb/lib/python3.11/site-packages (from trl) (1.1.1)\n",
+      "Requirement already satisfied: datasets>=2.21.0 in ./nb/lib/python3.11/site-packages (from trl) (3.1.0)\n",
+      "Requirement already satisfied: rich in ./nb/lib/python3.11/site-packages (from trl) (13.9.4)\n",
+      "Requirement already satisfied: transformers>=4.46.0 in ./nb/lib/python3.11/site-packages (from trl) (4.46.3)\n",
+      "Requirement already satisfied: huggingface-hub>=0.21.0 in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (0.26.2)\n",
+      "Requirement already satisfied: numpy<3.0.0,>=1.17 in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (2.1.3)\n",
+      "Requirement already satisfied: packaging>=20.0 in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (24.2)\n",
+      "Requirement already satisfied: psutil in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (6.1.0)\n",
+      "Requirement already satisfied: pyyaml in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (6.0.2)\n",
+      "Requirement already satisfied: safetensors>=0.4.3 in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (0.4.5)\n",
+      "Requirement already satisfied: torch>=1.10.0 in ./nb/lib/python3.11/site-packages (from accelerate>=0.34.0->trl) (2.5.1)\n",
+      "Requirement already satisfied: filelock in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (3.16.1)\n",
+      "Requirement already satisfied: pyarrow>=15.0.0 in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (18.1.0)\n",
+      "Requirement already satisfied: dill<0.3.9,>=0.3.0 in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (0.3.8)\n",
+      "Requirement already satisfied: pandas in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (2.2.3)\n",
+      "Requirement already satisfied: requests>=2.32.2 in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (2.32.3)\n",
+      "Requirement already satisfied: tqdm>=4.66.3 in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (4.67.1)\n",
+      "Requirement already satisfied: xxhash in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (3.5.0)\n",
+      "Requirement already satisfied: multiprocess<0.70.17 in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (0.70.16)\n",
+      "Requirement already satisfied: fsspec<=2024.9.0,>=2023.1.0 in ./nb/lib/python3.11/site-packages (from fsspec[http]<=2024.9.0,>=2023.1.0->datasets>=2.21.0->trl) (2024.9.0)\n",
+      "Requirement already satisfied: aiohttp in ./nb/lib/python3.11/site-packages (from datasets>=2.21.0->trl) (3.11.7)\n",
+      "Requirement already satisfied: regex!=2019.12.17 in ./nb/lib/python3.11/site-packages (from transformers>=4.46.0->trl) (2024.11.6)\n",
+      "Requirement already satisfied: tokenizers<0.21,>=0.20 in ./nb/lib/python3.11/site-packages (from transformers>=4.46.0->trl) (0.20.3)\n",
+      "Requirement already satisfied: markdown-it-py>=2.2.0 in ./nb/lib/python3.11/site-packages (from rich->trl) (3.0.0)\n",
+      "Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./nb/lib/python3.11/site-packages (from rich->trl) (2.18.0)\n",
+      "Requirement already satisfied: aiohappyeyeballs>=2.3.0 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (2.4.3)\n",
+      "Requirement already satisfied: aiosignal>=1.1.2 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (1.3.1)\n",
+      "Requirement already satisfied: attrs>=17.3.0 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (24.2.0)\n",
+      "Requirement already satisfied: frozenlist>=1.1.1 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (1.5.0)\n",
+      "Requirement already satisfied: multidict<7.0,>=4.5 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (6.1.0)\n",
+      "Requirement already satisfied: propcache>=0.2.0 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (0.2.0)\n",
+      "Requirement already satisfied: yarl<2.0,>=1.17.0 in ./nb/lib/python3.11/site-packages (from aiohttp->datasets>=2.21.0->trl) (1.18.0)\n",
+      "Requirement already satisfied: typing-extensions>=3.7.4.3 in ./nb/lib/python3.11/site-packages (from huggingface-hub>=0.21.0->accelerate>=0.34.0->trl) (4.12.2)\n",
+      "Requirement already satisfied: mdurl~=0.1 in ./nb/lib/python3.11/site-packages (from markdown-it-py>=2.2.0->rich->trl) (0.1.2)\n",
+      "Requirement already satisfied: charset-normalizer<4,>=2 in ./nb/lib/python3.11/site-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (3.4.0)\n",
+      "Requirement already satisfied: idna<4,>=2.5 in ./nb/lib/python3.11/site-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (3.10)\n",
+      "Requirement already satisfied: urllib3<3,>=1.21.1 in ./nb/lib/python3.11/site-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (2.2.3)\n",
+      "Requirement already satisfied: certifi>=2017.4.17 in ./nb/lib/python3.11/site-packages (from requests>=2.32.2->datasets>=2.21.0->trl) (2024.8.30)\n",
+      "Requirement already satisfied: networkx in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (3.4.2)\n",
+      "Requirement already satisfied: jinja2 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (3.1.4)\n",
+      "Requirement already satisfied: nvidia-cuda-nvrtc-cu12==12.4.127 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.4.127)\n",
+      "Requirement already satisfied: nvidia-cuda-runtime-cu12==12.4.127 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.4.127)\n",
+      "Requirement already satisfied: nvidia-cuda-cupti-cu12==12.4.127 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.4.127)\n",
+      "Requirement already satisfied: nvidia-cudnn-cu12==9.1.0.70 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (9.1.0.70)\n",
+      "Requirement already satisfied: nvidia-cublas-cu12==12.4.5.8 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.4.5.8)\n",
+      "Requirement already satisfied: nvidia-cufft-cu12==11.2.1.3 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (11.2.1.3)\n",
+      "Requirement already satisfied: nvidia-curand-cu12==10.3.5.147 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (10.3.5.147)\n",
+      "Requirement already satisfied: nvidia-cusolver-cu12==11.6.1.9 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (11.6.1.9)\n",
+      "Requirement already satisfied: nvidia-cusparse-cu12==12.3.1.170 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.3.1.170)\n",
+      "Requirement already satisfied: nvidia-nccl-cu12==2.21.5 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (2.21.5)\n",
+      "Requirement already satisfied: nvidia-nvtx-cu12==12.4.127 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.4.127)\n",
+      "Requirement already satisfied: nvidia-nvjitlink-cu12==12.4.127 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (12.4.127)\n",
+      "Requirement already satisfied: triton==3.1.0 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (3.1.0)\n",
+      "Requirement already satisfied: sympy==1.13.1 in ./nb/lib/python3.11/site-packages (from torch>=1.10.0->accelerate>=0.34.0->trl) (1.13.1)\n",
+      "Requirement already satisfied: mpmath<1.4,>=1.1.0 in ./nb/lib/python3.11/site-packages (from sympy==1.13.1->torch>=1.10.0->accelerate>=0.34.0->trl) (1.3.0)\n",
+      "Requirement already satisfied: python-dateutil>=2.8.2 in ./nb/lib/python3.11/site-packages (from pandas->datasets>=2.21.0->trl) (2.9.0.post0)\n",
+      "Requirement already satisfied: pytz>=2020.1 in ./nb/lib/python3.11/site-packages (from pandas->datasets>=2.21.0->trl) (2024.2)\n",
+      "Requirement already satisfied: tzdata>=2022.7 in ./nb/lib/python3.11/site-packages (from pandas->datasets>=2.21.0->trl) (2024.2)\n",
+      "Requirement already satisfied: six>=1.5 in ./nb/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas->datasets>=2.21.0->trl) (1.16.0)\n",
+      "Requirement already satisfied: MarkupSafe>=2.0 in ./nb/lib/python3.11/site-packages (from jinja2->torch>=1.10.0->accelerate>=0.34.0->trl) (3.0.2)\n",
+      "\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.0\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.3.1\u001b[0m\n",
+      "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
+      "Note: you may need to restart the kernel to use updated packages.\n"
+     ]
+    }
+   ],
+   "source": [
+    "# Install TRL\n",
+    "%pip install trl hf_transfer"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If you are using a GPU with Ampere architecture (e.g. NVIDIA A10G or RTX 4090/3090) or newer you can use Flash attention. Flash Attention is a an method that reorders the attention computation and leverages classical techniques (tiling, recomputation) to significantly speed it up and reduce memory usage from quadratic to linear in sequence length. The TL;DR; accelerates training up to 3x. Learn more at [FlashAttention](https://github.com/Dao-AILab/flash-attention/tree/main).\n",
+    "\n",
+    "_Note: If your machine has less than 96GB of RAM and lots of CPU cores, reduce the number of `MAX_JOBS`. On the `g6.2xlarge` we used `4`._"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 54,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# import torch; assert torch.cuda.get_device_capability()[0] >= 8, 'Hardware not supported for Flash Attention'\n",
+    "# # install flash-attn\n",
+    "# !pip install ninja packaging\n",
+    "# !MAX_JOBS=4 pip install flash-attn --no-build-isolation"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "_Installing flash attention can take quite a bit of time (10-45 minutes)._"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We will use the [Hugging Face Hub](https://huggingface.co/models) as a remote model versioning service. This means we will automatically push our model, logs and information to the Hub during training.\n",
+    "\n",
+    "1. **Sign up**: Create an account at [Hugging Face](https://huggingface.co/join) if you don’t already have one.  \n",
+    "2. **Generate a token**: Go to [Token Settings](https://huggingface.co/settings/tokens) and create a new token:  \n",
+    "   - Select *Fine-grained*.  \n",
+    "   - Assign any name.  \n",
+    "   - Enable\n",
+    "     - \"Write access to contents/settings of all repos under your personal namespace.\"  and\n",
+    "     - \"Read access to contents of all public gated repos you can access.\"\n",
+    "   - Create the token and copy-paste it un the next cell.\n",
+    "\n",
+    "⚠️ **Keep your token secret**: Don't push this notebook with your token in it. ⚠️  "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from huggingface_hub import login\n",
+    "\n",
+    "login(\n",
+    "   token=\"...\", # ADD YOUR TOKEN HERE\n",
+    "   add_to_git_credential=True,\n",
+    ")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3. Create and prepare the dataset\n",
+    "\n",
+    "Once you have determined that fine-tuning is the right solution we need to create a dataset to fine-tune our model. The dataset should be a diverse set of demonstrations of the task you want to solve. There are several ways to create such a dataset, including:\n",
+    "* Using existing open-source datasets, e.g., [Spider](https://huggingface.co/datasets/spider)\n",
+    "* Using LLMs to create synthetically datasets, e.g., [Alpaca](https://huggingface.co/datasets/tatsu-lab/alpaca)\n",
+    "* Using Humans to create datasets, e.g., [Dolly](https://huggingface.co/datasets/databricks/databricks-dolly-15k).\n",
+    "* Using a combination of the above methods, e.g., [Orca](https://huggingface.co/datasets/Open-Orca/OpenOrca)\n",
+    "\n",
+    "Each of the methods has its own advantages and disadvantages and depends on the budget, time, and quality requirements. For example, using an existing dataset is the easiest but might not be tailored to your specific use case, while using humans might be the most accurate but can be time-consuming and expensive. It is also possible to combine several methods to create an instruction dataset, as shown in [Orca: Progressive Learning from Complex Explanation Traces of GPT-4.](https://arxiv.org/abs/2306.02707)\n",
+    "\n",
+    "In our example we will use an already existing dataset called [sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context), which contains samples of natural language instructions, schema definitions and the corresponding SQL query."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 56,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{'answer': 'SELECT COUNT(*) FROM head WHERE age > 56', 'question': 'How many heads of the departments are older than 56 ?', 'context': 'CREATE TABLE head (age INTEGER)'}\n"
+     ]
+    }
+   ],
+   "source": [
+    "from datasets import load_dataset\n",
+    "\n",
+    "dataset = load_dataset(\"b-mc2/sql-create-context\", split=\"train\")\n",
+    "print(dataset[0])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "With the latest release of `trl` we now support popular [standard and conversational dataset formats](https://huggingface.co/docs/trl/en/dataset_formats). This means we only need to convert our dataset to one of the supported formats and `trl` will take care of the rest. Those formats include:\n",
+    "\n",
+    "* Standard format\n",
+    "\n",
+    "```python\n",
+    "{\"text\": \"The sky is blue.\"}\n",
+    "```\n",
+    "\n",
+    "* Conversational format\n",
+    "\n",
+    "```python\n",
+    "{\"messages\": [{\"role\": \"user\", \"content\": \"What color is the sky?\"},\n",
+    "              {\"role\": \"assistant\", \"content\": \"It is blue.\"}]}\n",
+    "```"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In our example we are going to load our open-source dataset using the 🤗 Datasets library and then convert it into the the conversational format, where we include the schema definition in the system message for our assistant.\n",
+    "\n",
+    "_Note: This step can be different for your use case. For example, if you have already a dataset from, e.g. working with OpenAI, you can skip this step and go directly to the fine-tuning step._"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 57,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from datasets import load_dataset\n",
+    "\n",
+    "# Convert dataset to OAI messages\n",
+    "system_message = \"\"\"You are an text to SQL query translator. Users will ask you questions in English and you will generate a SQL query based on the provided SCHEMA.\n",
+    "SCHEMA:\n",
+    "{schema}\"\"\"\n",
+    "\n",
+    "\n",
+    "def create_conversation(example):\n",
+    "    return {\n",
+    "        \"messages\": [\n",
+    "            {\"role\": \"system\", \"content\": system_message.format(schema=example[\"context\"])},\n",
+    "            {\"role\": \"user\", \"content\": example[\"question\"]},\n",
+    "            {\"role\": \"assistant\", \"content\": example[\"answer\"]}],\n",
+    "    }\n",
+    "\n",
+    "\n",
+    "# Load dataset from the hub\n",
+    "dataset = load_dataset(\"b-mc2/sql-create-context\", split=\"train[:50000]\")\n",
+    "\n",
+    "# Convert dataset to conversational format\n",
+    "dataset = dataset.map(create_conversation, remove_columns=dataset.column_names)\n",
+    "\n",
+    "# Split dataset into train and test (95% train, 5% test)\n",
+    "dataset = dataset.train_test_split(test_size=0.05)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now let's see if we can load it, and how it looks like."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 58,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[{'content': 'You are an text to SQL query translator. Users will ask you questions in English and you will generate a SQL query based on the provided SCHEMA.\\nSCHEMA:\\nCREATE TABLE table_name_62 (attendance VARCHAR, result VARCHAR)', 'role': 'system'}, {'content': 'What is Attendance, when Result is \"2-4\"?', 'role': 'user'}, {'content': 'SELECT attendance FROM table_name_62 WHERE result = \"2-4\"', 'role': 'assistant'}]\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(dataset[\"train\"][345][\"messages\"])"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Fine-tune LLM using `trl` and the `SFTTrainer` \n",
+    "\n",
+    "We are now ready to fine-tune our model. We will use the [SFTTrainer](https://huggingface.co/docs/trl/sft_trainer) from `trl` to fine-tune our model. The `SFTTrainer` makes it straightfoward to supervise fine-tune open LLMs. The `SFTTrainer` is a subclass of the `Trainer` from the `transformers` library and supports all the same features, including logging, evaluation, and checkpointing, but adds additiional quality of life features, including:\n",
+    "* Dataset formatting, including standard and conversational format\n",
+    "* Training on completions only, ignoring prompts\n",
+    "* Packing datasets for more efficient training\n",
+    "* PEFT (parameter-efficient fine-tuning) support including Q-LoRA\n",
+    "* Preparing the model and tokenizer for conversational fine-tuning (e.g. adding special tokens)\n",
+    "\n",
+    "We will use the dataset formatting, packing and PEFT features in our example. As peft method we will use [QLoRA](https://huggingface.co/paper/2305.14314) a technique to reduce the memory footprint of large language models during finetuning, without sacrificing performance by using quantization. If you want to learn more about QLoRA and how it works, check out [Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA](https://huggingface.co/blog/4bit-transformers-bitsandbytes) blog post.\n",
+    "\n",
+    "Now, lets get started! 🚀"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "First, we will load our LLM. For our use case we are going to use [Llama 3.1 8B](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B). \n",
+    "But we can easily swap out the model for another model, e.g. [Mistral](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) or [Mixtral](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) models, TII [Falcon](https://huggingface.co/tiiuae/falcon-40b), or any other LLMs by changing our `model_id` variable. We will use `bitsandbytes` to quantize our model to 4-bit.\n",
+    "\n",
+    "_Note: Be aware the bigger the model the more memory it will require. In our example we will use the 8B version, which can be tuned on 24GB GPUs._"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 59,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from transformers import AutoModelForCausalLM#, BitsAndBytesConfig\n",
+    "\n",
+    "# Hugging Face model id\n",
+    "model_id = \"Qwen/Qwen2.5-0.5B\" # or `mistralai/Mistral-7B-v0.1`\n",
+    "\n",
+    "# Load model and tokenizer\n",
+    "model = AutoModelForCausalLM.from_pretrained(model_id)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Correctly, preparing the LLM and Tokenizer for training chat/conversational models is crucial. We need to add new special tokens to the tokenizer and model and teach to understand the different roles in a conversation. In `trl` we have a convinient method called [`setup_chat_format`](https://huggingface.co/docs/trl/main/en/sft_trainer#add-special-tokens-for-chat-format), which:\n",
+    "* Adds special tokens to the tokenizer, e.g. `<|im_start|>` and `<|im_end|>`, to indicate the start and end of a conversation.\n",
+    "* Resizes the model’s embedding layer to accommodate the new tokens.\n",
+    "* Sets the `chat_template` of the tokenizer, which is used to format the input data into a chat-like format. The default is `chatml` from OpenAI."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 60,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from transformers import AutoTokenizer\n",
+    "from trl import setup_chat_format\n",
+    "\n",
+    "# Load tokenizer\n",
+    "tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
+    "# tokenizer.padding_side = 'right' # to prevent warnings\n",
+    "\n",
+    "# Set chat template to OAI chatml\n",
+    "# model, tokenizer = setup_chat_format(model, tokenizer)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now that our tokenzer have a proper chat template, it can convert our data into a formatted conversation, based on its chat template. Let's see how it looks like."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 61,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "<|im_start|>system\n",
+      "You are an text to SQL query translator. Users will ask you questions in English and you will generate a SQL query based on the provided SCHEMA.\n",
+      "SCHEMA:\n",
+      "CREATE TABLE table_name_62 (attendance VARCHAR, result VARCHAR)<|im_end|>\n",
+      "<|im_start|>user\n",
+      "What is Attendance, when Result is \"2-4\"?<|im_end|>\n",
+      "<|im_start|>assistant\n",
+      "SELECT attendance FROM table_name_62 WHERE result = \"2-4\"<|im_end|>\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from trl import apply_chat_template\n",
+    "\n",
+    "example = dataset[\"train\"][345]\n",
+    "formatted_example = apply_chat_template(example, tokenizer)\n",
+    "print(formatted_example[\"text\"])"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The `SFTTrainer`  supports a native integration with `peft`, which makes it super easy to efficiently tune LLMs using, e.g. QLoRA. We only need to create our `LoraConfig` and provide it to the trainer. Our `LoraConfig` parameters are defined based on the [qlora paper](https://arxiv.org/pdf/2305.14314.pdf) and sebastian's [blog post](https://magazine.sebastianraschka.com/p/practical-tips-for-finetuning-llms)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 62,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# from peft import LoraConfig\n",
+    "\n",
+    "# # LoRA config based on QLoRA paper & Sebastian Raschka experiment\n",
+    "# peft_config = LoraConfig(\n",
+    "#         lora_alpha=128,\n",
+    "#         lora_dropout=0.05,\n",
+    "#         r=256,\n",
+    "#         bias=\"none\",\n",
+    "#         target_modules=\"all-linear\",\n",
+    "#         task_type=\"CAUSAL_LM\", \n",
+    "# )"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Before we can start our training we need to define the hyperparameters (`TrainingArguments`) we want to use."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 63,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from trl import SFTConfig\n",
+    "\n",
+    "training_args = SFTConfig(\n",
+    "    output_dir=\"Qwen2.5-0.5B-SQL\", # directory to save and repository id\n",
+    "    save_strategy=\"epoch\",         # save checkpoint every epoch\n",
+    "    push_to_hub=True,              # push model to hub\n",
+    "    report_to=\"none\",              # don't report to wandb\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We now have every building block we need to create our `SFTTrainer` to start then training our model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 64,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/fsx/qgallouedec/trl/trl/trainer/sft_trainer.py:248: UserWarning: You didn't pass a `max_seq_length` argument to the SFTTrainer, this will default to 1024\n",
+      "  warnings.warn(\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "82c6a0e2e52d42a793cfd0be4f17441c",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Map:   0%|          | 0/47500 [00:00<?, ? examples/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "from trl import SFTTrainer\n",
+    "\n",
+    "trainer = SFTTrainer(\n",
+    "    model=model,\n",
+    "    args=training_args,\n",
+    "    train_dataset=dataset[\"train\"],\n",
+    "    processing_class=tokenizer,\n",
+    ")"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Start training our model by calling the `train()` method on our `Trainer` instance. This will start the training loop and train our model for 3 epochs. Since we are using a PEFT method, we will only save the adapted model weights and not the full model."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 65,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "\n",
+       "    <div>\n",
+       "      \n",
+       "      <progress value='2999' max='17814' style='width:300px; height:20px; vertical-align: middle;'></progress>\n",
+       "      [ 2999/17814 06:17 < 31:06, 7.94 it/s, Epoch 0.50/3]\n",
+       "    </div>\n",
+       "    <table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       " <tr style=\"text-align: left;\">\n",
+       "      <th>Step</th>\n",
+       "      <th>Training Loss</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <td>10</td>\n",
+       "      <td>1.578700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>20</td>\n",
+       "      <td>1.099800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>30</td>\n",
+       "      <td>0.935500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>40</td>\n",
+       "      <td>0.816000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>50</td>\n",
+       "      <td>0.825600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>60</td>\n",
+       "      <td>0.729100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>70</td>\n",
+       "      <td>0.714400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>80</td>\n",
+       "      <td>0.680500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>90</td>\n",
+       "      <td>0.682500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>100</td>\n",
+       "      <td>0.692700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>110</td>\n",
+       "      <td>0.693800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>120</td>\n",
+       "      <td>0.695100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>130</td>\n",
+       "      <td>0.723100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>140</td>\n",
+       "      <td>0.674500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>150</td>\n",
+       "      <td>0.721600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>160</td>\n",
+       "      <td>0.715700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>170</td>\n",
+       "      <td>0.640500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>180</td>\n",
+       "      <td>0.693800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>190</td>\n",
+       "      <td>0.651500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>200</td>\n",
+       "      <td>0.943500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>210</td>\n",
+       "      <td>1.108900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>220</td>\n",
+       "      <td>0.791800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>230</td>\n",
+       "      <td>0.739100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>240</td>\n",
+       "      <td>0.676400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>250</td>\n",
+       "      <td>0.688700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>260</td>\n",
+       "      <td>0.665100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>270</td>\n",
+       "      <td>0.670400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>280</td>\n",
+       "      <td>0.686600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>290</td>\n",
+       "      <td>0.679300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>300</td>\n",
+       "      <td>0.672500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>310</td>\n",
+       "      <td>0.680900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>320</td>\n",
+       "      <td>0.604300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>330</td>\n",
+       "      <td>0.643200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>340</td>\n",
+       "      <td>0.649500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>350</td>\n",
+       "      <td>0.661400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>360</td>\n",
+       "      <td>0.687200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>370</td>\n",
+       "      <td>0.646700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>380</td>\n",
+       "      <td>0.688400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>390</td>\n",
+       "      <td>0.690000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>400</td>\n",
+       "      <td>0.649300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>410</td>\n",
+       "      <td>0.628200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>420</td>\n",
+       "      <td>0.642100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>430</td>\n",
+       "      <td>0.625900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>440</td>\n",
+       "      <td>0.678500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>450</td>\n",
+       "      <td>0.620500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>460</td>\n",
+       "      <td>0.632800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>470</td>\n",
+       "      <td>0.611900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>480</td>\n",
+       "      <td>0.625800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>490</td>\n",
+       "      <td>0.632900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>500</td>\n",
+       "      <td>0.607400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>510</td>\n",
+       "      <td>0.638600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>520</td>\n",
+       "      <td>0.639700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>530</td>\n",
+       "      <td>0.655200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>540</td>\n",
+       "      <td>0.611800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>550</td>\n",
+       "      <td>0.617600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>560</td>\n",
+       "      <td>0.589500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>570</td>\n",
+       "      <td>0.631400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>580</td>\n",
+       "      <td>0.649300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>590</td>\n",
+       "      <td>0.650300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>600</td>\n",
+       "      <td>0.619900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>610</td>\n",
+       "      <td>0.620300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>620</td>\n",
+       "      <td>0.638200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>630</td>\n",
+       "      <td>0.653700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>640</td>\n",
+       "      <td>0.669100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>650</td>\n",
+       "      <td>0.649000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>660</td>\n",
+       "      <td>0.667900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>670</td>\n",
+       "      <td>0.633000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>680</td>\n",
+       "      <td>0.635800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>690</td>\n",
+       "      <td>0.645200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>700</td>\n",
+       "      <td>0.641000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>710</td>\n",
+       "      <td>0.613200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>720</td>\n",
+       "      <td>0.644100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>730</td>\n",
+       "      <td>0.614900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>740</td>\n",
+       "      <td>0.604600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>750</td>\n",
+       "      <td>0.613800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>760</td>\n",
+       "      <td>0.583300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>770</td>\n",
+       "      <td>0.648700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>780</td>\n",
+       "      <td>0.594600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>790</td>\n",
+       "      <td>0.584100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>800</td>\n",
+       "      <td>0.630900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>810</td>\n",
+       "      <td>0.580700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>820</td>\n",
+       "      <td>0.628100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>830</td>\n",
+       "      <td>0.617400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>840</td>\n",
+       "      <td>0.632200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>850</td>\n",
+       "      <td>0.661200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>860</td>\n",
+       "      <td>0.636200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>870</td>\n",
+       "      <td>0.617100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>880</td>\n",
+       "      <td>0.587000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>890</td>\n",
+       "      <td>0.595000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>900</td>\n",
+       "      <td>0.613700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>910</td>\n",
+       "      <td>0.650500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>920</td>\n",
+       "      <td>0.635700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>930</td>\n",
+       "      <td>0.585400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>940</td>\n",
+       "      <td>0.640100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>950</td>\n",
+       "      <td>0.602500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>960</td>\n",
+       "      <td>0.602300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>970</td>\n",
+       "      <td>0.628400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>980</td>\n",
+       "      <td>0.591400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>990</td>\n",
+       "      <td>0.613800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1000</td>\n",
+       "      <td>0.630400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1010</td>\n",
+       "      <td>0.635500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1020</td>\n",
+       "      <td>0.603000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1030</td>\n",
+       "      <td>0.638600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1040</td>\n",
+       "      <td>0.610000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1050</td>\n",
+       "      <td>0.617700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1060</td>\n",
+       "      <td>0.602900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1070</td>\n",
+       "      <td>0.642400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1080</td>\n",
+       "      <td>0.574200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1090</td>\n",
+       "      <td>0.610700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1100</td>\n",
+       "      <td>0.602400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1110</td>\n",
+       "      <td>0.565800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1120</td>\n",
+       "      <td>0.611800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1130</td>\n",
+       "      <td>0.598400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1140</td>\n",
+       "      <td>0.572400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1150</td>\n",
+       "      <td>0.609400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1160</td>\n",
+       "      <td>0.623400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1170</td>\n",
+       "      <td>0.643900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1180</td>\n",
+       "      <td>0.603100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1190</td>\n",
+       "      <td>0.582800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1200</td>\n",
+       "      <td>0.625300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1210</td>\n",
+       "      <td>0.621200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1220</td>\n",
+       "      <td>0.602500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1230</td>\n",
+       "      <td>0.603600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1240</td>\n",
+       "      <td>0.605700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1250</td>\n",
+       "      <td>0.621700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1260</td>\n",
+       "      <td>0.634400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1270</td>\n",
+       "      <td>0.600800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1280</td>\n",
+       "      <td>0.598900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1290</td>\n",
+       "      <td>0.608400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1300</td>\n",
+       "      <td>0.603900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1310</td>\n",
+       "      <td>0.631000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1320</td>\n",
+       "      <td>0.604400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1330</td>\n",
+       "      <td>0.565300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1340</td>\n",
+       "      <td>0.612100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1350</td>\n",
+       "      <td>0.595600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1360</td>\n",
+       "      <td>0.546300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1370</td>\n",
+       "      <td>0.616500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1380</td>\n",
+       "      <td>0.633100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1390</td>\n",
+       "      <td>0.601200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1400</td>\n",
+       "      <td>0.571400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1410</td>\n",
+       "      <td>0.573000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1420</td>\n",
+       "      <td>0.576800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1430</td>\n",
+       "      <td>0.624200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1440</td>\n",
+       "      <td>0.611300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1450</td>\n",
+       "      <td>0.592600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1460</td>\n",
+       "      <td>0.621700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1470</td>\n",
+       "      <td>0.602400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1480</td>\n",
+       "      <td>0.631600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1490</td>\n",
+       "      <td>0.628400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1500</td>\n",
+       "      <td>0.590400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1510</td>\n",
+       "      <td>0.576200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1520</td>\n",
+       "      <td>0.559700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1530</td>\n",
+       "      <td>0.558600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1540</td>\n",
+       "      <td>0.555300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1550</td>\n",
+       "      <td>0.586700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1560</td>\n",
+       "      <td>0.620300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1570</td>\n",
+       "      <td>0.604200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1580</td>\n",
+       "      <td>0.565900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1590</td>\n",
+       "      <td>0.605600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1600</td>\n",
+       "      <td>0.576200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1610</td>\n",
+       "      <td>0.586900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1620</td>\n",
+       "      <td>0.538900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1630</td>\n",
+       "      <td>0.630800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1640</td>\n",
+       "      <td>0.573800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1650</td>\n",
+       "      <td>0.590000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1660</td>\n",
+       "      <td>0.554300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1670</td>\n",
+       "      <td>0.561400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1680</td>\n",
+       "      <td>0.608600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1690</td>\n",
+       "      <td>0.624900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1700</td>\n",
+       "      <td>0.574500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1710</td>\n",
+       "      <td>0.585700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1720</td>\n",
+       "      <td>0.584100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1730</td>\n",
+       "      <td>0.568800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1740</td>\n",
+       "      <td>0.604800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1750</td>\n",
+       "      <td>0.557200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1760</td>\n",
+       "      <td>0.565600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1770</td>\n",
+       "      <td>0.598000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1780</td>\n",
+       "      <td>0.565200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1790</td>\n",
+       "      <td>0.604600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1800</td>\n",
+       "      <td>0.588700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1810</td>\n",
+       "      <td>0.552400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1820</td>\n",
+       "      <td>0.627300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1830</td>\n",
+       "      <td>0.601200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1840</td>\n",
+       "      <td>0.558900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1850</td>\n",
+       "      <td>0.541700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1860</td>\n",
+       "      <td>0.610900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1870</td>\n",
+       "      <td>0.558100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1880</td>\n",
+       "      <td>0.584000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1890</td>\n",
+       "      <td>0.607400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1900</td>\n",
+       "      <td>0.561400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1910</td>\n",
+       "      <td>0.571900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1920</td>\n",
+       "      <td>0.634600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1930</td>\n",
+       "      <td>0.589600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1940</td>\n",
+       "      <td>0.583700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1950</td>\n",
+       "      <td>0.581000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1960</td>\n",
+       "      <td>0.563500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1970</td>\n",
+       "      <td>0.555800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1980</td>\n",
+       "      <td>0.568500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>1990</td>\n",
+       "      <td>0.574000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2000</td>\n",
+       "      <td>0.597900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2010</td>\n",
+       "      <td>0.601300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2020</td>\n",
+       "      <td>0.586100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2030</td>\n",
+       "      <td>0.551000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2040</td>\n",
+       "      <td>0.581400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2050</td>\n",
+       "      <td>0.580100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2060</td>\n",
+       "      <td>0.588300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2070</td>\n",
+       "      <td>0.558400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2080</td>\n",
+       "      <td>0.561900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2090</td>\n",
+       "      <td>0.622200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2100</td>\n",
+       "      <td>0.577300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2110</td>\n",
+       "      <td>0.618500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2120</td>\n",
+       "      <td>0.585500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2130</td>\n",
+       "      <td>0.553000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2140</td>\n",
+       "      <td>0.569300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2150</td>\n",
+       "      <td>0.559400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2160</td>\n",
+       "      <td>0.552000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2170</td>\n",
+       "      <td>0.572300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2180</td>\n",
+       "      <td>0.541100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2190</td>\n",
+       "      <td>0.571200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2200</td>\n",
+       "      <td>0.530100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2210</td>\n",
+       "      <td>0.560900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2220</td>\n",
+       "      <td>0.604000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2230</td>\n",
+       "      <td>0.581000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2240</td>\n",
+       "      <td>0.550900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2250</td>\n",
+       "      <td>0.590800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2260</td>\n",
+       "      <td>0.603700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2270</td>\n",
+       "      <td>0.581600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2280</td>\n",
+       "      <td>0.583100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2290</td>\n",
+       "      <td>0.608700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2300</td>\n",
+       "      <td>0.574100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2310</td>\n",
+       "      <td>0.567600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2320</td>\n",
+       "      <td>0.593300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2330</td>\n",
+       "      <td>0.573500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2340</td>\n",
+       "      <td>0.557900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2350</td>\n",
+       "      <td>0.587300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2360</td>\n",
+       "      <td>0.550300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2370</td>\n",
+       "      <td>0.572700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2380</td>\n",
+       "      <td>0.533900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2390</td>\n",
+       "      <td>0.564300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2400</td>\n",
+       "      <td>0.552300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2410</td>\n",
+       "      <td>0.548400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2420</td>\n",
+       "      <td>0.605300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2430</td>\n",
+       "      <td>0.640300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2440</td>\n",
+       "      <td>0.565600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2450</td>\n",
+       "      <td>0.541200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2460</td>\n",
+       "      <td>0.557400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2470</td>\n",
+       "      <td>0.534200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2480</td>\n",
+       "      <td>0.541800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2490</td>\n",
+       "      <td>0.554900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2500</td>\n",
+       "      <td>0.531100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2510</td>\n",
+       "      <td>0.616900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2520</td>\n",
+       "      <td>0.549000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2530</td>\n",
+       "      <td>0.607900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2540</td>\n",
+       "      <td>0.568100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2550</td>\n",
+       "      <td>0.557300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2560</td>\n",
+       "      <td>0.582100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2570</td>\n",
+       "      <td>0.555500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2580</td>\n",
+       "      <td>0.553700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2590</td>\n",
+       "      <td>0.559900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2600</td>\n",
+       "      <td>0.512100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2610</td>\n",
+       "      <td>0.556200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2620</td>\n",
+       "      <td>0.559800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2630</td>\n",
+       "      <td>0.563400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2640</td>\n",
+       "      <td>0.557700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2650</td>\n",
+       "      <td>0.624200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2660</td>\n",
+       "      <td>0.553400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2670</td>\n",
+       "      <td>0.582400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2680</td>\n",
+       "      <td>0.553800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2690</td>\n",
+       "      <td>0.527100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2700</td>\n",
+       "      <td>0.557600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2710</td>\n",
+       "      <td>0.565900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2720</td>\n",
+       "      <td>0.609700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2730</td>\n",
+       "      <td>0.600900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2740</td>\n",
+       "      <td>0.547100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2750</td>\n",
+       "      <td>0.569500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2760</td>\n",
+       "      <td>0.587900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2770</td>\n",
+       "      <td>0.545300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2780</td>\n",
+       "      <td>0.551800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2790</td>\n",
+       "      <td>0.546400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2800</td>\n",
+       "      <td>0.615400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2810</td>\n",
+       "      <td>0.574700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2820</td>\n",
+       "      <td>0.609600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2830</td>\n",
+       "      <td>0.547100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2840</td>\n",
+       "      <td>0.598900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2850</td>\n",
+       "      <td>0.574900</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2860</td>\n",
+       "      <td>0.565000</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2870</td>\n",
+       "      <td>0.530400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2880</td>\n",
+       "      <td>0.558300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2890</td>\n",
+       "      <td>0.563300</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2900</td>\n",
+       "      <td>0.555800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2910</td>\n",
+       "      <td>0.536700</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2920</td>\n",
+       "      <td>0.565800</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2930</td>\n",
+       "      <td>0.568500</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2940</td>\n",
+       "      <td>0.547200</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2950</td>\n",
+       "      <td>0.571600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2960</td>\n",
+       "      <td>0.579100</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2970</td>\n",
+       "      <td>0.533600</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2980</td>\n",
+       "      <td>0.545400</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <td>2990</td>\n",
+       "      <td>0.550600</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table><p>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "ename": "KeyboardInterrupt",
+     "evalue": "",
+     "output_type": "error",
+     "traceback": [
+      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
+      "\u001b[0;31mKeyboardInterrupt\u001b[0m                         Traceback (most recent call last)",
+      "Cell \u001b[0;32mIn[65], line 2\u001b[0m\n\u001b[1;32m      1\u001b[0m \u001b[38;5;66;03m# start training, the model will be automatically saved to the hub and the output directory\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[43mtrainer\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtrain\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m      4\u001b[0m \u001b[38;5;66;03m# save model \u001b[39;00m\n\u001b[1;32m      5\u001b[0m trainer\u001b[38;5;241m.\u001b[39msave_model()\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/transformers/trainer.py:2114\u001b[0m, in \u001b[0;36mTrainer.train\u001b[0;34m(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)\u001b[0m\n\u001b[1;32m   2111\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m   2112\u001b[0m     \u001b[38;5;66;03m# Disable progress bars when uploading models during checkpoints to avoid polluting stdout\u001b[39;00m\n\u001b[1;32m   2113\u001b[0m     hf_hub_utils\u001b[38;5;241m.\u001b[39mdisable_progress_bars()\n\u001b[0;32m-> 2114\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43minner_training_loop\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m   2115\u001b[0m \u001b[43m        \u001b[49m\u001b[43margs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   2116\u001b[0m \u001b[43m        \u001b[49m\u001b[43mresume_from_checkpoint\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mresume_from_checkpoint\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   2117\u001b[0m \u001b[43m        \u001b[49m\u001b[43mtrial\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mtrial\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   2118\u001b[0m \u001b[43m        \u001b[49m\u001b[43mignore_keys_for_eval\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mignore_keys_for_eval\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m   2119\u001b[0m \u001b[43m    \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   2120\u001b[0m \u001b[38;5;28;01mfinally\u001b[39;00m:\n\u001b[1;32m   2121\u001b[0m     hf_hub_utils\u001b[38;5;241m.\u001b[39menable_progress_bars()\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/transformers/trainer.py:2481\u001b[0m, in \u001b[0;36mTrainer._inner_training_loop\u001b[0;34m(self, batch_size, args, resume_from_checkpoint, trial, ignore_keys_for_eval)\u001b[0m\n\u001b[1;32m   2475\u001b[0m context \u001b[38;5;241m=\u001b[39m (\n\u001b[1;32m   2476\u001b[0m     functools\u001b[38;5;241m.\u001b[39mpartial(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39maccelerator\u001b[38;5;241m.\u001b[39mno_sync, model\u001b[38;5;241m=\u001b[39mmodel)\n\u001b[1;32m   2477\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m i \u001b[38;5;241m!=\u001b[39m \u001b[38;5;28mlen\u001b[39m(batch_samples) \u001b[38;5;241m-\u001b[39m \u001b[38;5;241m1\u001b[39m\n\u001b[1;32m   2478\u001b[0m     \u001b[38;5;28;01melse\u001b[39;00m contextlib\u001b[38;5;241m.\u001b[39mnullcontext\n\u001b[1;32m   2479\u001b[0m )\n\u001b[1;32m   2480\u001b[0m \u001b[38;5;28;01mwith\u001b[39;00m context():\n\u001b[0;32m-> 2481\u001b[0m     tr_loss_step \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mtraining_step\u001b[49m\u001b[43m(\u001b[49m\u001b[43mmodel\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43minputs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnum_items_in_batch\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   2483\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m (\n\u001b[1;32m   2484\u001b[0m     args\u001b[38;5;241m.\u001b[39mlogging_nan_inf_filter\n\u001b[1;32m   2485\u001b[0m     \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m is_torch_xla_available()\n\u001b[1;32m   2486\u001b[0m     \u001b[38;5;129;01mand\u001b[39;00m (torch\u001b[38;5;241m.\u001b[39misnan(tr_loss_step) \u001b[38;5;129;01mor\u001b[39;00m torch\u001b[38;5;241m.\u001b[39misinf(tr_loss_step))\n\u001b[1;32m   2487\u001b[0m ):\n\u001b[1;32m   2488\u001b[0m     \u001b[38;5;66;03m# if loss is nan or inf simply add the average of previous logged losses\u001b[39;00m\n\u001b[1;32m   2489\u001b[0m     tr_loss \u001b[38;5;241m=\u001b[39m tr_loss \u001b[38;5;241m+\u001b[39m tr_loss \u001b[38;5;241m/\u001b[39m (\u001b[38;5;241m1\u001b[39m \u001b[38;5;241m+\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mstate\u001b[38;5;241m.\u001b[39mglobal_step \u001b[38;5;241m-\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_globalstep_last_logged)\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/transformers/trainer.py:3612\u001b[0m, in \u001b[0;36mTrainer.training_step\u001b[0;34m(***failed resolving arguments***)\u001b[0m\n\u001b[1;32m   3610\u001b[0m         scaled_loss\u001b[38;5;241m.\u001b[39mbackward()\n\u001b[1;32m   3611\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 3612\u001b[0m     \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43maccelerator\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mbackward\u001b[49m\u001b[43m(\u001b[49m\u001b[43mloss\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   3613\u001b[0m     \u001b[38;5;66;03m# Finally we need to normalize the loss for reporting\u001b[39;00m\n\u001b[1;32m   3614\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m num_items_in_batch \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m:\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/accelerate/accelerator.py:2241\u001b[0m, in \u001b[0;36mAccelerator.backward\u001b[0;34m(self, loss, **kwargs)\u001b[0m\n\u001b[1;32m   2239\u001b[0m     \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mlomo_backward(loss, learning_rate)\n\u001b[1;32m   2240\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 2241\u001b[0m     \u001b[43mloss\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mbackward\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/torch/_tensor.py:581\u001b[0m, in \u001b[0;36mTensor.backward\u001b[0;34m(self, gradient, retain_graph, create_graph, inputs)\u001b[0m\n\u001b[1;32m    571\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m has_torch_function_unary(\u001b[38;5;28mself\u001b[39m):\n\u001b[1;32m    572\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m handle_torch_function(\n\u001b[1;32m    573\u001b[0m         Tensor\u001b[38;5;241m.\u001b[39mbackward,\n\u001b[1;32m    574\u001b[0m         (\u001b[38;5;28mself\u001b[39m,),\n\u001b[0;32m   (...)\u001b[0m\n\u001b[1;32m    579\u001b[0m         inputs\u001b[38;5;241m=\u001b[39minputs,\n\u001b[1;32m    580\u001b[0m     )\n\u001b[0;32m--> 581\u001b[0m \u001b[43mtorch\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mautograd\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mbackward\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    582\u001b[0m \u001b[43m    \u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mgradient\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mretain_graph\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mcreate_graph\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43minputs\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43minputs\u001b[49m\n\u001b[1;32m    583\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/torch/autograd/__init__.py:347\u001b[0m, in \u001b[0;36mbackward\u001b[0;34m(tensors, grad_tensors, retain_graph, create_graph, grad_variables, inputs)\u001b[0m\n\u001b[1;32m    342\u001b[0m     retain_graph \u001b[38;5;241m=\u001b[39m create_graph\n\u001b[1;32m    344\u001b[0m \u001b[38;5;66;03m# The reason we repeat the same comment below is that\u001b[39;00m\n\u001b[1;32m    345\u001b[0m \u001b[38;5;66;03m# some Python versions print out the first line of a multi-line function\u001b[39;00m\n\u001b[1;32m    346\u001b[0m \u001b[38;5;66;03m# calls in the traceback and some print out the last line\u001b[39;00m\n\u001b[0;32m--> 347\u001b[0m \u001b[43m_engine_run_backward\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m    348\u001b[0m \u001b[43m    \u001b[49m\u001b[43mtensors\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    349\u001b[0m \u001b[43m    \u001b[49m\u001b[43mgrad_tensors_\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    350\u001b[0m \u001b[43m    \u001b[49m\u001b[43mretain_graph\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    351\u001b[0m \u001b[43m    \u001b[49m\u001b[43mcreate_graph\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    352\u001b[0m \u001b[43m    \u001b[49m\u001b[43minputs\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m    353\u001b[0m \u001b[43m    \u001b[49m\u001b[43mallow_unreachable\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m    354\u001b[0m \u001b[43m    \u001b[49m\u001b[43maccumulate_grad\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m,\u001b[49m\n\u001b[1;32m    355\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n",
+      "File \u001b[0;32m/fsx/qgallouedec/trl/nb/lib/python3.11/site-packages/torch/autograd/graph.py:825\u001b[0m, in \u001b[0;36m_engine_run_backward\u001b[0;34m(t_outputs, *args, **kwargs)\u001b[0m\n\u001b[1;32m    823\u001b[0m     unregister_hooks \u001b[38;5;241m=\u001b[39m _register_logging_hooks_on_whole_graph(t_outputs)\n\u001b[1;32m    824\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 825\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mVariable\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_execution_engine\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mrun_backward\u001b[49m\u001b[43m(\u001b[49m\u001b[43m  \u001b[49m\u001b[38;5;66;43;03m# Calls into the C++ engine to run the backward pass\u001b[39;49;00m\n\u001b[1;32m    826\u001b[0m \u001b[43m        \u001b[49m\u001b[43mt_outputs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\n\u001b[1;32m    827\u001b[0m \u001b[43m    \u001b[49m\u001b[43m)\u001b[49m  \u001b[38;5;66;03m# Calls into the C++ engine to run the backward pass\u001b[39;00m\n\u001b[1;32m    828\u001b[0m \u001b[38;5;28;01mfinally\u001b[39;00m:\n\u001b[1;32m    829\u001b[0m     \u001b[38;5;28;01mif\u001b[39;00m attach_logging_hooks:\n",
+      "\u001b[0;31mKeyboardInterrupt\u001b[0m: "
+     ]
+    }
+   ],
+   "source": [
+    "# start training, the model will be automatically saved to the hub and the output directory\n",
+    "trainer.train()\n",
+    "\n",
+    "# save model \n",
+    "trainer.save_model()"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The training with Flash Attention for 3 epochs with a dataset of 10k samples took 02:05:58 on a `g6.2xlarge`. The instance costs `1,212$/h` which brings us to a total cost of only `1.8$`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 52,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# free the memory again\n",
+    "del model\n",
+    "del trainer\n",
+    "torch.cuda.empty_cache()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Merge LoRA adapter in to the original model\n",
+    "\n",
+    "When using QLoRA, we only train adapters and not the full model. This means when saving the model during training we only save the adapter weights and not the full model. If you want to save the full model, which makes it easier to use with Text Generation Inference you can merge the adapter weights into the model weights using the `merge_and_unload` method and then save the model with the `save_pretrained` method. This will save a default model, which can be used for inference.\n",
+    "\n",
+    "_Note: This requires > 30GB CPU Memory._"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "07e8fc0add14407c8c32c88abe998e78",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "#### COMMENT IN TO MERGE PEFT AND BASE MODEL ####\n",
+    "from peft import AutoPeftModelForCausalLM\n",
+    "\n",
+    "# Load PEFT model on CPU\n",
+    "model = AutoPeftModelForCausalLM.from_pretrained(\n",
+    "    args.output_dir,\n",
+    "    torch_dtype=torch.float16,\n",
+    "    low_cpu_mem_usage=True,\n",
+    ")  \n",
+    "# Merge LoRA and base model and save\n",
+    "merged_model = model.merge_and_unload()\n",
+    "merged_model.save_pretrained(args.output_dir,safe_serialization=True, max_shard_size=\"2GB\")"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4. Test Model and run Inference\n",
+    "\n",
+    "After the training is done we want to evaluate and test our model. We will load different samples from the original dataset and evaluate the model on those samples, using a simple loop and accuracy as our metric. \n",
+    "\n",
+    "_Note: Evaluating Generative AI models is not a trivial task since 1 input can have multiple correct outputs. If you want to learn more about evaluating generative models, check out [Evaluate LLMs and RAG a practical example using Langchain and Hugging Face](https://www.philschmid.de/evaluate-llm) blog post._\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/opt/conda/envs/pytorch/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: '/opt/conda/envs/pytorch/lib/python3.10/site-packages/torchvision/image.so: undefined symbol: _ZN3c1017RegisterOperatorsD1Ev'If you don't plan on using image functionality from `torchvision.io`, you can ignore this warning. Otherwise, there might be something wrong with your environment. Did you have `libjpeg` or `libpng` installed before building `torchvision` from source?\n",
+      "  warn(\n"
+     ]
+    },
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "89b45d7199c04dbea843232c087488fd",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Loading checkpoint shards:   0%|          | 0/4 [00:00<?, ?it/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "import torch\n",
+    "from transformers import AutoTokenizer, pipeline, AutoModelForCausalLM\n",
+    "\n",
+    "model_id = \"Qwen/Qwen2.5-0.5B-SQL\"\n",
+    "\n",
+    "# Load Model with PEFT adapter\n",
+    "model = AutoModelForCausalLM.from_pretrained(\n",
+    "  model_id,\n",
+    "  # device_map=\"auto\",\n",
+    "  # torch_dtype=torch.float16\n",
+    ")\n",
+    "tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
+    "# load into pipeline\n",
+    "pipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer)"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let’s load our test dataset try to generate an instruction."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "application/vnd.jupyter.widget-view+json": {
+       "model_id": "3844eccf972047eb82384102948ca553",
+       "version_major": 2,
+       "version_minor": 0
+      },
+      "text/plain": [
+       "Generating train split: 0 examples [00:00, ? examples/s]"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "/opt/conda/envs/pytorch/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.1` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.\n",
+      "  warnings.warn(\n",
+      "/opt/conda/envs/pytorch/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:572: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.1` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.\n",
+      "  warnings.warn(\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Query:\n",
+      "What is the Highest first elected year that has a district of 06.0 6, and a committee of economic matters?\n",
+      "Original Answer:\n",
+      "SELECT MAX(first_elected) FROM table_name_99 WHERE district = \"06.0 6\" AND committee = \"economic matters\"\n",
+      "Generated Answer:\n",
+      "SELECT MAX(first_elected) FROM table_name_99 WHERE district = \"06.0 6\" AND committee = \"economic matters\"\n"
+     ]
+    }
+   ],
+   "source": [
+    "from datasets import load_dataset \n",
+    "from random import randint\n",
+    "\n",
+    "\n",
+    "# Load our test dataset\n",
+    "eval_dataset = load_dataset(\"json\", data_files=\"test_dataset.json\", split=\"train\")\n",
+    "rand_idx = randint(0, len(eval_dataset))\n",
+    "\n",
+    "# Test on sample \n",
+    "prompt = pipe.tokenizer.apply_chat_template(eval_dataset[rand_idx][\"messages\"][:2], tokenize=False, add_generation_prompt=True)\n",
+    "outputs = pipe(prompt, max_new_tokens=256, do_sample=False, temperature=0.1, top_k=50, top_p=0.1, eos_token_id=pipe.tokenizer.eos_token_id, pad_token_id=pipe.tokenizer.pad_token_id)\n",
+    "\n",
+    "print(f\"Query:\\n{eval_dataset[rand_idx]['messages'][1]['content']}\")\n",
+    "print(f\"Original Answer:\\n{eval_dataset[rand_idx]['messages'][2]['content']}\")\n",
+    "print(f\"Generated Answer:\\n{outputs[0]['generated_text'][len(prompt):].strip()}\")"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Nice! Our model was able to generate a SQL query based on the natural language instruction. Lets evaluate our model on the full 2,500 samples of our test dataset. \n",
+    "_Note: As mentioned above, evaluating generative models is not a trivial task. In our example we used the accuracy of the generated SQL based on the ground truth SQL query as our metric. An alternative way could be to automatically execute the generated SQL query and compare the results with the ground truth. This would be a more accurate metric but requires more work to setup._"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "  1%|          | 9/1000 [00:12<21:30,  1.30s/it]You seem to be using the pipelines sequentially on GPU. In order to maximize efficiency please use a dataset\n",
+      "100%|██████████| 1000/1000 [24:27<00:00,  1.47s/it]"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Accuracy: 80.00%\n"
+     ]
+    },
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "from tqdm import tqdm\n",
+    "\n",
+    "\n",
+    "def evaluate(sample):\n",
+    "    prompt = pipe.tokenizer.apply_chat_template(sample[\"messages\"][:2], tokenize=False, add_generation_prompt=True)\n",
+    "    outputs = pipe(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95, eos_token_id=pipe.tokenizer.eos_token_id, pad_token_id=pipe.tokenizer.pad_token_id)\n",
+    "    predicted_answer = outputs[0]['generated_text'][len(prompt):].strip()\n",
+    "    if predicted_answer == sample[\"messages\"][2][\"content\"]:\n",
+    "        return 1 \n",
+    "    else:\n",
+    "        return 0\n",
+    "\n",
+    "success_rate = []\n",
+    "number_of_eval_samples = 1000\n",
+    "# iterate over eval dataset and predict\n",
+    "for s in tqdm(eval_dataset.shuffle().select(range(number_of_eval_samples))):\n",
+    "    success_rate.append(evaluate(s))\n",
+    "\n",
+    "# compute accuracy\n",
+    "accuracy = sum(success_rate)/len(success_rate)\n",
+    "\n",
+    "print(f\"Accuracy: {accuracy*100:.2f}%\")  \n",
+    "        "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We evaluated our model on 1000 samples from the evaluation dataset and got an accuracy of 80.00%, which took ~25 minutes. \n",
+    "This is quite good, but as mentioned you need to take this metric with a grain of salt. It would be better if we could evaluate our model by running the qureies against a real database and compare the results. Since there might be different \"correct\" SQL queries for the same instruction. There are also several ways on how we could improve the performance by using few-shot learning, using RAG, Self-healing to generate the SQL query."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 6. Deploy the LLM for Production\n",
+    "\n",
+    "You can now deploy your model to production. For deploying open LLMs into production we recommend using [Text Generation Inference (TGI)](https://github.com/huggingface/text-generation-inference). TGI is a purpose-built solution for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation using Tensor Parallelism and continous batching for the most popular open LLMs, including Llama, Mistral, Mixtral, StarCoder, T5 and more. Text Generation Inference is used by companies as IBM, Grammarly, Uber, Deutsche Telekom, and many more. There are several ways to deploy your model, including:\n",
+    "\n",
+    " * [Deploy LLMs with Hugging Face Inference Endpoints](https://huggingface.co/blog/inference-endpoints-llm)\n",
+    " * [Hugging Face LLM Inference Container for Amazon SageMaker](https://huggingface.co/blog/sagemaker-huggingface-llm)\n",
+    "\n",
+    "If you have docker installed you can use the following command to start the inference server. \n",
+    "\n",
+    "_Note: Make sure that you have enough GPU memory to run the container. Restart kernel to remove all allocated GPU memory from the notebook._ "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Unable to find image 'ghcr.io/huggingface/text-generation-inference:2.2.0' locally\n",
+      "2.2.0: Pulling from huggingface/text-generation-inference\n",
+      "aece8493d397: Already exists\n",
+      "45f7ea5367fe: Already exists\n",
+      "3d97a47c3c73: Already exists\n",
+      "12cd4d19752f: Already exists\n",
+      "da5a484f9d74: Already exists\n",
+      "4f4fb700ef54: Already exists\n",
+      "43566b48e5d6: Already exists\n",
+      "f165933352a8: Already exists\n",
+      "f166ffc7c7b4: Already exists\n",
+      "58165ae83a0e: Already exists\n",
+      "074d930e1b90: Already exists\n",
+      "1033b2636622: Already exists\n",
+      "e0aa534acffe: Already exists\n",
+      "130989d28b48: Already exists\n",
+      "a65ea9ebfaba: Already exists\n",
+      "7225b2c46f88: Already exists\n",
+      "43154e73908f: Already exists\n",
+      "8f400e318724: Already exists\n",
+      "f694acf6c40f: Already exists\n",
+      "44fc79164bc4: Already exists\n",
+      "8bc7c142e917: Already exists\n",
+      "021f7d48bdcb: Already exists\n",
+      "c9d01f7d10cc: Already exists\n",
+      "400740bc31be: Already exists\n",
+      "bd4b49ea4512: Already exists\n",
+      "141228b9bdde: Already exists\n",
+      "4f4fb700ef54: Already exists\n",
+      "34d4a7457184: Already exists\n",
+      "66e724dff43a: Already exists\n",
+      "25c75c242d08: Already exists\n",
+      "6a4be63c7e70: Already exists\n",
+      "b2d83f4bca52: Already exists\n",
+      "373c47aa4b50: Already exists\n",
+      "4f4fb700ef54: Already exists\n",
+      "Digest: sha256:d39d513f13727ffa9b6a4d0e949f36413b944aabc9a236c0aa2986c929906769\n",
+      "Status: Downloaded newer image for ghcr.io/huggingface/text-generation-inference:2.2.0\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "42be7f00ddeb0a3214920a09a5ea303d8eb034942d7020155b6a6761fca87193\n"
+     ]
+    }
+   ],
+   "source": [
+    "%%bash \n",
+    "# model=$PWD/{args.output_dir} # path to model\n",
+    "model=$(pwd)/code-llama-3-1-8b-text-to-sql # path to model\n",
+    "num_shard=1             # number of shards\n",
+    "max_input_length=1024   # max input length\n",
+    "max_total_tokens=2048   # max total tokens\n",
+    "\n",
+    "docker run -d --name tgi --gpus all -ti -p 8080:80 \\\n",
+    "  -e MODEL_ID=/workspace \\\n",
+    "  -e NUM_SHARD=$num_shard \\\n",
+    "  -e MAX_INPUT_LENGTH=$max_input_length \\\n",
+    "  -e MAX_TOTAL_TOKENS=$max_total_tokens \\\n",
+    "  -v $model:/workspace \\\n",
+    "  ghcr.io/huggingface/text-generation-inference:2.2.0"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Once your container is running you can send requests using the `openai` or `huggingface_hub` sdk. Here we ll use the `openai` sdk to send a request to our inference server. If you don't have the `openai` sdk installed you can install it using `pip install openai`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Query:\n",
+      "Name the first elected for kentucky 1\n",
+      "Original Answer:\n",
+      "SELECT first_elected FROM table_2668378_5 WHERE district = \"Kentucky 1\"\n",
+      "Generated Answer:\n",
+      "SELECT first_elected FROM table_2668378_5 WHERE district = \"Kentucky 1\"\n"
+     ]
+    }
+   ],
+   "source": [
+    "from openai import OpenAI\n",
+    "from datasets import load_dataset\n",
+    "from random import randint\n",
+    "\n",
+    "# create client \n",
+    "client = OpenAI(base_url=\"http://localhost:8080/v1\",api_key=\"-\")\n",
+    "\n",
+    "# Load our test dataset\n",
+    "eval_dataset = load_dataset(\"json\", data_files=\"test_dataset.json\", split=\"train\")\n",
+    "rand_idx = randint(0, len(eval_dataset))\n",
+    "\n",
+    "# Take a random sample from the dataset and remove the last message and send it to the model\n",
+    "response = client.chat.completions.create(\n",
+    "\tmodel=\"code-llama-3-1-8b-text-to-sql\",\n",
+    "\tmessages=eval_dataset[rand_idx][\"messages\"][:2],\n",
+    "\tstream=False, # no streaming\n",
+    "\tmax_tokens=1024,\n",
+    ")\n",
+    "response = response.choices[0].message.content\n",
+    "\n",
+    "# Print results\n",
+    "print(f\"Query:\\n{eval_dataset[rand_idx]['messages'][1]['content']}\")\n",
+    "print(f\"Original Answer:\\n{eval_dataset[rand_idx]['messages'][2]['content']}\")\n",
+    "print(f\"Generated Answer:\\n{response}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Awesome, Don't forget to stop your container once you are done."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "tgi\n"
+     ]
+    }
+   ],
+   "source": [
+    "!docker stop tgi"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "pytorch",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.10"
+  },
+  "orig_nbformat": 4,
+  "vscode": {
+   "interpreter": {
+    "hash": "2d58e898dde0263bc564c6968b04150abacfd33eed9b19aaa8e45c040360e146"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/examples/notebooks/sft.ipynb b/examples/notebooks/sft.ipynb
new file mode 100644
index 0000000000..ff46b31bd4
--- /dev/null
+++ b/examples/notebooks/sft.ipynb
@@ -0,0 +1,122 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Build Your Chatbot with Supervised Fine-Tuning\n",
+    "\n",
+    "SFT, or Supervised Fine-Tuning, is a method used in the development of Large Language Models (LLMs) to improve model performance by training it on specific tasks or datasets with labeled examples. It’s an essential process for aligning a base model—often a general-purpose LLM that’s been pre-trained on vast amounts of unsupervised text data—to perform well on more specialized tasks or to follow specific user instructions effectively.\n",
+    "\n",
+    "In this notebook, we will demonstrate how to fine-tune a pre-trained Qwen model on a conversational dataset to create a chatbot."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Pre-requisites\n",
+    "\n",
+    "To run this notebook, you need to install TRL:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "%pip install trl"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load the model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from transformers import AutoTokenizer, AutoModelForCausalLM\n",
+    "\n",
+    "model_id = \"Qwen/Qwen2.5-0.5B\"\n",
+    "\n",
+    "tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
+    "model = AutoModelForCausalLM.from_pretrained(model_id)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's try the model a bit before fine-tuning it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stderr",
+     "output_type": "stream",
+     "text": [
+      "Device set to use cuda\n"
+     ]
+    },
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "Once upon a time, there was a little girl named Lily. She loved to play with her toys and watch the stars. One day, she found a special toy that could make her see in different colors. She named it the \"Color Wheel.\" \n",
+      "\n",
+      "One day, Lily decided to play with her toy and saw a rainbow of colors. She noticed that the colors were not in order, but she wanted to know how to make them look like they were in order. \n",
+      "\n",
+      "Lily thought about it and decided to use her \"Color Wheel\" to help her. She started by looking at the colors in the rainbow and trying to figure out how to make them look like they were\n"
+     ]
+    }
+   ],
+   "source": [
+    "from transformers import pipeline\n",
+    "\n",
+    "prompt = \"Once upon a time, there was a\"\n",
+    "generator = pipeline(\"text-generation\", model=model, tokenizer=tokenizer, device=\"cuda\")\n",
+    "output = generator(prompt, max_new_tokens=128)[0]\n",
+    "print(output[\"generated_text\"])\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Fascinating story."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "trl",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.9"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}