| Reinforcement Learning | [`GRPOTrainer`] | Post training an LLM for reasoning with GRPO in TRL | [Sergio Paniego](https://huggingface.co/sergiopaniego) | [Link](https://huggingface.co/learn/cookbook/fine_tuning_llm_grpo_trl) | [](https://colab.research.google.com/github/huggingface/cookbook/blob/main/notebooks/en/fine_tuning_llm_grpo_trl.ipynb) |
@@ -15,16 +17,21 @@ Community tutorials are made by active members of the Hugging Face community who
 | Preference Optimization | [`ORPOTrainer`] | Fine-tuning Llama 3 with ORPO combining instruction tuning and preference alignment | [Maxime Labonne](https://huggingface.co/mlabonne) | [Link](https://mlabonne.github.io/blog/posts/2024-04-19_Fine_tune_Llama_3_with_ORPO.html) | [](https://colab.research.google.com/drive/1eHNWg9gnaXErdAa8_mcvjMupbSS6rDvi) |
 | Instruction tuning | [`SFTTrainer`] | How to fine-tune open LLMs in 2025 with Hugging Face | [Philipp Schmid](https://huggingface.co/philschmid) | [Link](https://www.philschmid.de/fine-tune-llms-in-2025) | [](https://colab.research.google.com/github/philschmid/deep-learning-pytorch-huggingface/blob/main/training/fine-tune-llms-in-2025.ipynb) |
 
-<Youtube id="cnGyyM0vOes" />
 
-<Youtube id="jKdXv3BiLu0" />
+### Videos
+
+| Task | Title | Author | Video |
+| --- | --- | --- | --- |
+| Instruction tuning | Fine-tuning open AI models using Hugging Face TRL | [Wietse Venema](https://huggingface.co/wietsevenema) | [<img src="https://img.youtube.com/vi/cnGyyM0vOes/0.jpg">](https://youtu.be/cnGyyM0vOes) |
+| Instruction tuning | How to fine-tune a smol-LM with Hugging Face, TRL, and the smoltalk Dataset | [Mayurji](https://huggingface.co/iammayur) | [<img src="https://img.youtube.com/vi/jKdXv3BiLu0/0.jpg">](https://youtu.be/jKdXv3BiLu0) |
+
 
 <details>
-<summary>⚠️ Deprecated features notice (click to expand)</summary>
+<summary>⚠️ Deprecated features notice for "How to fine-tune a smol-LM with Hugging Face, TRL, and the smoltalk Dataset" (click to expand)</summary>
 
 <Tip warning={true}>
 
-The tutorial above uses two deprecated features:
+The tutorial uses two deprecated features:
 - `SFTTrainer(..., tokenizer=tokenizer)`: Use `SFTTrainer(..., processing_class=tokenizer)` instead, or simply omit it (it will be inferred from the model).
 - `setup_chat_format(model, tokenizer)`: Use `SFTConfig(..., chat_template_path="Qwen/Qwen3-0.6B")`, where `chat_template_path` specifies the model whose chat template you want to copy.
 
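For reviewers following the deprecation notice, the two replacements it names could look roughly like the sketch below. This is an illustration and not part of the diff or the tutorial: the helper names `sft_config_kwargs` and `build_trainer`, the `output_dir` value, and the idea of passing `model_id` and `train_dataset` in as arguments are all assumptions. Only `processing_class=tokenizer` and `chat_template_path="Qwen/Qwen3-0.6B"` come from the notice itself. The heavy imports are deferred so the module can be loaded without `trl` installed.

```python
def sft_config_kwargs(output_dir, template_source="Qwen/Qwen3-0.6B"):
    """Keyword arguments for SFTConfig (hypothetical helper).

    `chat_template_path` replaces the deprecated `setup_chat_format(model, tokenizer)`
    call: it names the model whose chat template should be copied.
    """
    return {"output_dir": output_dir, "chat_template_path": template_source}


def build_trainer(model_id, train_dataset, template_source="Qwen/Qwen3-0.6B"):
    """Build an SFTTrainer without the two deprecated features (hypothetical helper)."""
    # Deferred imports: these require `transformers` and `trl` to be installed.
    from transformers import AutoTokenizer
    from trl import SFTConfig, SFTTrainer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    args = SFTConfig(**sft_config_kwargs("sft-output", template_source))

    return SFTTrainer(
        model=model_id,
        args=args,
        train_dataset=train_dataset,
        # Replaces the deprecated `tokenizer=tokenizer`; per the notice it can
        # also be omitted, in which case it is inferred from the model.
        processing_class=tokenizer,
    )
```

Calling `build_trainer(...)` downloads the tokenizer and model, so it is best tried in the same environment as the tutorial notebook.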
@@ -34,6 +41,8 @@ The tutorial above uses two deprecated features: