You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: docs/source/example_overview.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -29,6 +29,7 @@ These notebooks are easier to run and are designed for quick experimentation wit
29
29
30
30
| Notebook | Description | Open in Colab |
31
31
|----------|-------------|---------------|
32
+
|[`openenv_wordle_grpo.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/openenv_wordle_grpo.ipynb)| GRPO to play Worldle on an OpenEnv environment |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb)|
32
33
|[`sft_trl_lora_qlora.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/sft_trl_lora_qlora.ipynb)| Supervised Fine-Tuning (SFT) using QLoRA on free Colab |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_trl_lora_qlora.ipynb)|
33
34
|[`sft_qwen_vl.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/sft_qwen_vl.ipynb)| Supervised Fine-Tuning (SFT) Qwen3-VL with QLoRA using TRL on free Colab |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb)|
34
35
|[`grpo_qwen3_vl.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/grpo_qwen3_vl.ipynb)| GRPO Qwen3-VL with QLoRA using TRL on free Colab |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb)|
Copy file name to clipboardExpand all lines: examples/notebooks/README.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -4,6 +4,7 @@ This directory contains a collection of Jupyter notebooks that demonstrate how t
4
4
5
5
| Notebook | Description | Open in Colab |
6
6
| --- | --- | --- |
7
+
|[`openenv_wordle_grpo.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/openenv_wordle_grpo.ipynb)| GRPO to play Worldle on an OpenEnv environment |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/openenv_wordle_grpo.ipynb)|
7
8
|[`sft_trl_lora_qlora.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/sft_trl_lora_qlora.ipynb)| Supervised Fine-Tuning (SFT) using QLoRA on free Colab |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_trl_lora_qlora.ipynb)|
8
9
|[`sft_qwen_vl.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/sft_qwen_vl.ipynb)| Supervised Fine-Tuning (SFT) Qwen3-VL with QLoRA using TRL on free Colab |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/sft_qwen_vl.ipynb)|
9
10
|[`grpo_qwen3_vl.ipynb`](https://github.com/huggingface/trl/tree/main/examples/notebooks/grpo_qwen3_vl.ipynb)| GRPO Qwen3-VL with QLoRA using TRL on free Colab |[](https://colab.research.google.com/github/huggingface/trl/blob/main/examples/notebooks/grpo_qwen3_vl.ipynb)|
0 commit comments