πŸ‘¨β€πŸ« smol course links and badges (#2484)
* smol course links and badges

* try without space

* revert space
qgallouedec authored Dec 15, 2024
1 parent 117c6d4 commit aeca637
Showing 4 changed files with 5 additions and 7 deletions.
2 changes: 1 addition & 1 deletion docs/source/dpo_trainer.mdx
@@ -1,6 +1,6 @@
# DPO Trainer

-[![](https://img.shields.io/badge/All_models-DPO-blue)](https://huggingface.co/models?other=dpo,trl)
+[![](https://img.shields.io/badge/All_models-DPO-blue)](https://huggingface.co/models?other=dpo,trl) [![](https://img.shields.io/badge/smol_course-Chapter_2-yellow)](https://github.com/huggingface/smol-course/tree/main/2_preference_alignment)

## Overview

6 changes: 2 additions & 4 deletions docs/source/index.mdx
@@ -7,11 +7,9 @@
TRL is a full stack library where we provide a set of tools to train transformer language models with Reinforcement Learning, from the Supervised Fine-tuning step (SFT), Reward Modeling step (RM) to the Proximal Policy Optimization (PPO) step.
The library is integrated with πŸ€— [transformers](https://github.com/huggingface/transformers).

-<div style="text-align: center">
-<img src="https://huggingface.co/datasets/trl-internal-testing/example-images/resolve/main/images/TRL-readme.png">
-</div>
+## Learn post-training

-Check the appropriate sections of the documentation depending on your needs:
+Learn post-training with the πŸ€— [smol course](https://github.com/huggingface/smol-course).

## API documentation

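The index.mdx paragraph retained above lists supervised fine-tuning (SFT), reward modeling (RM), and PPO as the post-training steps TRL covers. Purely as an illustration (it is not part of this commit), a reward-modeling run might look like the minimal sketch below; it assumes a recent TRL release (>= 0.12), and the checkpoint, dataset, and output names are placeholders borrowed from the TRL examples.

```python
# Illustrative reward-modeling (RM) sketch, not part of this commit.
# Assumes: trl >= 0.12 and a preference dataset with "chosen"/"rejected" columns.
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import RewardConfig, RewardTrainer

model_id = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=1)
model.config.pad_token_id = tokenizer.pad_token_id  # the scoring head needs a pad token

dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

training_args = RewardConfig(output_dir="reward-model", per_device_train_batch_size=2)
trainer = RewardTrainer(
    model=model,
    args=training_args,
    processing_class=tokenizer,
    train_dataset=dataset,
)
trainer.train()
```

The DPO and ORPO trainers whose pages gain badges in this commit follow the same constructor pattern, swapping in `DPOTrainer`/`ORPOTrainer` and their matching config classes.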
2 changes: 1 addition & 1 deletion docs/source/orpo_trainer.md
@@ -1,6 +1,6 @@
# ORPO Trainer

-[![](https://img.shields.io/badge/All_models-ORPO-blue)](https://huggingface.co/models?other=orpo,trl)
+[![](https://img.shields.io/badge/All_models-ORPO-blue)](https://huggingface.co/models?other=orpo,trl) [![](https://img.shields.io/badge/smol_course-Chapter_2-yellow)](https://github.com/huggingface/smol-course/tree/main/2_preference_alignment)

## Overview

2 changes: 1 addition & 1 deletion docs/source/sft_trainer.mdx
@@ -1,6 +1,6 @@
# Supervised Fine-tuning Trainer

-[![](https://img.shields.io/badge/All_models-SFT-blue)](https://huggingface.co/models?other=sft,trl)
+[![](https://img.shields.io/badge/All_models-SFT-blue)](https://huggingface.co/models?other=sft,trl) [![](https://img.shields.io/badge/smol_course-Chapter_1-yellow)](https://github.com/huggingface/smol-course/tree/main/1_instruction_tuning)

Supervised fine-tuning (or SFT for short) is a crucial step in RLHF. In TRL we provide an easy-to-use API to create your SFT models and train them with few lines of code on your dataset.

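The sft_trainer.mdx intro kept by this hunk promises training "with few lines of code". For reference only (not part of the diff), a minimal SFT sketch along those lines is shown below; it assumes TRL >= 0.12, and the model and dataset identifiers are placeholders taken from the TRL examples.

```python
# Minimal supervised fine-tuning (SFT) sketch, not part of this commit.
# Assumes: trl >= 0.12; model and dataset ids are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

dataset = load_dataset("trl-lib/Capybara", split="train")  # small chat-format dataset

training_args = SFTConfig(output_dir="sft-model")
trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # passing a model id as a string lets SFTTrainer load it
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

Chapter 1 of the smol course linked by the new badge walks through this instruction-tuning workflow step by step.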
