Merged
4 changes: 2 additions & 2 deletions .github/workflows/docker/docker-compose.yaml
@@ -1,6 +1,6 @@
services:
trinity-node-1:
- image: trinity-rft-unittest:20250918
+ image: trinity-rft-unittest:20250924
pull_policy: never
command: sh -c "pip install -e .[dev] && ray start --head --dashboard-host 0.0.0.0 --include-dashboard true --block"
environment:
@@ -28,7 +28,7 @@ services:
capabilities: [gpu]

trinity-node-2:
- image: trinity-rft-unittest:20250918
+ image: trinity-rft-unittest:20250924
pull_policy: never
command: sh -c "pip install -e .[dev] && ray start --address=trinity-node-1:6379 --block"
environment:
133 changes: 53 additions & 80 deletions README.md
@@ -18,47 +18,36 @@

</div>


## 🚀 News

* [2025-09] ✨ [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.0)] Trinity-RFT v0.3.0 released: enhanced Buffer, FSDP2 & Megatron support, multi-modal models, and new RL algorithms/examples.
* [2025-08] 🎵 Introducing [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord): dynamic SFT + RL integration for advanced LLM fine-tuning ([paper](https://arxiv.org/pdf/2508.11408)).
* [2025-08] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.2.1)] Trinity-RFT v0.2.1 released.
* [2025-07] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.2.0)] Trinity-RFT v0.2.0 released.
* [2025-07] Technical report (arXiv v2) updated with new features, examples, and experiments: [link](https://arxiv.org/abs/2505.17826).
* [2025-06] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.1.1)] Trinity-RFT v0.1.1 released.
* [2025-05] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.1.0)] Trinity-RFT v0.1.0 released, plus [technical report](https://arxiv.org/abs/2505.17826).
* [2025-04] Trinity-RFT open sourced.


## 💡 What is Trinity-RFT?

Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuning (RFT) of large language models (LLMs). It supports a wide range of applications and provides a unified platform for RL research in the [era of experience](https://storage.googleapis.com/deepmind-media/Era-of-Experience%20/The%20Era%20of%20Experience%20Paper.pdf).
Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuning (RFT) of large language models (LLMs). It provides three independent modules for users with different needs:

The RFT process is modularized into three core components:
* 🤖 **Explorer**: For agent application developers. [[tutorial]](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_workflow.html)
- Train an agent application to enhance its ability to complete tasks in a specified environment
- Examples: [Multi-Turn Interaction](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_multi_turn.html), [ReAct Agent](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_react.html)

* **Explorer**: Handles agent-environment interaction
* **Trainer**: Manages model training
* **Buffer**: Manages data storage and processing
* 🧠 **Trainer**: For RL algorithm researchers. [[tutorial]](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_algorithm.html)
- Design and validate new RL algorithms in compact, plug-and-play classes
- Examples: [Mixture of RL Algorithms](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_mix_algo.html)

* 🗄️ **Buffer**: For data engineers. [[tutorial]](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/develop_operator.html)
- Design task-specific datasets and build data pipelines for cleaning, augmentation, and human-in-the-loop scenarios
- Examples: [Data Functionalities](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_data_functionalities.html)

<img src="https://img.alicdn.com/imgextra/i2/O1CN01H3UbpF1yP7E1OCLbi_!!6000000006570-2-tps-1334-638.png" alt="The high-level design of Trinity-RFT" width="800" />



## ✨ Key Features
Trinity-RFT unifies the above three modules and provides the following key features:

* **Flexible RFT Modes:**
- Supports synchronous/asynchronous, on-policy/off-policy, and online/offline training. Rollout and training can run separately and scale independently across devices.

<img src="https://img.alicdn.com/imgextra/i3/O1CN01E7NskS1FFoTI9jlaQ_!!6000000000458-2-tps-1458-682.png" alt="RFT modes supported by Trinity-RFT" width="600" />

* **Agent Framework Compatible Workflows:**
- Supports both concatenated and general multi-turn agentic workflows. Automatically collects training data from model API clients (e.g., OpenAI) and is compatible with agent frameworks like AgentScope.
* **General Agentic-RL Support:**
- Supports both concatenated and general multi-turn agentic workflows. Able to directly train agent applications developed using agent frameworks like AgentScope.

<img src="https://img.alicdn.com/imgextra/i1/O1CN01z1i7kk1jlMEVa8ZHV_!!6000000004588-2-tps-1262-695.png" alt="Agentic workflows" width="600" />

* **Powerful Data Pipelines:**
* **Full Lifecycle Data Pipelines:**
- Enables pipeline processing of rollout and experience data, supporting active management (prioritization, cleaning, augmentation) throughout the RFT lifecycle.

<img src="https://img.alicdn.com/imgextra/i2/O1CN01BfeHp61sXSlGjH7zQ_!!6000000005776-2-tps-1734-473.png" alt="Data pipeline design" width="600" />
@@ -69,26 +58,24 @@ The RFT process is modularized into three core components:
<img src="https://img.alicdn.com/imgextra/i1/O1CN01Ti0o4320RywoAuyhN_!!6000000006847-2-tps-3840-2134.png" alt="System architecture" width="600" />


## 🚀 News

* [2025-09] ✨ [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.0)] Trinity-RFT v0.3.0 released: enhanced Buffer, FSDP2 & Megatron support, multi-modal models, and new RL algorithms/examples.
* [2025-08] 🎵 Introducing [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord): dynamic SFT + RL integration for advanced LLM fine-tuning ([paper](https://arxiv.org/pdf/2508.11408)).
* [2025-08] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.2.1)] Trinity-RFT v0.2.1 released.
* [2025-07] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.2.0)] Trinity-RFT v0.2.0 released.
* [2025-07] Technical report (arXiv v2) updated with new features, examples, and experiments: [link](https://arxiv.org/abs/2505.17826).
* [2025-06] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.1.1)] Trinity-RFT v0.1.1 released.
* [2025-05] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.1.0)] Trinity-RFT v0.1.0 released, plus [technical report](https://arxiv.org/abs/2505.17826).
* [2025-04] Trinity-RFT open sourced.

## 🛠️ What can I use Trinity-RFT for?

* **Train agent applications with RL and minimal migration cost** [[Tutorial]](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_programming_guide.html#workflows-for-rl-environment-developers)
- Implement agent-environment interaction logic in a single workflow class ([example1](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_multi_turn.html), [example2](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_step_wise.html)),
- Or import workflows from agent frameworks like AgentScope ([example](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_react.html)).

* **Rapid RL algorithm design and validation** [[Tutorial]](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_programming_guide.html#algorithms-for-rl-algorithm-developers)
- Develop custom RL algorithms (loss design, sampling strategy, etc.) in compact, plug-and-play classes ([example](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_mix_algo.html)).

* **Custom datasets and data pipelines for RFT** [[Tutorial]](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/trinity_programming_guide.html#operators-for-data-developers)
- Design task-specific datasets and build data pipelines for cleaning, augmentation, and human-in-the-loop scenarios ([example](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_data_functionalities.html)).

---

## Table of contents


- [Getting started](#getting-started)
- [Quick Start](#quick-start)
- [Step 1: installation](#step-1-installation)
- [Step 2: prepare dataset and model](#step-2-prepare-dataset-and-model)
- [Step 3: configurations](#step-3-configurations)
@@ -101,7 +88,7 @@ The RFT process is modularized into three core components:



## Getting started
## Quick Start


> [!NOTE]
@@ -110,18 +97,16 @@ The RFT process is modularized into three core components:

### Step 1: installation

#### Prerequisites

Before installing, make sure your system meets the following requirements:

- **Python**: version 3.10 to 3.12 (inclusive)
- **CUDA**: version 12.4 to 12.8 (inclusive)
- **GPUs**: at least 2 GPUs
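
The requirements above can be turned into a quick pre-install check. The following is only a minimal sketch, not part of Trinity-RFT: the helper names (`parse_version`, `within`, `check_prerequisites`) are hypothetical, and the CUDA version and GPU count are passed in as plain values because how you detect them depends on your machine.

```python
import sys

# Inclusive bounds copied from the prerequisites list above.
PYTHON_BOUNDS = ((3, 10), (3, 12))
CUDA_BOUNDS = ((12, 4), (12, 8))
MIN_GPUS = 2


def parse_version(text):
    """Turn a string such as '12.4' or '12.4.1' into a (major, minor) tuple."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def within(version, bounds):
    """Return True if (major, minor) lies inside the inclusive bounds."""
    low, high = bounds
    return low <= tuple(version[:2]) <= high


def check_prerequisites(python_version, cuda_version, gpu_count):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if not within(python_version, PYTHON_BOUNDS):
        problems.append("Python %d.%d is outside 3.10-3.12" % tuple(python_version[:2]))
    if not within(cuda_version, CUDA_BOUNDS):
        problems.append("CUDA %d.%d is outside 12.4-12.8" % tuple(cuda_version[:2]))
    if gpu_count < MIN_GPUS:
        problems.append("found %d GPU(s), need at least %d" % (gpu_count, MIN_GPUS))
    return problems


# Example: check the running interpreter against a hypothetical machine
# that reports CUDA 12.4 and 2 GPUs (both values are placeholders).
print(check_prerequisites(sys.version_info[:2], parse_version("12.4"), 2))
```

Keeping detection separate from validation means the same function works whether the CUDA version and GPU count come from a driver tool, a framework call, or a configuration file.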


#### Option A: Install from Source (Recommended)
#### From Source (Recommended)

This method gives you full control and is best if you plan to customize or contribute to the project.
If you plan to customize or contribute to Trinity-RFT, this is the best option.

##### 1. Clone the Repository

@@ -132,81 +117,71 @@ cd Trinity-RFT

##### 2. Set Up a Virtual Environment

Choose one of the following options to create an isolated environment:
Choose one of the following options:

###### Using Conda

```bash
conda create -n trinity python=3.10
conda activate trinity

pip install -e ".[dev]"
pip install -e ".[flash_attn]"
# if you encounter issues when installing flash-attn, try:
# pip install flash-attn==2.8.1 --no-build-isolation
```

###### Using venv

```bash
python3.10 -m venv .venv
source .venv/bin/activate
```

##### 3. Install the Package

Install in editable mode so you can make changes without reinstalling:

```bash
pip install -e ".[dev]"
pip install -e ".[flash_attn]"
# if you encounter issues when installing flash-attn, try:
# pip install flash-attn==2.8.1 --no-build-isolation
```

##### 4. Install Flash Attention

Flash Attention boosts training speed. It takes a few minutes to compile — please be patient!

```bash
pip install flash-attn==2.8.1
```
###### Using `uv`

If you encounter issues during installation, try this alternative:
[`uv`](https://github.com/astral-sh/uv) is a modern Python package installer.

```bash
pip install flash-attn==2.8.1 --no-build-isolation
uv sync --extra dev --extra flash_attn
```


##### ⚡ Fast Alternative: Use `uv` (Optional)
#### Via PyPI

If you'd like a faster installation, try [`uv`](https://github.com/astral-sh/uv), a modern Python package installer:
If you just want to use the package without modifying the code:

```bash
uv venv
source .venv/bin/activate

uv pip install -e ".[dev]"
uv pip install flash-attn==2.8.1 --no-build-isolation
pip install trinity-rft==0.3.0
pip install flash-attn==2.8.1
```

#### Option B: Install via pip (Quick Start)

If you just want to use the package without modifying the code:
Or with `uv`:

```bash
pip install trinity-rft==0.3.0
pip install flash-attn==2.8.1 # Install Flash Attention separately

# Use uv to install trinity-rft
# uv pip install trinity-rft==0.3.0
# uv pip install flash-attn==2.8.1
uv pip install trinity-rft==0.3.0
uv pip install flash-attn==2.8.1
```

#### Option C: Use Docker

#### Using Docker

We provide a Docker setup for hassle-free environment configuration.

```bash
git clone https://github.com/modelscope/Trinity-RFT
cd Trinity-RFT

## Build the Docker image
# Build the Docker image
## Tip: You can modify the Dockerfile to add mirrors or set API keys
docker build -f scripts/docker/Dockerfile -t trinity-rft:latest .

## Run the container
# Run the container, replacing <path_to_your_data_and_checkpoints> with your actual path
docker run -it \
--gpus all \
--shm-size="64g" \
@@ -216,9 +191,7 @@ docker run -it \
trinity-rft:latest
```

💡 **Note**: Replace `<path_to_your_data_and_checkpoints>` with the actual path on your machine where datasets and model checkpoints are stored.

> If you'd like to integrate with **Megatron-LM**, check out our [example setup guide for Megatron](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_megatron.html).
> For training with **Megatron-LM**, please refer to [Megatron-LM Backend](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/example_megatron.html).

### Step 2: prepare dataset and model
