Merged
1 change: 1 addition & 0 deletions .github/workflows/docker.yaml
@@ -50,6 +50,7 @@ jobs:
tags: |
type=match,pattern=\d.\d.\d,enable=${{ github.event_name == 'release' }}
type=raw,value=${{ github.event.inputs.version }},enable=${{ github.event_name == 'workflow_dispatch' }}
type=raw,value=latest,enable={{is_default_branch}}
# This step uses the `docker/build-push-action` action to build the image, based on your repository's `Dockerfile`. If the build succeeds, it pushes the image to GitHub Packages.
# It uses the `context` parameter to define the build's context as the set of files located in the specified path. For more information, see [Usage](https://github.com/docker/build-push-action#usage) in the README of the `docker/build-push-action` repository.
# It uses the `tags` and `labels` parameters to tag and label the image with the output from the "meta" step.
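A side note on the `type=match` pattern above: the dots in `\d.\d.\d` are unescaped, so in regex terms each `.` matches any character, not only a literal dot. A minimal sketch of the difference (illustrative only, not part of the workflow):

```python
import re

# Tag pattern as written in the workflow: dots are unescaped wildcards.
loose = re.compile(r"\d.\d.\d")
# An escaped variant that matches only dotted versions like 1.2.3.
strict = re.compile(r"\d\.\d\.\d")

print(bool(loose.search("v1.2.3")))   # True
print(bool(loose.search("1a2b3")))    # True — "." matched "a" and "b"
print(bool(strict.search("1a2b3")))   # False
```

In practice the loose pattern still selects release tags correctly, but the escaped form is safer if other tag names contain digit runs.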
4 changes: 2 additions & 2 deletions .github/workflows/docker/docker-compose.yaml
@@ -1,6 +1,6 @@
services:
trinity-node-1:
image: trinity-rft-unittest:20260126
image: trinity-rft-unittest:20260205
cap_add:
- SYS_PTRACE
pull_policy: never
@@ -32,7 +32,7 @@ services:
capabilities: [gpu]

trinity-node-2:
image: trinity-rft-unittest:20260126
image: trinity-rft-unittest:20260205
cap_add:
- SYS_PTRACE
pull_policy: never
3 changes: 2 additions & 1 deletion README.md
@@ -32,6 +32,7 @@ Trinity-RFT provides functionalities for users with different backgrounds and objectives:

## 🚀 News

* [2026-02] [[Release Notes]](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.5.0) Trinity-RFT v0.5.0 released: colocate mode for single-GPU scenarios, trainer-driven weight synchronization, automatic parallelism-setting suggestions, and more.
* [2026-01] 🎉 Three papers accepted by ICLR 2026: [CHORD](https://github.com/agentscope-ai/Trinity-RFT/tree/main/examples/mix_chord), [BOTS](https://github.com/agentscope-ai/Trinity-RFT/tree/main/examples/bots), and [Group-relative REINFORCE variants](https://github.com/agentscope-ai/Trinity-RFT/tree/main/examples/rec_gsm8k). Try out these new algorithms in Trinity-RFT!
* [2026-01] [[Release Notes]](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.4.1) Trinity-RFT v0.4.1 released: upgraded verl to v0.7.0, added OpenAI API support to the Tinker backend, and fixed several bugs.
* [2026-01] Introducing [R3L](https://github.com/shiweijiezero/R3L): a systematic reflect-then-retry RL mechanism with efficient language-guided exploration and stable off-policy learning ([paper](https://arxiv.org/abs/2601.03715)).
@@ -165,7 +166,7 @@ Before installing, make sure your system meets the following requirements:

- Python: version 3.10 to 3.12 (inclusive)
- CUDA: version >= 12.8
- GPUs: at least 2 GPUs
- GPUs: At least one NVIDIA GPU with [compute capability](https://developer.nvidia.com/cuda/gpus) 8.0 or higher (e.g., RTX 30 series, A100, H100)
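The capability requirement above can be checked programmatically. A minimal sketch — the helper name is ours, and the commented `torch` call assumes PyTorch is installed:

```python
def meets_min_capability(capability, minimum=(8, 0)):
    """Return True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)

# With PyTorch installed, the actual device capability can be queried:
# import torch
# if torch.cuda.is_available():
#     print(meets_min_capability(torch.cuda.get_device_capability(0)))

print(meets_min_capability((8, 0)))  # e.g. A100 → True
print(meets_min_capability((7, 5)))  # e.g. T4 → False
```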

**Recommended for first-time users:**

3 changes: 2 additions & 1 deletion README_zh.md
@@ -41,6 +41,7 @@ Trinity-RFT 面向不同背景和目标的用户提供相应功能:

## 🚀 新闻

* [2026-02] [[发布说明]](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.5.0) Trinity-RFT v0.5.0 发布:单 GPU 场景下的 colocate 模式,trainer 驱动的权重同步,自动并行设置建议等新功能。
* [2026-01] 🎉 三篇论文被 ICLR 2026 接收:[CHORD](https://github.com/agentscope-ai/Trinity-RFT/tree/main/examples/mix_chord)、[BOTS](https://github.com/agentscope-ai/Trinity-RFT/tree/main/examples/bots) 和 [Group-relative REINFORCE 系列变种](https://github.com/agentscope-ai/Trinity-RFT/tree/main/examples/rec_gsm8k)。在 Trinity-RFT 中尝试这些新算法吧!
* [2026-01] [[发布说明]](https://github.com/agentscope-ai/Trinity-RFT/releases/tag/v0.4.1) Trinity-RFT v0.4.1 发布:升级 verl 至 v0.7.0,Tinker 后端支持 OpenAI API,修复若干 Bug。
* [2026-01] 推出 [R3L](https://github.com/shiweijiezero/R3L):基于反思-重试的强化学习机制,由自然语言反馈引导高效探索,并达成稳定的 off-policy 学习([论文](https://arxiv.org/abs/2601.03715))。
@@ -171,7 +172,7 @@ Trinity-RFT 面向不同背景和目标的用户提供相应功能:

- **Python**:版本 3.10 至 3.12(含)
- **CUDA**:版本 >= 12.8
- **GPU**:至少 2 块 GPU
- **GPU**:至少一块 [compute capability](https://developer.nvidia.com/cuda/gpus) 为 8.0 或更高的 NVIDIA GPU(例如 RTX 30 系列、A100、H100)

## 源码安装(推荐)

38 changes: 19 additions & 19 deletions docs/sphinx_doc/source/tutorial/trinity_installation.md
@@ -11,7 +11,7 @@ Make sure your system meets these requirements:

- **Python**: 3.10 – 3.12
- **CUDA**: 12.8 or higher
- **GPUs**: At least 2 available
- **GPUs**: At least one NVIDIA GPU with [compute capability](https://developer.nvidia.com/cuda/gpus) 8.0 or higher (e.g., RTX 30 series, A100, H100)

### If you don’t have GPUs or prefer not to use them

@@ -84,24 +84,6 @@ uv sync --extra vllm --extra dev --extra flash_attn

---

## Via PyPI

If you just want to use the package without modifying the code:

```bash
pip install trinity-rft
pip install flash-attn==2.8.1
```

Or with `uv`:

```bash
uv pip install trinity-rft
uv pip install flash-attn==2.8.1
```

---

## Using Docker

You can download the Trinity-RFT Docker image from the GitHub Container Registry or build it locally.
@@ -150,6 +132,24 @@ docker run -it \

---

## Via PyPI

If you just want to use the package without modifying the code:

```bash
pip install trinity-rft
pip install flash-attn==2.8.1 --no-build-isolation
```

Or with `uv`:

```bash
uv pip install trinity-rft
uv pip install flash-attn==2.8.1 --no-build-isolation
```
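After either install route, a quick sanity check that the expected version landed can help. The tuple-based comparison below is a simplification that ignores pre-release suffixes such as `.dev0`:

```python
def version_tuple(version):
    """Turn a dotted release version like '0.5.0' into (0, 5, 0)."""
    return tuple(int(part) for part in version.split("."))

# With trinity-rft installed, its version can be read from package metadata:
# from importlib import metadata
# print(metadata.version("trinity-rft"))

print(version_tuple("0.5.0") >= version_tuple("0.4.1"))  # True
```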

---

```{note}
For training with **Megatron-LM**, please refer to {ref}`Megatron-LM Backend <Megatron-LM>`.
```
6 changes: 3 additions & 3 deletions docs/sphinx_doc/source_zh/tutorial/trinity_installation.md
@@ -11,7 +11,7 @@

- **Python**:3.10 – 3.12
- **CUDA**:12.8 或更高版本
- **GPU**:至少 2 块可用
- **GPU**:至少一块 [compute capability](https://developer.nvidia.com/cuda/gpus) 为 8.0 或更高的 NVIDIA GPU(例如 RTX 30 系列、A100、H100)

### 如果您没有 GPU 或不希望使用 GPU

@@ -139,14 +139,14 @@ docker run -it \

```bash
pip install trinity-rft
pip install flash-attn==2.8.1
pip install flash-attn==2.8.1 --no-build-isolation
```

或使用 `uv`:

```bash
uv pip install trinity-rft
uv pip install flash-attn==2.8.1
uv pip install flash-attn==2.8.1 --no-build-isolation
```

---
6 changes: 3 additions & 3 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "trinity-rft"
version = "0.5.0.dev0"
version = "0.5.0"
authors = [
{name="Trinity-RFT Team", email="trinity-rft@outlook.com"},
]
@@ -50,9 +50,9 @@ trinity = "trinity.cli.launcher:main"

[project.optional-dependencies]
vllm = [
"vllm>=0.10.2,<=0.14.1,!=0.12.0",
"vllm>=0.10.2,<=0.15.1,!=0.12.0",
# v0.12.0 has a huge performance regression so we exclude it
# v0.10.2 is the most stable version, but we allow up to 0.14.1 for new features
# v0.10.2 is the most stable version, but we allow up to 0.15.1 for new features
]
data = [
"py-data-juicer>=1.4.3"
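The `vllm` constraint above combines a range with a point exclusion. A minimal sketch of the same logic, using naive integer-tuple comparison (the real resolver uses PEP 440 semantics, which also handle pre-release versions):

```python
def vllm_version_supported(version):
    """Mirror the constraint ">=0.10.2,<=0.15.1,!=0.12.0" from pyproject.toml.
    Versions are compared as integer tuples; pre-release tags are ignored."""
    v = tuple(int(part) for part in version.split("."))
    return (0, 10, 2) <= v <= (0, 15, 1) and v != (0, 12, 0)

print(vllm_version_supported("0.14.1"))  # True
print(vllm_version_supported("0.12.0"))  # False — excluded for the regression
print(vllm_version_supported("0.16.0"))  # False — above the supported range
```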
2 changes: 1 addition & 1 deletion trinity/__init__.py
@@ -1,4 +1,4 @@
# -*- coding: utf-8 -*-
"""Trinity-RFT (Reinforcement Fine-Tuning)"""

__version__ = "0.5.0.dev0"
__version__ = "0.5.0"
5 changes: 4 additions & 1 deletion trinity/common/models/vllm_model.py
@@ -149,7 +149,10 @@ def __init__(

async def _initialize_tokenizer(self):
if self.tokenizer is None:
self.tokenizer = await self.async_llm.get_tokenizer()
if self.vllm_version >= parse_version("0.15.0"):
self.tokenizer = self.async_llm.get_tokenizer()
else:
self.tokenizer = await self.async_llm.get_tokenizer()
self.tokenizer.truncation_side = "left"

def _initialize_processor(self):
6 changes: 3 additions & 3 deletions trinity/common/models/vllm_patch/worker_patch.py
@@ -13,10 +13,10 @@
def patch_vllm_prompt_logprobs(model_runner: GPUModelRunner): # noqa: C901
"""Patch vLLM model runner to support prompt logprobs extraction."""
version = get_vllm_version()
if version < parse_version("0.10.2") or version > parse_version("0.14.1"):
if version < parse_version("0.10.2") or version > parse_version("0.15.1"):
raise ValueError(
f"Unsupported vllm version: {vllm.__version__}. "
"This patch requires vllm version >= 0.10.2, <= 0.14.1."
"This patch requires vllm version >= 0.10.2, <= 0.15.1."
)
is_v0102 = version == parse_version("0.10.2")

@@ -150,7 +150,7 @@ def _get_prompt_logprobs_dict_v12(

This is a monkey-patched version of `_get_prompt_logprobs_dict` from
`vllm.v1.worker.gpu_model_runner.GPUModelRunner` (vLLM versions
0.12.0 to 0.14.1).
0.12.0 to 0.15.1).

The original function does not apply temperature scaling to logits when
calculating prompt logprobs, which can lead to incorrect logprob values