agentscope-ai · pan-x-c · Aug 26, 2025 · Aug 26, 2025
diff --git a/README.md b/README.md
@@ -26,7 +26,7 @@
 * [2025-08] 🎵 We introduce [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord), a dynamic integration of SFT and RL for enhanced LLM fine-tuning ([paper](https://arxiv.org/pdf/2508.11408)).
 * [2025-08] ✨ Trinity-RFT v0.2.1 is released! Enhanced features include:
   * Agentic RL: support training with general multi-step agentic workflows; check out the [ALFWorld](./docs/sphinx_doc/source/tutorial/example_step_wise.md) and [ReAct](./docs/sphinx_doc/source/tutorial/example_react.md) examples.
-  * Rollout-Training scheduling: introduce Scheduler, [Synchronizer](./docs/sphinx_doc/source/tutorial/synchronizer.html) and priority queue buffer, which facilitates more efficient and dynamic scheduling of the RFT process.
+  * Rollout-Training scheduling: introduce Scheduler, [Synchronizer](./docs/sphinx_doc/source/tutorial/synchronizer.md) and priority queue buffer, which facilitates more efficient and dynamic scheduling of the RFT process.
   * [A benchmark tool](./benchmark) for quick verification and experimentation.
   * RL algorithms: implement [GSPO](https://github.com/modelscope/Trinity-RFT/pull/154), [AsymRE](https://github.com/modelscope/Trinity-RFT/pull/187), [TOPR, CISPO](https://github.com/modelscope/Trinity-RFT/pull/185), [RAFT](https://github.com/modelscope/Trinity-RFT/pull/174).
 * [2025-07] Trinity-RFT v0.2.0 is released.
@@ -362,8 +362,8 @@ Guidelines for developers and researchers:
 
 + [Build new RL scenarios](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#workflows-for-rl-environment-developers)
 + [Implement new RL algorithms](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#algorithms-for-rl-algorithm-developers)
-+ [Develop new data operators](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.html#operators-for-data-developers)
-+ [Understand the coordination between explorer and trainer](./docs/sphinx_doc/source/tutorial/synchronizer.html)
++ [Develop new data operators](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#operators-for-data-developers)
++ [Understand the coordination between explorer and trainer](./docs/sphinx_doc/source/tutorial/synchronizer.md)
 
 
 

diff --git a/README_zh.md b/README_zh.md
@@ -25,7 +25,7 @@
 * [2025-08] 🎵 我们推出了 [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord)，一种动态整合 SFT 和 RL 来微调 LLM 的方法（[论文](https://arxiv.org/pdf/2508.11408)）。
 * [2025-08] ✨ 发布 Trinity-RFT v0.2.1 版本！新增功能包括：
   * 智能体 RL：支持通用多轮工作流的训练；请参考 [ALFWorld](./docs/sphinx_doc/source/tutorial/example_step_wise.md) 和 [ReAct](./docs/sphinx_doc/source/tutorial/example_react.md) 例子。
-  * Rollout-Training 调度: 通过引入 Scheduler, [Synchronizer](./docs/sphinx_doc/source/tutorial/synchronizer.html) 以及优先队列类型 Buffer, 支持 RFT 流程中更高效与灵活的调度。
+  * Rollout-Training 调度: 通过引入 Scheduler, [Synchronizer](./docs/sphinx_doc/source/tutorial/synchronizer.md) 以及优先队列类型 Buffer, 支持 RFT 流程中更高效与灵活的调度。
   * [Benchmark 工具](./benchmark)，用于快速验证与实验。
   * RL 算法：实现 [GSPO](https://github.com/modelscope/Trinity-RFT/pull/154), [AsymRE](https://github.com/modelscope/Trinity-RFT/pull/187), [TOPR, CISPO](https://github.com/modelscope/Trinity-RFT/pull/185), [RAFT](https://github.com/modelscope/Trinity-RFT/pull/174) 等算法。
 * [2025-07] 发布 Trinity-RFT v0.2.0 版本，新增了多项功能优化。
@@ -361,8 +361,8 @@ trinity run --config examples/grpo_gsm8k/gsm8k.yaml
 
 + [构建新的 RL 场景](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#workflows-for-rl-environment-developers)
 + [实现新的 RL 算法](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#algorithms-for-rl-algorithm-developers)
-+ [开发新的数据处理操作](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.html#operators-for-data-developers)
-+ [理解 explorer-trainer 调度逻辑](./docs/sphinx_doc/source/tutorial/synchronizer.html)
++ [开发新的数据处理操作](./docs/sphinx_doc/source/tutorial/trinity_programming_guide.md#operators-for-data-developers)
++ [理解 explorer-trainer 调度逻辑](./docs/sphinx_doc/source/tutorial/synchronizer.md)
 
 
 

diff --git a/docs/sphinx_doc/source/main.md b/docs/sphinx_doc/source/main.md
@@ -10,9 +10,9 @@
 
 * [2025-08] 🎵 We introduce [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord), a dynamic integration of SFT and RL for enhanced LLM fine-tuning ([paper](https://arxiv.org/pdf/2508.11408)).
 * [2025-08] ✨ Trinity-RFT v0.2.1 is released! Enhanced features include:
-  * Agentic RL: support training with general multi-step agentic workflows; check out the [ALFWorld](./docs/sphinx_doc/source/tutorial/example_step_wise.md) and [ReAct](./docs/sphinx_doc/source/tutorial/example_react.md) examples.
-  * Rollout-Training scheduling: introduce Scheduler, [Synchronizer](./docs/sphinx_doc/source/tutorial/synchronizer.html) and priority queue buffer, which facilitates more efficient and dynamic scheduling of the RFT process.
-  * [A benchmark tool](./benchmark) for quick verification and experimentation.
+  * Agentic RL: support training with general multi-step agentic workflows; check out the [ALFWorld](/tutorial/example_step_wise.md) and [ReAct](/tutorial/example_react.md) examples.
+  * Rollout-Training scheduling: introduce Scheduler, [Synchronizer](/tutorial/synchronizer.md) and priority queue buffer, which facilitates more efficient and dynamic scheduling of the RFT process.
+  * [A benchmark tool](https://github.com/modelscope/Trinity-RFT/tree/main/benchmark) for quick verification and experimentation.
   * RL algorithms: implement [GSPO](https://github.com/modelscope/Trinity-RFT/pull/154), [AsymRE](https://github.com/modelscope/Trinity-RFT/pull/187), [TOPR, CISPO](https://github.com/modelscope/Trinity-RFT/pull/185), [RAFT](https://github.com/modelscope/Trinity-RFT/pull/174).
 * [2025-07] Trinity-RFT v0.2.0 is released.
 * [2025-07] We update the [technical report](https://arxiv.org/abs/2505.17826) (arXiv v2) with new features, examples, and experiments.
@@ -341,8 +341,8 @@ Guidelines for developers and researchers:
 
 + {ref}`Build new RL scenarios <Workflows>`
 + {ref}`Implement new RL algorithms <Algorithms>`
-+ [Develop new data operators](/tutorial/trinity_programming_guide.html#operators-for-data-developers)
-+ [Understand the coordination between explorer and trainer](/tutorial/synchronizer.html)
++ [Develop new data operators](/tutorial/trinity_programming_guide.md#operators-for-data-developers)
++ [Understand the coordination between explorer and trainer](/tutorial/synchronizer.md)
 
 
 For some frequently asked questions, see [FAQ](/tutorial/faq.md).
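
The hunks above replace local `.html` link targets with `.md` ones. A minimal sketch of how leftover `.html` targets could be found in a Markdown file is given below. The function name `find_local_html_links` and the regular expression are illustrative assumptions and are not part of this repository; the pattern only handles inline `[text](target)` links and skips absolute `http(s)` URLs.

```python
import re

# Matches inline Markdown links of the form [text](target).
# Assumption: targets contain no spaces or closing parentheses.
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_local_html_links(markdown_text):
    """Return (line_number, target) pairs for local links ending in .html.

    Hypothetical helper for illustration; not part of Trinity-RFT.
    """
    hits = []
    for line_no, line in enumerate(markdown_text.splitlines(), start=1):
        for match in LINK_RE.finditer(line):
            target = match.group(1)
            # External URLs may legitimately point at .html pages.
            if target.startswith(("http://", "https://")):
                continue
            # Ignore the #anchor part when checking the extension.
            path = target.split("#", 1)[0]
            if path.endswith(".html"):
                hits.append((line_no, target))
    return hits
```

Running this over `README.md`, `README_zh.md`, and `docs/sphinx_doc/source/main.md` after applying the patch should return an empty list for each file if no local `.html` targets remain.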