diff --git a/README.md b/README.md
index 2d1404bdb3..2c70a709db 100644
--- a/README.md
+++ b/README.md
@@ -83,7 +83,7 @@ Trinity-RFT is a flexible, general-purpose framework for reinforcement fine-tuni
 ## 🚀 News
 
 * [2025-11] Introducing [BOTS](https://github.com/modelscope/Trinity-RFT/tree/main/examples/bots): online RL task selection for efficient LLM fine-tuning ([paper](https://arxiv.org/pdf/2510.26374)).
-* [2025-10] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.2)] Trinity-RFT v0.3.2 released: bug fixes and advanced task selection & scheduling.
+* [2025-11] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.2)] Trinity-RFT v0.3.2 released: bug fixes and advanced task selection & scheduling.
 * [2025-10] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.1)] Trinity-RFT v0.3.1 released: multi-stage training support, improved agentic RL examples, LoRA support, debug mode and new RL algorithms.
 * [2025-09] [[Release Notes](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.0)] Trinity-RFT v0.3.0 released: enhanced Buffer, FSDP2 & Megatron support, multi-modal models, and new RL algorithms/examples.
 * [2025-08] Introducing [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord): dynamic SFT + RL integration for advanced LLM fine-tuning ([paper](https://arxiv.org/pdf/2508.11408)).
diff --git a/README_zh.md b/README_zh.md
index e8700d83ea..2a6c8e03d7 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -83,8 +83,8 @@ Trinity-RFT 是一个灵活、通用的大语言模型(LLM)强化微调(RF
 
 ## 🚀 新闻
 
-* [2025-11] 推出 [BOTS](https://github.com/modelscope/Trinity-RFT/tree/main/examples/bots):在线RL任务选择,实现高效LLM微调([论文](https://arxiv.org/pdf/2510.26374))。
-* [2025-10] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.2)] Trinity-RFT v0.3.2 发布:修复若干 Bug 并支持进阶的任务选择和调度。
+* [2025-11] 推出 [BOTS](https://github.com/modelscope/Trinity-RFT/tree/main/examples/bots):在线 RL 任务选择,实现高效 LLM 微调([论文](https://arxiv.org/pdf/2510.26374))。
+* [2025-11] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.2)] Trinity-RFT v0.3.2 发布:修复若干 Bug 并支持进阶的任务选择和调度。
 * [2025-10] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.1)] Trinity-RFT v0.3.1 发布:多阶段训练支持、改进的智能体 RL 示例、LoRA 支持、调试模式和全新 RL 算法。
 * [2025-09] [[发布说明](https://github.com/modelscope/Trinity-RFT/releases/tag/v0.3.0)] Trinity-RFT v0.3.0 发布:增强的 Buffer、FSDP2 & Megatron 支持,多模态模型,以及全新 RL 算法/示例。
 * [2025-08] 推出 [CHORD](https://github.com/modelscope/Trinity-RFT/tree/main/examples/mix_chord):动态 SFT + RL 集成,实现进阶 LLM 微调([论文](https://arxiv.org/pdf/2508.11408))。
diff --git a/examples/bots/workflow/bots_math_boxed_workflow.py b/examples/bots/workflow/bots_math_boxed_workflow.py
index 1596aa87f6..d94d338357 100644
--- a/examples/bots/workflow/bots_math_boxed_workflow.py
+++ b/examples/bots/workflow/bots_math_boxed_workflow.py
@@ -19,12 +19,12 @@ def format_messages(self):
         return self.task_desc
 
     @property
-    def task_desc(self) -> Union[str, None]:
+    def task_desc(self) -> Union[str, None]:  # type: ignore [override]
         prompt_key = self.format_args.prompt_key
         return nested_query(prompt_key, self.raw_task)  # type: ignore
 
     @property
-    def truth(self) -> Union[str, None]:
+    def truth(self) -> Union[str, None]:  # type: ignore [override]
         response_key = self.format_args.response_key
         return nested_query(response_key, self.raw_task)
 
diff --git a/trinity/common/workflows/workflow.py b/trinity/common/workflows/workflow.py
index 38881f22a7..8a493e161f 100644
--- a/trinity/common/workflows/workflow.py
+++ b/trinity/common/workflows/workflow.py
@@ -60,6 +60,18 @@ def to_workflow(
             auxiliary_models=auxiliary_models,
         )
 
+    # Deprecated property, will be removed in the future
+    @property
+    def task_desc(self) -> Union[str, None]:
+        prompt_key = self.format_args.prompt_key
+        return self.raw_task[prompt_key] if prompt_key in self.raw_task else None  # type: ignore
+
+    # Deprecated property, will be removed in the future
+    @property
+    def truth(self) -> Union[str, None]:
+        response_key = self.format_args.response_key
+        return self.raw_task[response_key] if response_key in self.raw_task else None  # type: ignore
+
     def to_dict(self) -> dict:
         return self.raw_task  # type: ignore
 
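
For readers skimming the patch, a small, self-contained sketch of the behaviour it separates: the deprecated `task_desc`/`truth` properties added to the base class in `trinity/common/workflows/workflow.py` fall back to a flat `raw_task[key]` lookup, while `bots_math_boxed_workflow.py` overrides them with `nested_query` so the lookup can reach into nested task dicts. Everything named below (`SimpleFormatArgs`, `FlatTask`, `NestedTask`, the toy `nested_query` and its dotted-key semantics) is a hypothetical stand-in for illustration, not Trinity-RFT's actual API.

```python
# Standalone sketch (NOT part of the patch above); hypothetical stand-ins only.
from dataclasses import dataclass, field
from typing import Any, Dict, Union


def nested_query(key: str, data: Dict[str, Any]) -> Union[str, None]:
    """Resolve a dotted key such as 'question.text' inside a nested dict (assumed semantics)."""
    node: Any = data
    for part in key.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


@dataclass
class SimpleFormatArgs:
    prompt_key: str = "question"
    response_key: str = "answer"


@dataclass
class FlatTask:
    """Mirrors the deprecated base-class behaviour: plain `raw_task[key]` lookup."""

    raw_task: Dict[str, Any]
    format_args: SimpleFormatArgs = field(default_factory=SimpleFormatArgs)

    @property
    def task_desc(self) -> Union[str, None]:
        prompt_key = self.format_args.prompt_key
        return self.raw_task[prompt_key] if prompt_key in self.raw_task else None

    @property
    def truth(self) -> Union[str, None]:
        response_key = self.format_args.response_key
        return self.raw_task[response_key] if response_key in self.raw_task else None


class NestedTask(FlatTask):
    """Mirrors the BOTS override: keys resolved via the nested lookup."""

    @property
    def task_desc(self) -> Union[str, None]:
        return nested_query(self.format_args.prompt_key, self.raw_task)

    @property
    def truth(self) -> Union[str, None]:
        return nested_query(self.format_args.response_key, self.raw_task)


if __name__ == "__main__":
    sample = {"question": {"text": "1 + 1 = ?"}, "answer": "2"}
    args = SimpleFormatArgs(prompt_key="question.text")
    print(FlatTask(sample, args).task_desc)    # None: flat lookup misses dotted keys
    print(NestedTask(sample, args).task_desc)  # '1 + 1 = ?'
    print(NestedTask(sample, args).truth)      # '2' (response_key defaults to 'answer')
```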