Sync the config manager with the latest codebase by hiyuchang · Pull Request #332 · agentscope-ai/Trinity-RFT

hiyuchang · 2025-10-21T09:43:33Z

Description

Add sync_style
Fix special cases for SFT
Ensure default_value changes with algorithm_type
Add actor_ppo_max_token_len_per_gpu in expert mode
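A minimal sketch of how default values might follow `algorithm_type`, in plain Python rather than the actual config manager UI code. The `ConfigState` class, the `ALGORITHM_DEFAULTS` table, and the parameter names and values in it are hypothetical and not taken from Trinity-RFT. Keeping values the user has explicitly set is also an assumption about the desired behavior.

```python
# Hypothetical per-algorithm defaults; names and values are illustrative only.
ALGORITHM_DEFAULTS = {
    "grpo": {"repeat_times": 8, "kl_coef": 0.001},
    "sft": {"repeat_times": 1, "kl_coef": 0.0},
}


class ConfigState:
    """Holds config values and refreshes defaults when the algorithm changes."""

    def __init__(self, algorithm_type="grpo"):
        self.values = {"algorithm_type": algorithm_type}
        self.user_set = set()  # keys the user has explicitly overridden
        self._apply_defaults()

    def _apply_defaults(self):
        defaults = ALGORITHM_DEFAULTS[self.values["algorithm_type"]]
        for key, default in defaults.items():
            # Only overwrite values the user has not touched.
            if key not in self.user_set:
                self.values[key] = default

    def set(self, key, value):
        """Record an explicit user choice for a parameter."""
        self.values[key] = value
        self.user_set.add(key)

    def set_algorithm_type(self, algorithm_type):
        """Acts like an on_change callback for the algorithm selector."""
        if algorithm_type not in ALGORITHM_DEFAULTS:
            raise ValueError(f"unknown algorithm_type: {algorithm_type}")
        self.values["algorithm_type"] = algorithm_type
        self._apply_defaults()


state = ConfigState("grpo")
state.set("kl_coef", 0.01)        # explicit user override
state.set_algorithm_type("sft")   # untouched defaults follow the new algorithm
```

In this sketch, switching to `sft` updates `repeat_times` to the new default while the user-chosen `kl_coef` is left alone.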

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

gemini-code-assist · 2025-10-21T09:44:04Z

Summary of Changes

Hello @hiyuchang, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on a comprehensive update to the configuration manager, aiming to align it with the latest codebase and enhance its functionality. The changes introduce a new synchronization style, significantly improve support for the SFT algorithm with dedicated dataset configurations, and streamline the overall configuration process. It also refines how various algorithm parameters are managed, ensuring dynamic updates and clearer help messages, while simultaneously cleaning up obsolete settings to simplify the user experience.

Highlights

New Sync Style Configuration: Introduced a new sync_style configuration option for the synchronizer, allowing for more flexible synchronization methods between the explorer and trainer.
Enhanced SFT Algorithm Support: Improved support for the Supervised Fine-Tuning (SFT) algorithm by adding dedicated dataset configurations (sft_dataset_kwargs) and updating conditional logic across various configuration managers to properly handle SFT-specific settings.
Dynamic Configuration Updates: Implemented on_change callbacks for several algorithm-related functions (e.g., sample_strategy, advantage_fn, kl_loss_fn) to dynamically update their associated parameters when the primary function changes, improving user experience and configuration consistency.
Flexible Training Duration Control: Added total_steps as an alternative configuration for controlling training duration, which will take precedence over total_epochs if set.
Actor Model Token Length Control: Integrated actor_ppo_max_token_len_per_gpu into the expert mode configurations, providing finer control over the maximum token length per GPU for the actor model.
Configuration Cleanup and Streamlining: Removed several deprecated or redundant configuration parameters such as engine_type, use_v1, max_timeout, explorer_max_retry_times, and total_training_steps, simplifying the configuration interface.
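The precedence rule for training duration described in the highlights can be sketched as follows. This is an illustration only: the function name `resolve_training_steps` and the `steps_per_epoch` argument are hypothetical, and the real implementation in Trinity-RFT may compute the step count differently.

```python
from typing import Optional


def resolve_training_steps(
    total_epochs: int,
    steps_per_epoch: int,
    total_steps: Optional[int] = None,
) -> int:
    """Return the number of training steps to run.

    If total_steps is set, it takes precedence over total_epochs.
    """
    if total_steps is not None:
        if total_steps < 0:
            raise ValueError("total_steps must be non-negative")
        return total_steps
    # Fall back to the epoch-based duration.
    return total_epochs * steps_per_epoch


# total_steps unset: duration is derived from total_epochs.
epoch_based = resolve_training_steps(total_epochs=2, steps_per_epoch=100)

# total_steps set: it overrides total_epochs.
step_based = resolve_training_steps(total_epochs=2, steps_per_epoch=100, total_steps=50)
```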

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	`@gemini-code-assist`	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request effectively synchronizes the configuration manager with the latest codebase, introducing support for the SFT algorithm, adding new configuration options like sync_style, and refactoring various parts of the UI logic. The changes are well-structured and align with the goal of supporting new features and improving user experience.

My review focuses on improving maintainability by reducing code duplication. I've identified a few areas where similar logic is repeated across different functions or conditional blocks. By extracting this repeated logic into helper functions or using more generalized approaches, the code can be made more concise and easier to maintain in the future. I've provided specific suggestions for these refactorings.

trinity/manager/config_manager.py

trinity/manager/config_registry/algorithm_config_manager.py

hiyuchang · 2025-10-21T11:22:37Z

/unittest-module-common

yanxi-chen

Refine some help messages

trinity/manager/config_registry/trainer_config_manager.py

trinity/manager/config_registry/explorer_config_manager.py

trinity/manager/config_registry/buffer_config_manager.py

github-actions · 2025-10-21T12:28:29Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
31	31	0	0	0	0	313ms

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	33ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	1ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	3ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	55ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	35ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	45ms
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	21ms
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	20ms
tests/common/vllm_test.py::TestAPIServer::test_api	✅	24ms
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	24ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	1ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	22ms
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	20ms

Github Test Reporter by CTRF 💚

hiyuchang · 2025-10-22T07:00:11Z

/unittest-module-common

github-actions · 2025-10-22T07:08:10Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
31	31	0	0	0	0	337ms

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	33ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	1ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	22ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	57ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	36ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	45ms
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	21ms
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	21ms
tests/common/vllm_test.py::TestAPIServer::test_api	✅	24ms
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	24ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	1ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	22ms
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	20ms

hiyuchang · 2025-10-23T07:14:13Z

/unittest-module-common

github-actions · 2025-10-23T07:20:55Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
33	33	0	0	0	0	331ms

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	32ms
tests/common/config_test.py::TestConfig::test_config_flatten	✅	1ms
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	1ms
tests/common/config_test.py::TestConfig::test_default_workflow	✅	1ms
tests/common/config_test.py::TestConfig::test_load_default_config	✅	14ms
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	1ms
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	1ms
tests/common/experience_test.py::TestEID::test_eid_properties	✅	1ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	1ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	1ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_gather	✅	1ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	1ms
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1ms
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	1ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	1ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	1ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	57ms
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	35ms
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	48ms
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	21ms
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	21ms
tests/common/vllm_test.py::TestAPIServer::test_api	✅	24ms
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	24ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	1ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	1ms
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	23ms
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	21ms

hiyuchang added 5 commits October 21, 2025 14:26

fix sync_style

7c65571

fix sft and dpo

3b96784

Merge branch 'main' into dev/upd_config_manager

9d8eaa0

fix actor_ppo_max_token_len_per_gpu

096b26d

remove typo

df5e1c3

hiyuchang requested a review from chenyushuo October 21, 2025 09:43

gemini-code-assist bot reviewed Oct 21, 2025

trinity/manager/config_manager.py (resolved)

trinity/manager/config_registry/algorithm_config_manager.py (resolved)

simplify duplicaed code

2b5d77f

yanxi-chen reviewed Oct 21, 2025

trinity/manager/config_registry/trainer_config_manager.py (outdated, resolved)

trinity/manager/config_registry/explorer_config_manager.py (outdated, resolved)

trinity/manager/config_registry/buffer_config_manager.py (outdated, resolved)

hiyuchang added 5 commits October 22, 2025 09:36

avoid repeatedly load plugins

efb8960

Update beginner mode

116442c

optimize beginner mode

95cb836

fix some comments

e89c493

refine some help messages

63d74fd

hiyuchang added 8 commits October 22, 2025 19:28

Merge branch 'main' into dev/upd_config_manager

80e223a

fix typo

5542506

change default algorithm to grpo

50f63a8

refine some words

8585c16

remove adv and policy loss

617da57

fix typo and upper case

0080055

fix header

530ae0b

fix sft

8d272f9

pan-x-c approved these changes Oct 23, 2025

pan-x-c merged commit 9f1719e into agentscope-ai:main Oct 23, 2025
2 checks passed

pan-x-c merged 19 commits into agentscope-ai:main from hiyuchang:dev/upd_config_manager

3 participants
