Support Multi-turn SFT with tools #245
Conversation
Summary of Changes
Hello @pan-x-c, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the training capabilities for conversational AI models by introducing robust support for multi-turn Supervised Fine-Tuning (SFT) that can also incorporate tool usage. The changes allow for more accurate and efficient training on complex dialogue flows, including those where the model interacts with external tools. This improvement is crucial for developing more sophisticated and context-aware conversational agents.
Highlights
- Multi-turn SFT Support: Introduced support for concatenated multi-turn Supervised Fine-Tuning (SFT), allowing models to be trained on longer, more complex conversational sequences.
- Tool Integration: Enabled the use of tools within the concatenated multi-turn SFT process, enhancing the model's ability to handle function calling and tool-augmented conversations during training.
- Flexible Chat Template Configuration: Added new configuration options to specify whether multi-turn SFT is enabled and to provide a custom chat template, offering greater control over how messages are formatted for training.
- Improved Token Masking Logic: Refactored the token masking mechanism to dynamically select the appropriate method (Hugging Face's `return_assistant_tokens_mask` or a default implementation) based on the chat template, ensuring correct action mask generation for various conversational formats, including those with tool calls (see the sketch after this list).
- Enhanced Testing: Expanded test coverage for the SFT formatter and action mask generation to validate the new multi-turn and tool integration functionalities.
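For context, here is a minimal sketch of how the Hugging Face `return_assistant_tokens_mask` path can produce an action mask. This is not the PR's code; it assumes a chat template that wraps assistant turns in `{% generation %}`/`{% endgeneration %}` markers, which is what `transformers` needs to compute the mask (without them it falls back to an all-zero mask with a warning).

```python
from transformers import AutoTokenizer

# Placeholder model; any tokenizer works because we pass our own template below.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# A deliberately tiny template whose {% generation %} markers tell
# transformers which token spans the assistant produced.
template = (
    "{% for m in messages %}"
    "<|{{ m['role'] }}|>\n"
    "{% if m['role'] == 'assistant' %}{% generation %}{{ m['content'] }}{% endgeneration %}"
    "{% else %}{{ m['content'] }}{% endif %}\n"
    "{% endfor %}"
)

messages = [
    {"role": "user", "content": "What's 2 + 2?"},
    {"role": "assistant", "content": "4."},
    {"role": "user", "content": "And 3 + 3?"},
    {"role": "assistant", "content": "6."},
]

encoded = tokenizer.apply_chat_template(
    messages,
    chat_template=template,
    tokenize=True,
    return_dict=True,
    return_assistant_tokens_mask=True,
)
input_ids = encoded["input_ids"]
# 1 on assistant-generated tokens, 0 on user/prompt tokens -- exactly the
# shape an action mask needs for a concatenated multi-turn SFT sample.
action_mask = encoded["assistant_masks"]
```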
Code Review
This pull request introduces support for multi-turn Supervised Fine-Tuning (SFT) with tools. The changes primarily involve correctly handling `action_mask` for multi-turn conversations and ensuring that `tools` and `chat_template` configurations are propagated and used consistently throughout the data formatting and model interaction logic. The implementation looks solid, with new utility functions to determine the correct masking method and refactoring to centralize this logic. I've identified one potential bug regarding inconsistent `chat_template` usage and suggested refactoring duplicated test code for better maintainability.
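The "utility functions to determine the correct masking method" presumably refers to `get_action_mask_method` in `trinity/common/models/utils.py` (named in the Copilot summary below). As a rough, hedged sketch only (the real signature, detection logic, and helper names in the PR may differ), such a selector could look like this — the `{% generation %}` check is an assumption based on how `return_assistant_tokens_mask` works:

```python
from typing import Callable, Optional

def _hf_assistant_mask(tokenizer, messages, chat_template):
    """Use transformers' built-in assistant-token masking (sketch)."""
    out = tokenizer.apply_chat_template(
        messages,
        chat_template=chat_template,
        tokenize=True,
        return_dict=True,
        return_assistant_tokens_mask=True,
    )
    return out["input_ids"], out["assistant_masks"]

def _default_mask(tokenizer, messages, chat_template):
    """Fallback sketch: mask everything up to the final assistant turn."""
    prompt_ids = tokenizer.apply_chat_template(
        messages[:-1], chat_template=chat_template, add_generation_prompt=True
    )
    full_ids = tokenizer.apply_chat_template(messages, chat_template=chat_template)
    # Assumes the prompt tokenization is a prefix of the full tokenization.
    mask = [0] * len(prompt_ids) + [1] * (len(full_ids) - len(prompt_ids))
    return full_ids, mask

def get_action_mask_method(chat_template: Optional[str]) -> Callable:
    """Pick a masking implementation based on what the template supports."""
    if chat_template and "{% generation %}" in chat_template:
        return _hf_assistant_mask
    return _default_mask
```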
/unittest-all

GitHub Test Reporter by CTRF 💚

/unittest-module-trainer
Pull Request Overview
This PR implements support for multi-turn supervised fine-tuning (SFT) with tool use. The main purpose is to enhance the system's ability to handle concatenated multi-turn conversations that include tool usage.
- Added concatenated multi-turn SFT support through new configuration options and processing logic (see the config sketch after this list)
- Integrated tools support across the tokenization and formatting pipeline
- Refactored action mask method selection into a centralized utility function
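As an illustration only — the field names below are inferred from the PR summary, not read from `trinity/common/config.py` — the new `FormatConfig` options might look roughly like:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FormatConfigSketch:
    """Hypothetical subset of FormatConfig; field names are guesses, not the PR's API."""
    enable_concatenated_multi_turn: bool = False  # train on whole conversations, not single turns
    chat_template: Optional[str] = None  # custom template, e.g. one with {% generation %} tags
    tools_key: str = "tools"  # dataset column holding per-sample tool schemas
```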
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Summary per file:
| File | Description |
|---|---|
| trinity/common/models/vllm_model.py | Updated to use centralized action mask method and added tools parameter support |
| trinity/common/models/utils.py | Added tools parameter to tokenization functions and created get_action_mask_method utility |
| trinity/common/config.py | Added multi-turn and chat template configuration options to FormatConfig |
| trinity/buffer/schema/formatter.py | Enhanced SFT formatter with multi-turn support and tools integration |
| trinity/buffer/reader/file_reader.py | Removed unused RawDataReader class |
| tests/tools.py | Updated test configuration to enable concatenated multi-turn |
| tests/common/vllm_test.py | Added comprehensive tests for action masking with tools and updated chat template |
| tests/buffer/formatter_test.py | Enhanced formatter tests to cover multi-turn scenarios |
| docs/sphinx_doc/source/tutorial/trinity_configs.md | Updated documentation with new configuration options |
| docs/sphinx_doc/source/tutorial/example_dpo.md | Enhanced DPO tutorial with clearer examples and configuration explanations |
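For readers unfamiliar with tool-aware tokenization, here is a small, hedged example of what the `tools` parameter noted in the table above enables. The `tools=` argument of `apply_chat_template` is standard in recent `transformers` releases, though the exact tool-call message schema varies by chat template; the model name and conversation below are placeholders, not taken from the PR.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")  # placeholder model

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to look up.
    """
    return "sunny"

messages = [
    {"role": "user", "content": "Weather in Paris?"},
    {"role": "assistant", "tool_calls": [{
        "type": "function",
        "function": {"name": "get_weather", "arguments": {"city": "Paris"}},
    }]},
    {"role": "tool", "name": "get_weather", "content": "sunny"},
    {"role": "assistant", "content": "It's sunny in Paris."},
]

# transformers converts the function's signature and docstring into a JSON
# schema and renders it into the prompt via the model's chat template.
token_ids = tokenizer.apply_chat_template(messages, tools=[get_weather], tokenize=True)
```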
/unittest-module-trainer

GitHub Test Reporter by CTRF 💚

/unittest-module-common

GitHub Test Reporter by CTRF 💚
Description
Checklist
Please check the following items before the code is ready to be reviewed.