Refactor Launcher with Typer by pan-x-c · Pull Request #502 · agentscope-ai/Trinity-RFT

pan-x-c · 2026-02-09T05:46:52Z

Description

Replace argparse with typer for easier maintenance.
Optimize imports to speed up the response time of the trinity CLI (by up to 50%).
Add more logging to the workflow runner.
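
Speeding up a CLI by optimizing imports usually means deferring expensive imports until the command that needs them runs. The following is a minimal sketch of that pattern only; `light_command` and `heavy_command` are hypothetical names, and `json` merely stands in for a genuinely heavy dependency. It is not taken from the Trinity-RFT code.

```python
def light_command() -> str:
    """A command with no heavy dependency, so it can respond immediately."""
    return "ok"


def heavy_command(payload: dict) -> str:
    """A command that pays the import cost only when it is actually invoked."""
    # Deferred import: `json` is a stand-in for a slow-to-import library.
    import json

    return json.dumps(payload, sort_keys=True)


if __name__ == "__main__":
    print(light_command())
    print(heavy_command({"b": 1, "a": 2}))
```

To see which modules dominate startup time, CPython's `python -X importtime` option prints a per-module import timing breakdown.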

Checklist

Please check the following items before code is ready to be reviewed.

Code has passed all tests
Docstrings have been added/updated in Google Style
Documentation has been updated
Code is ready for review

gemini-code-assist · 2026-02-09T05:48:10Z

Summary of Changes

Hello @pan-x-c, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refactors the Trinity CLI to use typer instead of argparse. This change modernizes the CLI definition, improves testability by using TyperCliRunner, and enhances the overall structure of the command-line interface. The update includes modifications to the pyproject.toml file to add typer as a dependency, updates to the test suite to reflect the new CLI structure, and refactoring of the main launcher script to use typer decorators.

Highlights

Typer Integration: Replaces argparse with typer for defining command-line interfaces, modernizing the CLI structure.
Dependency Update: Adds typer>=0.9.0 as a project dependency in pyproject.toml.
Test Refactoring: Updates tests in tests/cli/launcher_test.py to use TyperCliRunner instead of mocking argparse.ArgumentParser.parse_args, improving test accuracy and reducing reliance on mocks.
Viewer Update: Adds a run_viewer method to trinity/buffer/viewer.py to start the Streamlit viewer programmatically.
CLI Command Restructuring: Refactors trinity/cli/launcher.py to use typer decorators for defining CLI commands, enhancing code readability and maintainability.
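
As a rough illustration of the decorator-based style and the runner-based testing described in the highlights, here is a small sketch. The `run` and `studio` commands and their options are hypothetical and do not reproduce the real trinity CLI; it also assumes that `TyperCliRunner` in the PR refers to `typer.testing.CliRunner`, which the summary does not state explicitly.

```python
import typer
from typer.testing import CliRunner

# Hypothetical stand-in app; the real launcher defines its own commands.
app = typer.Typer(help="Demo CLI with two subcommands.")


@app.command()
def run(config: str = typer.Option(..., "--config", help="Path to a config file.")) -> None:
    """Pretend to launch a run from a config file."""
    typer.echo(f"running with {config}")


@app.command()
def studio(port: int = typer.Option(8501, "--port", help="Port for the UI.")) -> None:
    """Pretend to start a web UI on the given port."""
    typer.echo(f"studio on port {port}")


# Invoke the CLI in-process instead of mocking argument parsing.
runner = CliRunner()
run_result = runner.invoke(app, ["run", "--config", "demo.yaml"])
studio_result = runner.invoke(app, ["studio"])

if __name__ == "__main__":
    print(run_result.output, end="")
    print(studio_result.output, end="")
```

Because the runner goes through the real option parsing, a test like this also catches renamed or mistyped options, which a mocked `parse_args` would not.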

Changelog

pyproject.toml
- Added typer>=0.9.0 as a dependency.
tests/cli/launcher_test.py
- Replaced argparse mocking with TyperCliRunner for testing CLI commands.
- Adjusted test assertions to align with typer's output.
trinity/buffer/viewer.py
- Added run_viewer method to programmatically start the Streamlit viewer.
trinity/cli/launcher.py
- Replaced argparse with typer for defining CLI commands.
- Refactored command definitions using typer decorators.
- Removed direct calls to streamlit.cli in favor of ConfigManager.run.
trinity/common/experience.py
- Minor: moved Tensor import to the top.
trinity/common/models/model.py
- Minor: moved openai import inside the function to avoid circular dependency.
trinity/explorer/workflow_runner.py
- Added debug logging.
trinity/manager/config_manager.py
- Added static method run to start the Streamlit app.
trinity/manager/synchronizer.py
- Minor: moved imports inside functions to avoid circular dependency.
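
The changelog mentions added debug logging in the workflow runner without showing it. A minimal sketch of what task-level debug logging can look like with the standard `logging` module follows; the logger name, `run_task`, and the message wording are all hypothetical.

```python
import logging

# Hypothetical logger name; the project may configure its own logger differently.
logger = logging.getLogger("workflow_runner_demo")


def run_task(task_id: str, payload: dict) -> dict:
    """Run a placeholder task, logging its input size and outcome at DEBUG level."""
    logger.debug("starting task %s with %d field(s)", task_id, len(payload))
    result = {"task_id": task_id, "ok": True}
    logger.debug("finished task %s: ok=%s", task_id, result["ok"])
    return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    run_task("demo-1", {"question": "2+2"})
```

Passing arguments to `logger.debug` rather than pre-formatting the string keeps the cost low when DEBUG output is disabled.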

Activity

The pull request introduces typer and removes argparse to modernize the CLI.
Tests have been updated to use TyperCliRunner for better integration testing.
The debug command has been refactored to use typer for argument parsing.
The studio command now uses ConfigManager.run to start the Streamlit app.
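
One way a static `run` method can start a Streamlit app is by launching the `streamlit run` command as a child process. The sketch below uses that approach purely for illustration; the thread does not show how `ConfigManager.run` is implemented, and `StudioLauncher`, `build_command`, and the default port are hypothetical.

```python
import subprocess
import sys
from typing import List


class StudioLauncher:
    """Hypothetical stand-in for a manager class exposing a static run()."""

    @staticmethod
    def build_command(script_path: str, port: int = 8501) -> List[str]:
        """Build the argv for `python -m streamlit run <script> --server.port <port>`."""
        return [
            sys.executable, "-m", "streamlit", "run", script_path,
            "--server.port", str(port),
        ]

    @staticmethod
    def run(script_path: str, port: int = 8501) -> int:
        """Start Streamlit as a child process and return its exit code."""
        return subprocess.run(StudioLauncher.build_command(script_path, port)).returncode


if __name__ == "__main__":
    # Only print the command here; calling run() requires Streamlit to be installed.
    print(StudioLauncher.build_command("app.py"))
```

Separating command construction from execution keeps the launcher easy to test without actually starting a server.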

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

This pull request refactors the command-line interface from argparse to typer, which significantly improves the code's readability and maintainability. The changes are well-executed, including updating tests to use typer's testing utilities and introducing lazy imports to improve CLI startup performance. I've found a couple of minor areas for improvement regarding parameter naming for better clarity. Overall, this is a great enhancement.

trinity/cli/launcher.py

pan-x-c · 2026-02-09T05:58:07Z

/unittest-module-common

github-actions · 2026-02-09T06:11:42Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
55	54	0	1	0	0	11m 13s

Skipped

Tests	Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	skipped ⏭️

Tests

Test Name	Status	Duration
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	9h 21m
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	5m 2s
tests/common/config_test.py::TestConfig::test_config_flatten	✅	32.7s
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	6m 25s
tests/common/config_test.py::TestConfig::test_default_workflow	✅	5m 1s
tests/common/config_test.py::TestConfig::test_load_default_config	✅	1h 59m
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	5m 7s
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	5m 20s
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	30m 38s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	522ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	485ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	337ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	410ms
tests/common/experience_test.py::TestExperience::test_gather	✅	982ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	571ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	15.3s
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	352ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1.9s
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	362ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	338ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	715ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	580ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	803ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	494ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	596ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution	✅	781ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes	✅	683ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled	✅	673ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board	✅	254ms
tests/common/sudoku_test.py::test_judge_detects_row_violation	✅	219ms
tests/common/sudoku_test.py::test_judge_detects_column_violation	✅	217ms
tests/common/sudoku_test.py::test_judge_detects_block_violation	✅	218ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution	✅	267ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled	✅	246ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation	✅	208ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation	✅	224ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	15h 54m
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	10h 46m
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	13h 9m
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	7h 34m
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	9h 1m
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	7h 39m
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	7h 35m
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation	✅	7h 28m
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status	✅	7h 28m
tests/common/vllm_test.py::TestAPIServer::test_api	✅	8h 19m
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	7h 15m
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	8h 3m
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	⏭️	742ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	3m 55s
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	3m 54s
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	8h 52m
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	8h 39m
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	34h 39m
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api	✅	11h 4m

Github Test Reporter by CTRF 💚
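
Several per-test durations in the table above are much larger than the 11m 13s reported for the whole run (one test is listed at 34h 39m), so the two columns cannot share the same scale as written. A small sketch for checking this kind of inconsistency is shown below. `parse_duration` is a hypothetical helper, and the idea that the per-test values are inflated by a fixed factor such as 1000 is an unconfirmed guess, not something stated in the thread.

```python
import re

# Seconds per unit, for the duration tokens used in the report.
_UNITS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_duration(text: str) -> float:
    """Convert a report duration such as '9h 21m' or '522ms' to seconds."""
    total = 0.0
    # "ms" is listed first so that '522ms' is not read as minutes.
    for number, unit in re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", text):
        total += float(number) * _UNITS[unit]
    return total


run_total = parse_duration("11m 13s")      # total from the summary row
longest_test = parse_duration("34h 39m")   # largest single entry in the table

if __name__ == "__main__":
    print(f"run total: {run_total:.0f}s, longest listed test: {longest_test:.0f}s")
    # Under the unconfirmed factor-of-1000 guess, the longest test would be:
    print(f"longest test / 1000: {longest_test / 1000:.2f}s")
```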

pan-x-c · 2026-02-09T11:06:50Z

/unittest-diff

github-actions · 2026-02-09T11:58:40Z

Summary

Tests 📝	Passed ✅	Failed ❌	Skipped ⏭️	Other ❓	Flaky 🍂	Duration ⏱️
196	194	0	2	0	0	49m 30s

Skipped

Tests	Status
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	skipped ⏭️
tests/utils/swanlab_test.py::TestSwanlabMonitor::test_swanlab_monitor_smoke	skipped ⏭️

Tests

Test Name	Status	Duration
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_experience_pipeline	✅	3h 3m
tests/buffer/experience_pipeline_test.py::TestExperiencePipeline::test_pass_rate_calculation	✅	1h 43m
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_experience_buffer	✅	42m 51s
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_0_sft	✅	1h 11m
tests/buffer/experience_storage_test.py::ExperienceStorageTest::test_sql_storage_1_dpo	✅	1h 19m
tests/buffer/file_test.py::TestFileBuffer::test_file_reader	✅	6m 29s
tests/buffer/file_test.py::TestFileBuffer::test_file_writer	✅	28m 20s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_messages_formatter	✅	8m 22s
tests/buffer/formatter_test.py::TestFormatter::test_dpo_plaintext_formatter	✅	8m 3s
tests/buffer/formatter_test.py::TestFormatter::test_multi_modal_sft_formatter	✅	14m 22s
tests/buffer/formatter_test.py::TestFormatter::test_sft_messages_formatter	✅	17m 29s
tests/buffer/formatter_test.py::TestFormatter::test_sft_plaintext_formatter	✅	12m 22s
tests/buffer/formatter_test.py::TestFormatter::test_task_formatter	✅	3m 51s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_buffer_reuse	✅	1h 49m
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_capacity	✅	37m 48s
tests/buffer/queue_test.py::TestQueueBuffer::test_priority_queue_reuse_count_control	✅	1h 7m
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_0_queue	✅	51m 7s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_1_priority_queue	✅	50m 50s
tests/buffer/queue_test.py::TestQueueBuffer::test_queue_buffer_capacity	✅	59m 17s
tests/buffer/reader_test.py::TestBufferReader::test_buffer_reader_registration	✅	12m 50s
tests/buffer/reward_shaping_mapper_test.py::TestRewardShapingMapper::test_basic_usage	✅	6.7s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_default_sample_strategy	✅	36m 40s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_default_queue_staleness_control_sample_strategy	✅	33m 19s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_default_sample_strategy	✅	30m 1s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_priority_queue_staleness_control_sample_strategy	✅	29m 42s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_0::test_sql_staleness_control_sample_strategy	✅	1h 15m
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_default_sample_strategy	✅	38m 11s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_default_queue_staleness_control_sample_strategy	✅	29m 52s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_default_sample_strategy	✅	29m 29s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_priority_queue_staleness_control_sample_strategy	✅	33m 15s
tests/buffer/sample_strategy_test.py::ExperienceStorageTest_1::test_sql_staleness_control_sample_strategy	✅	59m 29s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_0	✅	1h 32m
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_exp_buffer_read_write_1	✅	34m 44s
tests/buffer/sql_test.py::TestSQLBuffer::test_sql_task_buffer_read_write	✅	42m
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_0	✅	5m 16s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_1	✅	4m 50s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_2	✅	5m 21s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_3	✅	5m 24s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_4	✅	5m 37s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_5	✅	5m 30s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_6	✅	5m 44s
tests/buffer/task_scheduler_test.py::TestTaskScheduler::test_task_scheduler_simple	✅	4m 40s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_0_file	✅	5m 7s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_1_sql	✅	41m 21s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_2_file	✅	41.6s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_3_sql	✅	42m 10s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_4_file	✅	41.6s
tests/buffer/task_storage_test.py::TaskStorageTest::test_read_task_5_sql	✅	50m 11s
tests/cli/launcher_test.py::TestLauncherMain::test_debug_mode	✅	12h 34m
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_command	✅	1h 40m
tests/cli/launcher_test.py::TestLauncherMain::test_main_run_in_dlc	✅	26m 23s
tests/cli/launcher_test.py::TestLauncherMain::test_main_studio_command	✅	15m 31s
tests/cli/launcher_test.py::TestLauncherMain::test_multi_stage_run	✅	4h 9m
tests/common/config_test.py::TestConfig::test_all_examples_are_valid	✅	8h 58m
tests/common/config_test.py::TestConfig::test_chat_template_path	✅	5m 1s
tests/common/config_test.py::TestConfig::test_config_flatten	✅	32.1s
tests/common/config_test.py::TestConfig::test_continue_from_checkpoint_is_valid	✅	6m 19s
tests/common/config_test.py::TestConfig::test_default_workflow	✅	4m 55s
tests/common/config_test.py::TestConfig::test_load_default_config	✅	1h 33m
tests/common/config_test.py::TestConfig::test_max_token_len_per_gpu_set_correctly	✅	4m 53s
tests/common/config_test.py::TestConfig::test_optimizer_config_propagation	✅	4m 59s
tests/common/config_test.py::TestConfig::test_update_config_from_ray_cluster	✅	10m 38s
tests/common/experience_test.py::TestEID::test_eid_properties	✅	495ms
tests/common/experience_test.py::TestExperience::test_action_mask_and_logprobs_type	✅	497ms
tests/common/experience_test.py::TestExperience::test_assertions	✅	380ms
tests/common/experience_test.py::TestExperience::test_dpo_experience	✅	408ms
tests/common/experience_test.py::TestExperience::test_gather	✅	801ms
tests/common/experience_test.py::TestExperience::test_gather_with_token_level_reward	✅	567ms
tests/common/experience_test.py::TestExperience::test_hf_datasets_conversion	✅	15.3s
tests/common/experience_test.py::TestExperience::test_multi_turn_experience	✅	372ms
tests/common/experience_test.py::TestExperience::test_serialize_deserialize	✅	1.0s
tests/common/experience_test.py::TestExperience::test_single_turn_experience	✅	363ms
tests/common/experience_test.py::TestExperience::test_to_dict	✅	335ms
tests/common/experience_test.py::TestExperienceConversion::test_batch_conversion	✅	687ms
tests/common/experience_test.py::TestExperienceConversion::test_dpo_experience_batch_conversion	✅	545ms
tests/common/experience_test.py::TestExperienceConversion::test_experience_model_experience_conversion	✅	883ms
tests/common/experience_test.py::TestExperienceConversion::test_gather_experiences_with_custom_fields	✅	509ms
tests/common/experience_test.py::TestExperienceConversion::test_multiturn_experience_batch_converstion	✅	598ms
tests/common/sudoku_test.py::test_9x9_generator_produces_valid_solution	✅	776ms
tests/common/sudoku_test.py::test_9x9_generator_creates_holes	✅	648ms
tests/common/sudoku_test.py::test_9x9_solution_is_fully_filled	✅	615ms
tests/common/sudoku_test.py::test_judge_allows_incomplete_board	✅	245ms
tests/common/sudoku_test.py::test_judge_detects_row_violation	✅	242ms
tests/common/sudoku_test.py::test_judge_detects_column_violation	✅	221ms
tests/common/sudoku_test.py::test_judge_detects_block_violation	✅	221ms
tests/common/sudoku_test.py::test_4x4_generator_produces_valid_solution	✅	271ms
tests/common/sudoku_test.py::test_4x4_solution_is_fully_filled	✅	248ms
tests/common/sudoku_test.py::test_4x4_judge_detects_row_violation	✅	256ms
tests/common/sudoku_test.py::test_4x4_judge_detects_block_violation	✅	226ms
tests/common/vllm_test.py::ModelWrapperTest_0::test_generate	✅	15h 21m
tests/common/vllm_test.py::ModelWrapperTest_1::test_generate	✅	11h 6m
tests/common/vllm_test.py::ModelWrapperTest_2::test_generate	✅	10h 42m
tests/common/vllm_test.py::TestModelLen_0::test_model_len	✅	8h 44m
tests/common/vllm_test.py::TestModelLen_1::test_model_len	✅	7h 11m
tests/common/vllm_test.py::TestModelLen_2::test_model_len	✅	7h 41m
tests/common/vllm_test.py::TestModelLenWithoutPromptTruncation::test_model_len	✅	7h 30m
tests/common/vllm_test.py::TestMessageProcess::test_no_prompt_truncation	✅	7h 33m
tests/common/vllm_test.py::TestMessageProcess::test_truncation_status	✅	7h 23m
tests/common/vllm_test.py::TestAPIServer::test_api	✅	7h 58m
tests/common/vllm_test.py::TestLogprobs::test_logprobs_api	✅	7h 26m
tests/common/vllm_test.py::TestAsyncAPIServer::test_api_async	✅	8h 6m
tests/common/vllm_test.py::TestTinkerAsyncAPIServer::test_api_async	⏭️	843ms
tests/common/vllm_test.py::TestTokenizer::test_action_mask	✅	4m 47s
tests/common/vllm_test.py::TestTokenizer::test_action_mask_with_tools	✅	4m 29s
tests/common/vllm_test.py::TestAPIServerToolCall_0_deepseek_r1::test_api_tool_calls	✅	8h 57m
tests/common/vllm_test.py::TestAPIServerToolCall_1::test_api_tool_calls	✅	8h 49m
tests/common/vllm_test.py::TestSuperLongGeneration::test_generate	✅	34h 36m
tests/common/vllm_test.py::TestTinkerAPI::test_tinker_api	✅	11h 25m
tests/explorer/explorer_test.py::TestExplorerCountdownEval::test_explorer	✅	29h
tests/explorer/explorer_test.py::TestExplorerEvalDetailedStats::test_explorer	✅	19h 50m
tests/explorer/explorer_test.py::TestExplorerGSM8KRULERNoEval::test_explorer	✅	15h 3m
tests/explorer/explorer_test.py::TestExplorerGSM8k::test_explorer	✅	49h 59m
tests/explorer/explorer_test.py::ServeTest::test_serve	✅	15h 36m
tests/explorer/proxy_test.py::RecorderTest::test_recorder	✅	1m 27s
tests/explorer/scheduler_test.py::SchedulerTest::test_async_workflow	✅	1h 18m
tests/explorer/scheduler_test.py::SchedulerTest::test_concurrent_operations	✅	1h 20m
tests/explorer/scheduler_test.py::SchedulerTest::test_dynamic_timeout	✅	3h 34m
tests/explorer/scheduler_test.py::SchedulerTest::test_get_results	✅	5h 33m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_0	✅	1h 22m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_non_repeatable_workflow_1	✅	1h 15m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_0	✅	1h 16m
tests/explorer/scheduler_test.py::SchedulerTest::test_metric_calculation_with_repeatable_workflow_1	✅	1h 14m
tests/explorer/scheduler_test.py::SchedulerTest::test_multi_step_execution	✅	1h 26m
tests/explorer/scheduler_test.py::SchedulerTest::test_non_repeatable_workflow	✅	1h 17m
tests/explorer/scheduler_test.py::SchedulerTest::test_over_rollout_min_wait	✅	2h 22m
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_all_methods	✅	4h
tests/explorer/scheduler_test.py::SchedulerTest::test_scheduler_restart_after_stop	✅	2h 19m
tests/explorer/scheduler_test.py::SchedulerTest::test_split_tasks	✅	2h 12m
tests/explorer/scheduler_test.py::SchedulerTest::test_stepwise_experience_eid	✅	6h 57m
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all	✅	2h 10m
tests/explorer/scheduler_test.py::SchedulerTest::test_wait_all_timeout_with_multi_batch	✅	3h 45m
tests/explorer/scheduler_test.py::TestRunnerStateCollection::test_runner_state_collection	✅	2h 55m
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_0	✅	1.4s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_reward_propagation_workflow_1	✅	10m 1s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_0	✅	889ms
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_step_wise_reward_workflow_1	✅	16m 43s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_raise_error	✅	1.3s
tests/explorer/step_wise_workflow_test.py::WorkflowTest::test_workflows_stop_at_max_env_steps	✅	16m 43s
tests/explorer/workflow_test.py::WorkflowTest::test_gsm8k_workflow	✅	11.2s
tests/explorer/workflow_test.py::WorkflowTest::test_math_boxed_workflow	✅	16.7s
tests/explorer/workflow_test.py::WorkflowTest::test_math_complex_workflow	✅	2m 7s
tests/explorer/workflow_test.py::WorkflowTest::test_math_eval_workflow	✅	3.7s
tests/explorer/workflow_test.py::WorkflowTest::test_math_fraction_workflow	✅	11.4s
tests/explorer/workflow_test.py::WorkflowTest::test_math_workflow	✅	7.5s
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_0	✅	760ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_repeatable_1	✅	1m 40s
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_0	✅	741ms
tests/explorer/workflow_test.py::WorkflowTest::test_workflow_resettable_1	✅	3m 21s
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_0::test_multi_turn_workflow	✅	6h 21m
tests/explorer/workflow_test.py::MultiTurnWorkflowTest_1::test_multi_turn_workflow	✅	6h 31m
tests/explorer/workflow_test.py::TestWorkflowStateRecording::test_workflow_state_recording	✅	1h 6m
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v0	✅	12m 22s
tests/explorer/workflow_test.py::TestAgentScopeWorkflowAdapter::test_adapter_v1	✅	14.5s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner	✅	2m 20s
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_runner_get_state	✅	2h 14m
tests/explorer/workflow_test.py::TestWorkflowRunner::test_workflow_with_openai	✅	7h 32m
tests/explorer/workflow_test.py::TestConcurrentWorkflowRunner::test_concurrent_workflow_runner	✅	10h 43m
tests/manager/synchronizer_test.py::TestSynchronizerExit_0::test_synchronizer	✅	44h 22m
tests/manager/synchronizer_test.py::TestSynchronizerExit_1::test_synchronizer	✅	46h 13m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_0::test_synchronizer	✅	35h 51m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_1::test_synchronizer	✅	29h 27m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_2::test_synchronizer	✅	34h 1m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_3::test_synchronizer	✅	43h 42m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_4::test_synchronizer	✅	39h 59m
tests/manager/synchronizer_test.py::TestStateDictBasedSynchronizer_5::test_synchronizer	✅	43h 47m
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_0::test_synchronizer	✅	19h 8m
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_1::test_synchronizer	✅	18h
tests/manager/synchronizer_test.py::TestNCCLBasedSynchronizer_2::test_synchronizer	✅	17h 49m
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_equivalent	✅	2.3s
tests/utils/eval_utils_test.py::TestComputeScore::test_both_boxed_and_not_equivalent	✅	1.1s
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_ground_truth	✅	1.7s
tests/utils/eval_utils_test.py::TestComputeScore::test_empty_solution_string	✅	290ms
tests/utils/eval_utils_test.py::TestComputeScore::test_multiple_boxed_answers_in_solution	✅	1.7s
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_equivalent	✅	1.0s
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_boxed_truth_raw_and_not_equivalent	✅	1.1s
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_not_boxed	✅	283ms
tests/utils/eval_utils_test.py::TestComputeScore::test_solution_raw_and_ground_truth_boxed_equivalent	✅	1.0s
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_extract_answer	✅	3.8s
tests/utils/eval_utils_test.py::TestMathEvalUtils::test_verify_math_answer	✅	1m 6s
tests/utils/eval_utils_test.py::TestEvalUtils::test_is_equiv	✅	5.1s
tests/utils/log_test.py::LogTest::test_actor_log	✅	35m 45s
tests/utils/log_test.py::LogTest::test_group_by_node	✅	35m 13s
tests/utils/log_test.py::LogTest::test_no_actor_log	✅	14m 19s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_0__workspace_tests_utils_plugins	✅	5m 18s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_local_1_tests_utils_plugins	✅	5m 3s
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_0__workspace_tests_utils_plugins	✅	2h 28m
tests/utils/plugin_test.py::TestPluginLoader::test_load_plugins_remote_1_tests_utils_plugins	✅	2h 28m
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_0__workspace_tests_utils_plugins	✅	1h 23m
tests/utils/plugin_test.py::TestPluginLoader::test_passing_custom_class_1_tests_utils_plugins	✅	1h 23m
tests/utils/registry_test.py::TestRegistryWithRay::test_dynamic_import	✅	40m 30s
tests/utils/registry_test.py::TestRegistry::test_algorithm_registry_mapping	✅	10.5s
tests/utils/registry_test.py::TestRegistry::test_buffer_module_registry_mapping	✅	5.8s
tests/utils/registry_test.py::TestRegistry::test_common_module_registry_mapping	✅	1m
tests/utils/registry_test.py::TestRegistry::test_register_module	✅	561ms
tests/utils/registry_test.py::TestRegistry::test_utils_module_registry_mapping	✅	681ms
tests/utils/swanlab_test.py::TestSwanlabMonitor::test_swanlab_monitor_smoke	⏭️	444ms

pan-x-c added 5 commits February 9, 2026 12:17

update logger

8992e99

fix pre-commit

ca71b59

replace argparse with typer

8471f42

speedup import

892941d

update typer version

2ed1d5b

gemini-code-assist bot reviewed Feb 9, 2026

pan-x-c added 2 commits February 9, 2026 17:22

update swanlab

d554d34

log task info in workflow runner

e77f968

fix testcase duration

9d86531

chenyushuo approved these changes Feb 9, 2026

pan-x-c merged commit a4612cc into agentscope-ai:main Feb 9, 2026
1 check passed

pan-x-c merged 8 commits into agentscope-ai:main from pan-x-c:feature/refactor_launcher
