Conversation

@matheper (Collaborator)

This pull request refactors how runtime generation parameters (such as temperature, max_tokens, and tool_choice) are passed to and handled by LLM (Large Language Model) instances. The main improvement is a more flexible LLM instantiation that supports arbitrary generation arguments through a unified mechanism, along with clarified default tool usage behavior in the HuggingFace and OpenAI LLMs. Additional tests ensure the new instantiation logic works as expected.

LLM Instantiation and Generation Argument Handling:

  • Refactored LLM.instantiate in debug_gym/llms/base.py to accept arbitrary runtime generation keyword arguments via **runtime_generate_kwargs, removing the explicit temperature and max_tokens parameters and allowing more flexible configuration. [1] [2]
  • Updated scripts/replay.py and scripts/run.py to pass generate_kwargs from the config into LLM.instantiate using the new runtime_generate_kwargs mechanism. [1] [2]
  • Added/updated tests in tests/llms/test_base.py to cover the new instantiation logic, including passing both config and runtime generation kwargs.
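The mechanism described above can be sketched as follows. This is a minimal illustration, not the actual debug_gym implementation: the constructor, attribute names, and config keys below are assumptions made for the example.

```python
# Minimal sketch of an instantiate classmethod that merges config-level
# generation defaults with arbitrary runtime overrides. All names here
# (generate_kwargs, llm_config keys) are illustrative assumptions.
class LLM:
    def __init__(self, model_name, generate_kwargs=None):
        self.model_name = model_name
        # config-level defaults, already merged with runtime overrides
        self.generate_kwargs = dict(generate_kwargs or {})

    @classmethod
    def instantiate(cls, llm_config, **runtime_generate_kwargs):
        # start from the config's generation defaults ...
        merged = dict(llm_config.get("generate_kwargs", {}))
        # ... then let runtime arguments (temperature, max_tokens,
        # tool_choice, or anything else) take precedence
        merged.update(runtime_generate_kwargs)
        return cls(llm_config["model"], generate_kwargs=merged)


# usage mirroring what the scripts are described to do: runtime kwargs
# from the config are forwarded as keyword arguments
llm = LLM.instantiate(
    {"model": "gpt-4o", "generate_kwargs": {"temperature": 0.0}},
    temperature=0.7,
    max_tokens=512,
)
```

Because the overrides arrive as `**kwargs`, new generation parameters can be supported without touching the `instantiate` signature again.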

LLM Generation Behavior:

  • Changed the default tool_choice parameter in OpenAILLM.generate to "auto" and ensured it is passed through to the underlying API call, making tool usage more configurable. [1] [2]
  • Overrode the generate method in HuggingFaceLLM to default tool_choice to "required" for tool usage, clarifying its intended behavior.
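The two defaults above can be illustrated with a small sketch. The real classes wrap provider API calls, which are stubbed out here, and the exact signatures are assumptions based on the PR description.

```python
# Illustrative sketch of the tool_choice defaults described above.
class OpenAILLM:
    def generate(self, messages, tools, tool_choice="auto", **kwargs):
        # "auto": the model decides whether to call one of the tools;
        # the value is passed through to the underlying API call
        return {"tool_choice": tool_choice, **kwargs}


class HuggingFaceLLM(OpenAILLM):
    def generate(self, messages, tools, **kwargs):
        # default to forcing tool usage, while still letting callers
        # override with an explicit tool_choice keyword
        kwargs.setdefault("tool_choice", "required")
        return super().generate(messages, tools, **kwargs)
```

With this arrangement, `HuggingFaceLLM().generate(msgs, tools)` sends `tool_choice="required"` while `OpenAILLM` keeps the provider's permissive `"auto"` default, and either can be overridden per call.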

Minor Cleanups:

  • Removed an unused import from debug_gym/agents/utils.py and added a missing import in debug_gym/llms/huggingface.py. [1] [2]

Commits:

  • …ation and generation with runtime parameters.
    - Refactor the LLM class to accept additional runtime generation kwargs.
    - Update the tool_choice parameter for HuggingFaceLLM ("required") and OpenAILLM ("auto").
    - Modify the replay and run scripts to pass runtime_generate_kwargs from config.
    - Add tests for LLM instantiation with the new runtime kwargs.
  • …s and update tests to reflect changes in llm_config
Copilot AI (Contributor) left a comment

Pull request overview

This PR refactors the LLM instantiation and generation parameter handling to provide more flexibility in configuring runtime generation parameters (such as temperature, max_tokens, and tool_choice). The changes make it easier to pass arbitrary generation kwargs through the system while clarifying the default tool usage behavior for different LLM implementations.

Changes:

  • Refactored LLM.instantiate to accept arbitrary runtime generation parameters via **runtime_generate_kwargs instead of explicit parameters
  • Made tool_choice configurable in OpenAI and HuggingFace LLM implementations with appropriate defaults
  • Removed unused import and added missing import for better code hygiene

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:

  • debug_gym/llms/base.py: Refactored the instantiate method to accept **runtime_generate_kwargs instead of explicit temperature and max_tokens parameters
  • debug_gym/llms/openai.py: Added a tool_choice parameter to the generate method with default value "auto"
  • debug_gym/llms/huggingface.py: Added an import for LLMResponse and overrode the generate method to default tool_choice to "required"
  • scripts/run.py: Updated to pass the LLM config using the new kwargs-based approach
  • scripts/replay.py: Updated to pass the LLM config using the new kwargs-based approach
  • debug_gym/agents/utils.py: Removed the unused AGENT_REGISTRY import
  • tests/llms/test_base.py: Updated tests to verify the new instantiation behavior with the tool_choice parameter
Comments suppressed due to low confidence (1)

debug_gym/llms/base.py:293

  • Overridden method signature does not match the call, which passes too many arguments; the overriding method OpenAILLM.generate matches the call.
    Overridden method signature does not match the call, which passes an argument named 'tool_choice'; the overriding method OpenAILLM.generate matches the call.
    def generate(self, messages, tools, **kwargs) -> LLMResponse:
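This suppressed warning likely flags the base method's `**kwargs` signature against the override's explicit `tool_choice` parameter. In plain Python the two remain call-compatible, which the following sketch (with illustrative names, not the actual debug_gym classes) demonstrates:

```python
# A base method that absorbs extra keywords via **kwargs is
# call-compatible with an override that names tool_choice explicitly:
# both calls below succeed without a TypeError.
class BaseLLM:
    def generate(self, messages, tools, **kwargs):
        return kwargs


class OpenAILLMSketch(BaseLLM):
    def generate(self, messages, tools, tool_choice="auto", **kwargs):
        return {"tool_choice": tool_choice, **kwargs}
```

So a call such as `generate(msgs, tools, tool_choice="auto")` is valid against either signature, whichever class the instance happens to be.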


@matheper merged commit f54c2d3 into main Jan 15, 2026
17 checks passed
@matheper deleted the tool-choice-required branch January 15, 2026 18:11