Tool calling with LiteLLM and thinking models fail #765
Comments
Can you please provide a full working script? Happy to take a look!

@rm-openai see below
- Introduced a new configuration file for permissions in `.claude/settings.local.json`.
- Enhanced `LitellmModel` to properly handle assistant messages with tool calls when reasoning is enabled, addressing issue openai#765.
- Added a new comprehensive test suite for LiteLLM thinking models to ensure the fix works across various models and scenarios.
- Tests include reproducing the original error, verifying successful tool calls, and checking the fix's applicability to different thinking models.
Enhances the fix for issue openai#765 to work universally with all LiteLLM thinking models that support function calling. Verified working:
- Anthropic Claude Sonnet 4 (partial fix: progress from "found text" to "found tool_use")
- OpenAI o4-mini (complete success: full tool calling with reasoning)

The fix now automatically applies when `ModelSettings(reasoning=...)` is used with any LiteLLM model, making it future-proof for new thinking models that support both reasoning and function calling.
… maintainability
- Cleaned up whitespace and formatting in the test suite for LiteLLM thinking models.
- Ensured consistent use of commas and spacing in function calls and assertions.
- Verified that the fix for issue openai#765 applies universally across all supported models.
- Enhanced documentation within the tests to clarify the purpose and expected outcomes.
Root Cause Analysis and Current Status

TL;DR: Thinking with tool calling for Anthropic is broken in LiteLLM. I've investigated this issue thoroughly and determined the root cause is in LiteLLM, not the openai-agents-python SDK.

What's Actually Happening
Current Workarounds

Until LiteLLM fixes this upstream:
Related Issues
Why No Fix in This SDK

I initially created a PR with a workaround, but decided against it because:
Thanks! Another workaround could be to use Anthropic through the OpenAI Responses API compatibility, no? Haven't tried it, but it should work.
@knowsuchagency The bug you've linked is in the conversion between the Responses API and the Completions API on the LiteLLM side. Although that is a legitimate issue, it isn't the reason this LiteLLM issue exists. The LiteLLM abstraction in this SDK uses only the `acompletion` API from LiteLLM. I've tested LiteLLM directly with Claude thinking models, and the `acompletion` API preserves thinking blocks.

The problem is that this SDK tries to convert completions to the Responses API format, and in doing so it drops the specific properties on the LiteLLM models which hold the thinking block details. See the issue I filed about this: #678

I think the issue you pointed out on the LiteLLM side may help if the OpenAI SDK switches to using the Responses API on LiteLLM, but currently it's using `acompletion` directly. Ultimately, I think the Responses API types need some additional flexibility to be able to preserve non-OpenAI-specific model provider details. OpenAI's Responses API has something like a reasoning summary, but it doesn't expose the full reasoning blocks via the API, hence the Responses API doesn't really account for them properly. I believe newer Claude models are also moving towards reasoning summaries, so maybe some sort of consolidation could happen with the types.
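To illustrate the claim above, here is a minimal sketch (not from the thread) of calling LiteLLM's `acompletion` directly and inspecting where the provider-specific reasoning survives on the message. The model id and the `reasoning_content` / `thinking_blocks` field names are assumptions based on LiteLLM's Anthropic reasoning support; verify them against your LiteLLM version.

```python
import asyncio

import litellm


async def main() -> None:
    # Assumed model id and `thinking` passthrough per LiteLLM's Anthropic docs.
    resp = await litellm.acompletion(
        model="anthropic/claude-sonnet-4-20250514",
        messages=[{"role": "user", "content": "What's 2+2? Think it through first."}],
        thinking={"type": "enabled", "budget_tokens": 1024},
    )
    msg = resp.choices[0].message
    # LiteLLM keeps the provider-specific reasoning on the message object itself;
    # the SDK's completions -> Responses conversion is where these get dropped.
    print(getattr(msg, "reasoning_content", None))
    print(getattr(msg, "thinking_blocks", None))


asyncio.run(main())
```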
While I agree with this somewhat, the counter argument is that if the SDK claims to support third-party providers, and LiteLLM supports enabling thinking + tools with `acompletion` for most model providers, then this SDK should at minimum support such a common scenario. You can't claim to support third-party providers but then not work with reasoning + tool calls.
Yup, tried it via the OpenAI Responses API; it works well.
Do you mind giving a full example, with tool calls? From the looks of your snippet, you're using the OpenAI Completions API compatibility provided by Anthropic rather than anything to do with the Responses API.
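For readers following along, here is a minimal sketch of the approach the snippet appears to describe: pointing the SDK's Chat Completions model at Anthropic's OpenAI-compatible endpoint instead of going through LiteLLM. The base URL and model id are assumptions taken from Anthropic's published compatibility layer, not from the thread.

```python
import asyncio
import os

from openai import AsyncOpenAI

from agents import Agent, OpenAIChatCompletionsModel, Runner, function_tool


@function_tool
def get_weather(city: str) -> str:
    """Return a canned weather report for the given city."""
    return f"The weather in {city} is sunny."


# Assumed OpenAI-compatibility endpoint; verify against Anthropic's docs.
client = AsyncOpenAI(
    base_url="https://api.anthropic.com/v1/",
    api_key=os.environ["ANTHROPIC_API_KEY"],
)

agent = Agent(
    name="Assistant",
    model=OpenAIChatCompletionsModel(
        model="claude-sonnet-4-20250514",  # assumed model id
        openai_client=client,
    ),
    tools=[get_weather],
)


async def main() -> None:
    result = await Runner.run(agent, "What's the weather in Tokyo?")
    print(result.final_output)


asyncio.run(main())
```

Note this sidesteps the LiteLLM conversion path entirely, which is why it avoids the dropped-thinking-block problem discussed above.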
Describe the bug
When running the Agents SDK with tool calling and a thinking model through LiteLLM (e.g. Sonnet 4), I get this error:
Debug information
Repro steps
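The reporter's original script is not preserved above; a minimal sketch matching the reported setup (a LiteLLM thinking model, a function tool, and reasoning enabled) might look like the following. The model id and reasoning settings are assumptions, not the reporter's exact code.

```python
import asyncio

from openai.types.shared import Reasoning

from agents import Agent, ModelSettings, Runner, function_tool
from agents.extensions.models.litellm_model import LitellmModel


@function_tool
def get_weather(city: str) -> str:
    """Return a canned weather report for the given city."""
    return f"The weather in {city} is sunny."


agent = Agent(
    name="Assistant",
    model=LitellmModel(model="anthropic/claude-sonnet-4-20250514"),  # assumed id
    model_settings=ModelSettings(reasoning=Reasoning(effort="high")),
    tools=[get_weather],
)


async def main() -> None:
    # With thinking enabled, the follow-up request after the tool call is
    # where the reported failure occurs.
    result = await Runner.run(agent, "What's the weather in Tokyo?")
    print(result.final_output)


asyncio.run(main())
```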
Expected behavior
Everything works :)