Unable to use reasoning models with tool calls using LitellmModel #678
Comments
I have the same issue using the OpenAI client for JavaScript with Anthropic Sonnet 4.
In Bedrock, when using models like Anthropic's with reasoning, you get a thinking text block back from the LLM, such as the one sketched below. Then, prior to making the tool call, you add the thinking blocks back as part of the assistant message. For this modality to work with OpenAI, we need the thinking blocks to be passed through.
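A minimal sketch of that round trip, using Anthropic-style content blocks; the tool_use id, signature value, and tool result below are placeholders, not taken from the original report:

```python
# Assistant turn returned by the model: a thinking block followed by a tool_use block.
assistant_turn = {
    "role": "assistant",
    "content": [
        {
            "type": "thinking",
            "thinking": "The user wants the weather, so I should call get_weather...",
            "signature": "<opaque-signature-from-the-provider>",
        },
        {
            "type": "tool_use",
            "id": "toolu_123",
            "name": "get_weather",
            "input": {"city": "Tokyo"},
        },
    ],
}

# When submitting the tool result, the previous assistant turn has to be sent back
# with the thinking block still attached; dropping it is what triggers the 400.
followup_messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"},
    assistant_turn,
    {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": "toolu_123",
                "content": "The weather in Tokyo is sunny.",
            },
        ],
    },
]
```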
Any updates on this? I still get this error.
Please read this first
Describe the question
When the agent attempts to submit a tool call result via the LitellmModel abstraction, the request fails with a 400 Bad Request if reasoning is enabled. I've tested this in particular on Anthropic Claude 3.7 with reasoning effort set to high.
Debug information
v0.0.14
Repro steps
Here is a modified version of the litellm_provider.py example. The only change I've made, aside from hardcoding the model, is to pass a reasoning effort. This value is correctly sent to the underlying LiteLLM call and enables reasoning on Claude; however, after Claude decides to use the get_weather tool, the tool submission message chain loses all information related to the thinking blocks, which causes the Anthropic API to return a 400.
Example:
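The code sample itself did not survive here; the following is a minimal sketch of what such a modification can look like, assuming the SDK's ModelSettings and the OpenAI Reasoning type for passing the effort, with a placeholder model string:

```python
import asyncio

from agents import Agent, ModelSettings, Runner, function_tool, set_tracing_disabled
from agents.extensions.models.litellm_model import LitellmModel
from openai.types.shared import Reasoning

set_tracing_disabled(True)


@function_tool
def get_weather(city: str) -> str:
    print(f"[debug] getting weather for {city}")
    return f"The weather in {city} is sunny."


async def main() -> None:
    agent = Agent(
        name="Assistant",
        instructions="You are a helpful assistant.",
        # Hardcoded model string is illustrative; any LiteLLM-routed Anthropic model applies.
        model=LitellmModel(model="anthropic/claude-3-7-sonnet-20250219"),
        model_settings=ModelSettings(reasoning=Reasoning(effort="high")),
        tools=[get_weather],
    )
    result = await Runner.run(agent, "What's the weather in Tokyo?")
    print(result.final_output)


if __name__ == "__main__":
    asyncio.run(main())
```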
Expected behavior
I would expect the agent to return the weather in Tokyo, without failure.
When using LiteLLM directly, I'm able to accomplish tool calling with a reasoning model (a sketch of that flow is included below).
This issue unfortunately makes it impossible to properly use non-OpenAI reasoning models with agents.
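For reference, a minimal sketch of the direct LiteLLM flow that works, assuming LiteLLM's reasoning_effort parameter; the model string, tool schema, and tool output are illustrative:

```python
import litellm

MODEL = "anthropic/claude-3-7-sonnet-20250219"  # illustrative model string

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

# First turn: the model emits its thinking blocks and requests the tool.
first = litellm.completion(
    model=MODEL,
    messages=messages,
    tools=tools,
    reasoning_effort="high",
)
assistant_message = first.choices[0].message

# Appending the LiteLLM message object as-is keeps the provider-specific
# fields (thinking blocks) in the chain, which is why this flow succeeds.
messages.append(assistant_message)
tool_call = assistant_message.tool_calls[0]
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": "The weather in Tokyo is sunny.",
})

# Second turn: the tool result is accepted because the thinking blocks
# are still attached to the preceding assistant message.
second = litellm.completion(
    model=MODEL,
    messages=messages,
    tools=tools,
    reasoning_effort="high",
)
print(second.choices[0].message.content)
```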
Cause
I believe the cause of this error is that the message conversion steps strip the model-provider-specific details, such as thinking blocks, from the message chain. These are maintained on the LiteLLM message type but are lost during the following conversion steps performed by LitellmModel:
input item -> chat completion -> LiteLLM
LiteLLM -> chat completion -> output item
I think that in order to properly support other model providers via LiteLLM, there needs to be a way to preserve model-specific message properties across these message models. A sketch of one possible approach follows.
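One possible shape of such a passthrough, sketched outside the SDK's actual converters: stash whatever provider-specific fields LiteLLM exposes on its message object (reasoning_content and thinking_blocks, as far as I can tell, for Anthropic reasoning) and re-attach them when the assistant message is rebuilt for the follow-up request. The helper names below are hypothetical, not part of the SDK:

```python
from types import SimpleNamespace
from typing import Any

# Provider-specific fields that the litellm -> chat completion -> output item
# conversion currently drops.
PASSTHROUGH_FIELDS = ("reasoning_content", "thinking_blocks")


def extract_provider_data(litellm_message: Any) -> dict[str, Any]:
    """Collect provider-specific fields from a LiteLLM message (hypothetical helper)."""
    return {
        field: getattr(litellm_message, field)
        for field in PASSTHROUGH_FIELDS
        if getattr(litellm_message, field, None) is not None
    }


def reattach_provider_data(
    assistant_message: dict[str, Any], provider_data: dict[str, Any]
) -> dict[str, Any]:
    """Merge the stashed fields back into the assistant message for the next request."""
    return {**assistant_message, **provider_data}


# Stand-in for the message LiteLLM returns on the first turn.
first_turn = SimpleNamespace(
    role="assistant",
    content=None,
    reasoning_content="The user wants the weather, so call get_weather.",
    thinking_blocks=[{"type": "thinking", "thinking": "...", "signature": "..."}],
)

stashed = extract_provider_data(first_turn)
# The rebuilt assistant message that accompanies the tool result keeps the
# thinking blocks, so the provider accepts the follow-up instead of returning 400.
rebuilt = reattach_provider_data({"role": "assistant", "content": None}, stashed)
print(rebuilt)
```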