
Use OpenAI's Structured Outputs feature to prevent validation errors #514

Open — wants to merge 8 commits into base: main

Conversation

renkehohl (Author) commented:

Used the result_tools parameter to generate a json_schema which is passed to the OpenAI client as response_format. This guides the model to format its responses according to the given schema, preventing validation errors.
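A minimal sketch of the mapping the PR describes: turning a result tool's JSON schema into OpenAI's `response_format` payload. The helper name and exact shape are illustrative (the PR's actual `_map_response_format` may differ), but the payload structure follows OpenAI's documented `json_schema` response format.

```python
from typing import Any


def map_response_format(name: str, json_schema: dict[str, Any]) -> dict[str, Any]:
    """Build an OpenAI `response_format` payload from a result tool's JSON schema.

    Illustrative sketch of the PR's `_map_response_format` idea; not the
    library's actual API.
    """
    return {
        'type': 'json_schema',
        'json_schema': {
            'name': name,
            'schema': json_schema,
            'strict': True,  # ask OpenAI to constrain generation to the schema
        },
    }


# Example: a schema a result tool might expose
schema = {
    'type': 'object',
    'properties': {'city': {'type': 'string'}, 'population': {'type': 'integer'}},
    'required': ['city', 'population'],
    'additionalProperties': False,
}
response_format = map_response_format('final_result', schema)
```

Passing this dict as `response_format=` to `client.chat.completions.create` constrains the model's generated message content to the schema.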

samuelcolvin (Member) left a comment:

interesting, are there any docs on why this might work better?

    openai_messages = list(chain(*(self._map_message(m) for m in messages)))

    model_settings = model_settings or {}

-   return await self.client.chat.completions.create(
+   response = await self.client.chat.completions.create(
samuelcolvin (Member):

you don't need this change.

renkehohl (Author):

Yes, we really don't need that.

-   else:
-       tool_choice = 'auto'
+   tool_choice = 'auto' if self.tools else None
samuelcolvin (Member):

We have to respect allow_text_result, I think right now you're not.

renkehohl (Author):

Yes, my changes ignore the allow_text_result parameter. The reason is that if we pass a result_type to the agent, text results are automatically excluded: the LLM's output generation is constrained to the given schema. In the case of PydanticAI's Agents, the final response is then wrapped in a ToolCallPart. In other words, the information provided by allow_text_result is implicitly given by the length of the result_tools parameter. Please tell me if I missed anything.

allow_text_result,
tools,
)
response_format = self._map_response_format(result_tools[0]) if result_tools else None
dmontagu (Contributor) commented on Dec 23, 2024:

If there are multiple result tools (which I believe is the case when the result type is a union), we would definitely need to ensure that all the tool calls are present in the final response_format
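One way to address this concern (a sketch, not the PR's code): merge every result tool's schema into a single schema via `anyOf`, so a response matching any union member validates against the combined `response_format`. The helper name is invented for illustration.

```python
from typing import Any


def merge_result_schemas(tool_schemas: list[dict[str, Any]]) -> dict[str, Any]:
    """Combine multiple result-tool schemas into one JSON schema via `anyOf`,
    so a union result type can still be expressed as a single response_format.
    Illustrative sketch only.
    """
    if len(tool_schemas) == 1:
        return tool_schemas[0]
    return {'anyOf': tool_schemas}


# Two schemas, as might arise from a union result type
schemas = [
    {'type': 'object', 'properties': {'a': {'type': 'string'}}},
    {'type': 'object', 'properties': {'b': {'type': 'integer'}}},
]
merged = merge_result_schemas(schemas)
```

A single schema passes through untouched; multiple schemas become one `anyOf` schema that covers every branch of the union.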

dmontagu (Contributor):

Also

Suggested change:
-   response_format = self._map_response_format(result_tools[0]) if result_tools else None
+   response_format = self._map_response_format(result_tools[0]) if result_tools and not allow_text_result else None

(Or, find a way for the response format to allow raw strings)

renkehohl (Author):

Regarding multiple result tools: I couldn't find a case where the number of result tools is more than one, even in the case of a union. If you have any examples where there is more than one result tool, please tell me.

Regarding the allow_text_result parameter: please check out @samuelcolvin's comment. The response format should allow raw strings if we set result_type=str.

    items.append(TextPart(choice.message.content))
    if self.response_format:
        name = self.response_format['json_schema']['name']
        items.append(ToolCallPart.from_raw_args(name, choice.message.content))


I tried this locally: OpenAI requires tool_call_id to be set, which needs to be passed through this method, but choice doesn't really come with an id. Either we generate an id, or perhaps use the choice index?

renkehohl (Author):

I ran into the same case recently. Using the response_format parameter when calling OpenAI constrains the output generation to follow a given schema; the final response is not intended to be a tool call. As you can see, we use choice.message.content as arguments, which doesn't provide a tool call id.

At the moment, PydanticAI wraps structured responses in ToolCallParts, which needs to be changed in my opinion, to better differentiate between actual tool calls and structured responses.

To solve this issue, we need to find a way for the agent to return the response instead of making a tool call from it.
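As an interim workaround for the missing id, a synthetic tool_call_id could be generated when wrapping structured output in a ToolCallPart. The id format below is invented for illustration; it combines the reviewer's choice-index suggestion with a random suffix so ids stay unique across retries.

```python
import uuid


def synthetic_tool_call_id(choice_index: int) -> str:
    """Generate a tool_call_id for a structured response that OpenAI returned
    as plain message content (where no real tool call id exists).

    The `structured-<index>-<suffix>` format is an illustrative convention,
    not anything PydanticAI or OpenAI defines.
    """
    return f'structured-{choice_index}-{uuid.uuid4().hex[:8]}'


tool_call_id = synthetic_tool_call_id(0)
```

The generated id could then be passed alongside the tool name and raw content wherever a ToolCallPart is constructed.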

renkehohl (Author):

> interesting, are there any docs on why this might work better?

OpenAI provides the response_format parameter to support structured outputs. OpenAI's documentation is a bit sparse, but here are some resources:

Using the response_format parameter to generate structured responses is a cleaner approach than wrapping the result_type into a tool and relying on the model to select the correct tool for its final response.
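For comparison, the two call shapes look roughly like this. These are request payloads only (no network call is made); the model name, tool name, and schema are illustrative.

```python
from typing import Any

schema: dict[str, Any] = {
    'type': 'object',
    'properties': {'answer': {'type': 'string'}},
    'required': ['answer'],
    'additionalProperties': False,
}

# Tool-wrapping approach: the result type becomes a tool, and we rely on
# the model choosing it for its final response.
tool_call_kwargs: dict[str, Any] = {
    'model': 'gpt-4o',
    'tools': [
        {'type': 'function', 'function': {'name': 'final_result', 'parameters': schema}}
    ],
    'tool_choice': 'auto',
}

# response_format approach: generation itself is constrained to the schema,
# so there is no tool-selection step that can go wrong.
structured_kwargs: dict[str, Any] = {
    'model': 'gpt-4o',
    'response_format': {
        'type': 'json_schema',
        'json_schema': {'name': 'final_result', 'schema': schema, 'strict': True},
    },
}
```

Either dict would be unpacked into `client.chat.completions.create(**kwargs)` along with the messages.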

Here are some further resources on structured outputs:

samuelcolvin (Member):

I've asked OpenAI what they think, we'll see what they say...

samuelcolvin (Member):

Would strict=True on tool calls solve your issues? #81 (comment)
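For reference, `strict=True` on a tool definition looks like this in a Chat Completions payload. OpenAI's strict mode additionally requires `additionalProperties: false` and every property listed in `required`; the tool name and schema below are illustrative.

```python
from typing import Any

parameters: dict[str, Any] = {
    'type': 'object',
    'properties': {'city': {'type': 'string'}},
    'required': ['city'],            # strict mode: every property must be required
    'additionalProperties': False,   # strict mode: no extra keys allowed
}

strict_tool: dict[str, Any] = {
    'type': 'function',
    'function': {
        'name': 'final_result',
        'description': 'Return the structured final result.',
        'parameters': parameters,
        'strict': True,  # enables structured-outputs guarantees for this tool's arguments
    },
}
```

With this, the model's arguments for `final_result` are guaranteed to match the schema, which would address the validation errors without changing how the final response is delivered.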
