
Use OpenAI's Structured Outputs feature to prevent validation errors #514

Open — wants to merge 8 commits into base: main

Conversation

renkehohl (Author) commented:

Used the result_tools parameter to generate a json_schema which is passed to the OpenAI client as response_format. This guides the model to format its responses according to the given schema, preventing validation errors.
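A minimal sketch of the mapping the PR describes: turning a result tool's JSON schema into OpenAI's `response_format` payload. The helper name and exact shape are illustrative (the PR's actual `_map_response_format` may differ), but the payload structure follows OpenAI's documented `json_schema` response format.

```python
from typing import Any


def map_response_format(name: str, json_schema: dict[str, Any]) -> dict[str, Any]:
    """Build an OpenAI `response_format` payload from a result tool's JSON schema.

    Illustrative sketch of the PR's `_map_response_format` idea; not the
    library's actual API.
    """
    return {
        'type': 'json_schema',
        'json_schema': {
            'name': name,
            'schema': json_schema,
            'strict': True,  # ask OpenAI to constrain generation to the schema
        },
    }


# Example: a schema a result tool might expose
schema = {
    'type': 'object',
    'properties': {'city': {'type': 'string'}, 'population': {'type': 'integer'}},
    'required': ['city', 'population'],
    'additionalProperties': False,
}
response_format = map_response_format('final_result', schema)
```

Passing this dict as `response_format=` to `client.chat.completions.create` constrains the model's generated message content to the schema.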

samuelcolvin (Member) left a comment:

interesting, are there any docs on why this might work better?

    openai_messages = list(chain(*(self._map_message(m) for m in messages)))

    model_settings = model_settings or {}

-   return await self.client.chat.completions.create(
+   response = await self.client.chat.completions.create(
samuelcolvin (Member):

you don't need this change.

renkehohl (Author):

Yes, we really don't need that.

-   else:
-       tool_choice = 'auto'
+   tool_choice = 'auto' if self.tools else None
samuelcolvin (Member):

We have to respect allow_text_result, I think right now you're not.

renkehohl (Author):

Yes, my changes ignore the allow_text_result parameter. The reason is that if we pass a result_type to the agent, text results are automatically excluded: the LLM's output generation is constrained to the given schema. In the case of PydanticAI's Agents, the final response is then wrapped in a ToolCallPart. In other words, the information provided by allow_text_result is implicitly given by the length of the result_tools parameter. Please tell me if I missed anything.

allow_text_result,
tools,
)
response_format = self._map_response_format(result_tools[0]) if result_tools else None
dmontagu (Contributor) commented on Dec 23, 2024:

If there are multiple result tools (which I believe is the case when the result type is a union), we would definitely need to ensure that all the tool calls are present in the final response_format
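One way to address this concern (a sketch, not the PR's code): merge every result tool's schema into a single schema via `anyOf`, so a response matching any union member validates against the combined `response_format`. The helper name is invented for illustration.

```python
from typing import Any


def merge_result_schemas(tool_schemas: list[dict[str, Any]]) -> dict[str, Any]:
    """Combine multiple result-tool schemas into one JSON schema via `anyOf`,
    so a union result type can still be expressed as a single response_format.
    Illustrative sketch only.
    """
    if len(tool_schemas) == 1:
        return tool_schemas[0]
    return {'anyOf': tool_schemas}


# Two schemas, as might arise from a union result type
schemas = [
    {'type': 'object', 'properties': {'a': {'type': 'string'}}},
    {'type': 'object', 'properties': {'b': {'type': 'integer'}}},
]
merged = merge_result_schemas(schemas)
```

A single schema passes through untouched; multiple schemas become one `anyOf` schema that covers every branch of the union.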

dmontagu (Contributor):

Also

Suggested change:
-   response_format = self._map_response_format(result_tools[0]) if result_tools else None
+   response_format = self._map_response_format(result_tools[0]) if result_tools and not allow_text_result else None

(Or, find a way for the response format to allow raw strings)

renkehohl (Author):

Regarding multiple result tools: I couldn't find a case where the number of result tools is more than one, even in the case of a union. If you have any examples where there is more than one result tool, please tell me.

Regarding the allow_text_result parameter: please check out @samuelcolvin's comment. The response format should allow raw strings if we set result_type=str.

    items.append(TextPart(choice.message.content))
    if self.response_format:
        name = self.response_format['json_schema']['name']
        items.append(ToolCallPart.from_raw_args(name, choice.message.content))


I tried this locally: OpenAI requires tool_call_id to be set, which needs to be passed through this method, but choice doesn't really come with an id. Either we generate an id, or perhaps use the choice index?

renkehohl (Author):

I ran into the same case recently. Using the response_format parameter when calling OpenAI constrains the output generation to follow a given schema; the final response is not intended to be a tool call. As you can see, we use choice.message.content as arguments, which doesn't provide a tool call id.

At the moment, PydanticAI wraps structured responses in ToolCallParts, which needs to be changed in my opinion, to better differentiate between actual tool calls and structured responses.

To solve this issue, we need to find a way for the agent to return the response instead of making a tool call from it.
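As an interim workaround for the missing id, a synthetic tool_call_id could be generated when wrapping structured output in a ToolCallPart. The id format below is invented for illustration; it combines the reviewer's choice-index suggestion with a random suffix so ids stay unique across retries.

```python
import uuid


def synthetic_tool_call_id(choice_index: int) -> str:
    """Generate a tool_call_id for a structured response that OpenAI returned
    as plain message content (where no real tool call id exists).

    The `structured-<index>-<suffix>` format is an illustrative convention,
    not anything PydanticAI or OpenAI defines.
    """
    return f'structured-{choice_index}-{uuid.uuid4().hex[:8]}'


tool_call_id = synthetic_tool_call_id(0)
```

The generated id could then be passed alongside the tool name and raw content wherever a ToolCallPart is constructed.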

renkehohl (Author):

> interesting, are there any docs on why this might work better?

OpenAI provides the response_format parameter to support structured outputs. OpenAI's documentation is a bit sparse, but here are some resources:

Using the response_format parameter to generate structured responses is a cleaner approach than wrapping the result_type into a tool and relying on the model to select the correct tool for its final response.
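For comparison, the two call shapes look roughly like this. These are request payloads only (no network call is made); the model name, tool name, and schema are illustrative.

```python
from typing import Any

schema: dict[str, Any] = {
    'type': 'object',
    'properties': {'answer': {'type': 'string'}},
    'required': ['answer'],
    'additionalProperties': False,
}

# Tool-wrapping approach: the result type becomes a tool, and we rely on
# the model choosing it for its final response.
tool_call_kwargs: dict[str, Any] = {
    'model': 'gpt-4o',
    'tools': [
        {'type': 'function', 'function': {'name': 'final_result', 'parameters': schema}}
    ],
    'tool_choice': 'auto',
}

# response_format approach: generation itself is constrained to the schema,
# so there is no tool-selection step that can go wrong.
structured_kwargs: dict[str, Any] = {
    'model': 'gpt-4o',
    'response_format': {
        'type': 'json_schema',
        'json_schema': {'name': 'final_result', 'schema': schema, 'strict': True},
    },
}
```

Either dict would be unpacked into `client.chat.completions.create(**kwargs)` along with the messages.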

Here are some further resources on structured outputs:

samuelcolvin (Member):

I've asked OpenAI what they think, we'll see what they say...

samuelcolvin (Member):

Would strict=True on tool calls solve your issues? #81 (comment)
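For reference, `strict=True` on a tool definition looks like this in a Chat Completions payload. OpenAI's strict mode additionally requires `additionalProperties: false` and every property listed in `required`; the tool name and schema below are illustrative.

```python
from typing import Any

parameters: dict[str, Any] = {
    'type': 'object',
    'properties': {'city': {'type': 'string'}},
    'required': ['city'],            # strict mode: every property must be required
    'additionalProperties': False,   # strict mode: no extra keys allowed
}

strict_tool: dict[str, Any] = {
    'type': 'function',
    'function': {
        'name': 'final_result',
        'description': 'Return the structured final result.',
        'parameters': parameters,
        'strict': True,  # enables structured-outputs guarantees for this tool's arguments
    },
}
```

With this, the model's arguments for `final_result` are guaranteed to match the schema, which would address the validation errors without changing how the final response is delivered.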
