Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate tool results when using structured result #179

Merged
merged 4 commits into from
Dec 8, 2024

Conversation

jlowin
Copy link
Collaborator

@jlowin jlowin commented Dec 8, 2024

Closes #174

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM, some tests are failing, but you can probably fix most of them by running uv run pytest --fix.

LMK if you want me to finish this off.

# we have a final result, end the conversation
result_data = either.data
# Add all messages to the conversation
messages.extend(r for r in responses if isinstance(r, _messages.Message))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we extract the Messages from the list 3 times, maybe better to return tuple[_MarkFinalResult[ResultData] | None, list[_messages.Message]]?

Then we don't have to do any extraction aftwards.

@samuelcolvin
Copy link
Member

We should also test if the responses need to be adjacent in the messages, I didn't think they needed to be, but @sudoskys suggested they do. That can be a separate PR.

@jlowin
Copy link
Collaborator Author

jlowin commented Dec 8, 2024

Looking at those tests now and I like your suggestion about the tuple.

Responses do need to be adjacent for OpenAI, not sure if others have the same requirement (message expectations vary).

The OpenAI requirement is: a message with one or more tool calls must be followed immediately and only by messages containing responses to every tool call. For example if 2 parallel tool calls are made, the next two messages must be responses corresponding to those two tool call ids.

@jlowin jlowin marked this pull request as ready for review December 8, 2024 16:22
@jlowin
Copy link
Collaborator Author

jlowin commented Dec 8, 2024

I think this is GTG. In testing I identified another situation that can result in the same error message, in which the model requests a "regular" tool call and a "final" tool call in parallel in the same payload. Even though this PR adds the ToolResult for the "final" tool, the regular tool never gets a result and subsequent calls will fail. I will try to address in a separate PR.

@samuelcolvin
Copy link
Member

samuelcolvin commented Dec 8, 2024

In testing I identified another situation that can result in the same error message, in which the model requests a "regular" tool call and a "final" tool call in parallel in the same payload. Even though this PR adds the ToolResult for the "final" tool, the regular tool never gets a result and subsequent calls will fail. I will try to address in a separate PR.

I think that's what's reported in #161. There's some more discussion needed in that case.

@jlowin
Copy link
Collaborator Author

jlowin commented Dec 8, 2024

Ok cool, I have a proposal I'll add on that issue.

@samuelcolvin samuelcolvin enabled auto-merge (squash) December 8, 2024 17:11
@samuelcolvin
Copy link
Member

Thanks so much @jlowin.

@samuelcolvin samuelcolvin merged commit ff7015a into pydantic:main Dec 8, 2024
12 checks passed
@jlowin jlowin deleted the structured-history branch December 8, 2024 17:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Agents generate a broken message history with structured responses
2 participants