result_type as List #523
Comments
I was getting a similar error. Final working code:

```python
from pydantic import BaseModel
from pydantic_ai import Agent
from pydantic_ai.models.ollama import OllamaModel

ollama_model = OllamaModel(
    model_name="qwen2.5-coder:14b",
)


class Pet(BaseModel):
    name: str | None
    animal_type: str | None
    age: int | None
    color: str | None
    favorite_toy: str | None


class PetList(BaseModel):
    pets: list[Pet]


# A system prompt to guide the model
SYSTEM_PROMPT = """
You are a helper that extracts pet information from text and formats it as a list.
For each pet mentioned, extract:
- name
- animal type
- age
- color (if mentioned)
- favorite toy (if mentioned)
"""

agent3 = Agent(model=ollama_model, result_type=PetList, retries=3, system_prompt=SYSTEM_PROMPT)

result3 = agent3.run_sync(
    'I have two pets. A cat named Luna who is 5 years old and loves playing with yarn. '
    'She has grey fur. I also have a 2 year old black cat named Loki who loves tennis balls.'
)
pet_data = result3.data
print(pet_data)
```
Thanks for the answer @IsaaacD! Marking as resolved :)
@IsaaacD just wanted to say thanks. Also, I can confirm this code still fails with llama3.2:3b but works with qwen2.5-coder:14b, as you noted. Appreciate your response, the update, and the usage approach with the system prompt too. I have no issue using Qwen for this work. Still, the Ollama library will succeed with llama3.2:3b, so there must be some underlying combination of model and support for functions or tool chains in here somewhere that accounts for the difference in behavior with the llama3.2 model between the Ollama library and pydantic-ai.
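The debug traces later in this thread show what pydantic-ai actually asks the model to do: call a tool named `final_result` whose arguments must validate against `PetList`. A model that is weak at tool/function calling has to emit JSON matching that schema on its own, which you can inspect directly. A sketch (the schema layout below is plain Pydantic's; pydantic-ai may wrap it slightly differently when registering the tool):

```python
import json

from pydantic import BaseModel


class Pet(BaseModel):
    name: str | None
    animal_type: str | None
    age: int | None
    color: str | None
    favorite_toy: str | None


class PetList(BaseModel):
    pets: list[Pet]


# The argument schema the model must satisfy when it calls `final_result`.
print(json.dumps(PetList.model_json_schema(), indent=2))
```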
I looked into it a bit more and added some debug options to trace what llama3.2 is doing. It appears llama3.2 is returning something that looks like valid JSON, but PydanticAI is rejecting it or parsing it wrong? I'll put up the changes I made and list the output afterwards. I changed the top to import these and add debugging, and switched back to llama3.2:

```python
from devtools import debug
import logfire
from logging import basicConfig
from langchain_core.globals import set_debug

set_debug(True)
logfire.configure(send_to_logfire='if-token-present')
logfire.ConsoleOptions.min_log_level = 'trace'
logfire.ConsoleOptions.verbose = True
basicConfig(handlers=[logfire.LogfireLoggingHandler()])

ollama_model = OllamaModel(
    model_name="llama3.2:latest",
)
```

Then around the call I added a `try`/`except`/`finally`:

```python
agent3 = Agent(model=ollama_model, result_type=PetList, retries=3, system_prompt=SYSTEM_PROMPT)

try:
    result3 = agent3.run_sync(
        'I have two pets. A cat named Luna who is 5 years old and loves playing with yarn. '
        'She has grey fur. I also have a 2 year old black cat named Loki who loves tennis balls.'
    )
    pet_data = result3.data
    print(pet_data)
except Exception:
    print("Error")
finally:
    debug(agent3.last_run_messages)
```

And then finally the output:

```
pet_examplel.py:41 <module>
agent3.last_run_messages: [
SystemPrompt(
content=(
'\n'
'You are a helper that extracts pet information from text and formats it as a list.\n'
'For each pet mentioned, extract:\n'
'- name\n'
'- animal type\n'
'- age\n'
'- color (if mentioned)\n'
'- favorite toy (if mentioned)\n'
),
role='system',
),
UserPrompt(
content=(
'I have two pets. A cat named Luna who is 5 years old and loves playing with yarn. She has grey fur. I'
' also have a 2 year old black cat named Loki who loves tennis balls.'
),
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 5, 637176, tzinfo=datetime.timezone.utc),
role='user',
),
ModelStructuredResponse(
calls=[
ToolCall(
tool_name='final_result',
args=ArgsJson(
args_json=(
'{"pets":"[{\\"name\\": \\"Luna\\", \\"animal_type\\": \\"cat\\", \\"age\\": \\"5\\", \\"color\\": \\"gre'
'y\\", \\"favorite toy\\": \\"yarn\\"}, {\\"name\\": \\"Loki\\", \\"animal_type\\": \\"cat\\", \\"age\\":'
' \\"2\\", \\"color\\": \\"black\\", \\"favorite toy\\": \\"tennis balls\\"}]"}'
),
),
tool_id='call_kiakrf40',
),
],
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 6, tzinfo=datetime.timezone.utc),
role='model-structured-response',
),
RetryPrompt(
content=[
{
'type': 'list_type',
'loc': ('pets',),
'msg': 'Input should be a valid array',
'input': (
'[{"name": "Luna", "animal_type": "cat", "age": "5", "color": "grey", "favorite toy": "yarn"},'
' {"name": "Loki", "animal_type": "cat", "age": "2", "color": "black", "favorite toy": "tennis'
' balls"}]'
),
},
],
tool_name='final_result',
tool_id='call_kiakrf40',
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 7, 871995, tzinfo=datetime.timezone.utc),
role='retry-prompt',
),
ModelTextResponse(
content=(
'Here is the answer to your question:\n'
'\n'
'**Your Pets:**\n'
'\n'
'1. Luna (5 years old) - grey cat\n'
'\t* Favorite Toy: Yarn\n'
'2. Loki (2 years old) - black cat\n'
'\t* Favorite Toy: Tennis Balls'
),
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 8, tzinfo=datetime.timezone.utc),
role='model-text-response',
),
RetryPrompt(
content='Plain text responses are not permitted, please call one of the functions instead.',
tool_name=None,
tool_id=None,
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 8, 287356, tzinfo=datetime.timezone.utc),
role='retry-prompt',
),
ModelStructuredResponse(
calls=[
ToolCall(
tool_name='final_result',
args=ArgsJson(
args_json=(
'{"pets":"[{\'name\': \'Luna\', \'animal_type\': \'cat\', \'age\': \'5\', \'color\': \'grey\', \'favorite t'
"oy': 'yarn'}, {'name': 'Loki', 'animal_type': 'cat', 'age': '2', 'color': 'black', 'favor"
'ite toy\': \'tennis balls\'}]"}'
),
),
tool_id='call_ssu65p8z',
),
],
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 8, tzinfo=datetime.timezone.utc),
role='model-structured-response',
),
RetryPrompt(
content=[
{
'type': 'list_type',
'loc': ('pets',),
'msg': 'Input should be a valid array',
'input': (
"[{'name': 'Luna', 'animal_type': 'cat', 'age': '5', 'color': 'grey', 'favorite toy': 'yarn'},"
" {'name': 'Loki', 'animal_type': 'cat', 'age': '2', 'color': 'black', 'favorite toy': 'tennis"
" balls'}]"
),
},
],
tool_name='final_result',
tool_id='call_ssu65p8z',
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 8, 905979, tzinfo=datetime.timezone.utc),
role='retry-prompt',
),
ModelTextResponse(
content=(
'def format_pet_info(pets):\n'
' formatted_pets = []\n'
' for pet in pets:\n'
' name = pet["name"]\n'
' animal_type = pet["animal_type"]\n'
' age = pet["age"]\n'
' color = pet.get("color")\n'
' favorite_toys = f"{pet.get(\'favorite toy\', \'No favorite toy mentioned\')}" if pet.get(\'favorit'
'e toy\') else "No favorite toy mentioned"\n'
' \n'
' formatted_pet = f"{name} ({age} years old)"\n'
' if color:\n'
' formatted_pet += f" - {color}"\n'
' \n'
' formatted_pets.append(formatted_pet)\n'
' \n'
' return formatted_pets\n'
'\n'
'pets = [\n'
" {'name': 'Luna', 'animal_type': 'cat', 'age': '5', 'color': 'grey', 'favorite toy': 'yarn'},\n"
" {'name': 'Loki', 'animal_type': 'cat', 'age': '2', 'color': 'black', 'favorite toy': 'tennis ball"
"s'}\n"
']\n'
'\n'
'print(format_pet_info(pets))'
),
timestamp=datetime.datetime(2024, 12, 23, 14, 55, 10, tzinfo=datetime.timezone.utc),
role='model-text-response',
),
] (list) len=6
```

Note: to capture this output I did have to change my VS Code launch configuration to the following:

```json
{
"version": "0.2.0",
"configurations": [
{
"name": "testpython.py",
"type": "debugpy",
"request": "launch",
"program": "${file}",
"console": "integratedTerminal",
"env": {
"PYTHONUNBUFFERED": "0"
}
}
],
}
```

It looks like the last couple of prompts should have been proper JSON (the terminal output is word-wrapped, so I don't think that's the issue). It might be something in the JSON parser or in the escape characters llama3.2 is producing. I'll run this again right now with Qwen to compare.
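The failing pattern is visible in the first `ToolCall` above: llama3.2 sent `{"pets": "[...]"}`, the list JSON-encoded a second time into a string, so Pydantic's `list_type` check correctly rejects it. One possible workaround (a sketch, assuming Pydantic v2; not something tested in this thread) is a `mode='before'` validator that unwraps a stringified list before validation:

```python
import json

from pydantic import BaseModel, field_validator


class Pet(BaseModel):
    name: str | None
    animal_type: str | None
    age: int | None
    color: str | None
    favorite_toy: str | None


class PetList(BaseModel):
    pets: list[Pet]

    @field_validator('pets', mode='before')
    @classmethod
    def unwrap_stringified_list(cls, v):
        # llama3.2 produced {"pets": "[{...}, {...}]"}: the list JSON-encoded
        # into a string. Decode it so the normal list[Pet] validation can run
        # (it may still fail on other mismatches, e.g. the "favorite toy" key
        # not matching the favorite_toy field).
        if isinstance(v, str):
            return json.loads(v)
        return v
```

Note this only helps when the inner string is valid JSON; llama3.2's second attempt used Python-style single quotes, which `json.loads` would still reject.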
This is the resulting output from Qwen:

```
pet_examplel.py:49 <module>
agent3.last_run_messages: [
SystemPrompt(
content=(
'\n'
'You are a helper that extracts pet information from text and formats it as a list.\n'
'For each pet mentioned, extract:\n'
'- name\n'
'- animal type\n'
'- age\n'
'- color (if mentioned)\n'
'- favorite toy (if mentioned)\n'
),
role='system',
),
UserPrompt(
content=(
'I have two pets. A cat named Luna who is 5 years old and loves playing with yarn. She has grey fur. I'
' also have a 2 year old black cat named Loki who loves tennis balls.'
),
timestamp=datetime.datetime(2024, 12, 23, 15, 10, 0, 191338, tzinfo=datetime.timezone.utc),
role='user',
),
ModelStructuredResponse(
calls=[
ToolCall(
tool_name='final_result',
args=ArgsJson(
args_json=(
'{"pets":[{"age":"5 years old","animal_type":"cat","color":"grey","favorite_toy":"yarn","n'
'ame":"Luna"},{"age":"2 years old","animal_type":"cat","color":"black","favorite_toy":"ten'
'nis balls","name":"Loki"}]}'
),
),
tool_id='call_1mb6236j',
),
],
timestamp=datetime.datetime(2024, 12, 23, 15, 10, 35, tzinfo=datetime.timezone.utc),
role='model-structured-response',
),
RetryPrompt(
content=[
{
'type': 'int_parsing',
'loc': (
'pets',
0,
'age',
),
'msg': 'Input should be a valid integer, unable to parse string as an integer',
'input': '5 years old',
},
{
'type': 'int_parsing',
'loc': (
'pets',
1,
'age',
),
'msg': 'Input should be a valid integer, unable to parse string as an integer',
'input': '2 years old',
},
],
tool_name='final_result',
tool_id='call_1mb6236j',
timestamp=datetime.datetime(2024, 12, 23, 15, 10, 37, 427123, tzinfo=datetime.timezone.utc),
role='retry-prompt',
),
ModelStructuredResponse(
calls=[
ToolCall(
tool_name='final_result',
args=ArgsJson(
args_json=(
'{"pets":[{"age":5,"animal_type":"cat","color":"grey","favorite_toy":"yarn","name":"Luna"}'
',{"age":2,"animal_type":"cat","color":"black","favorite_toy":"tennis balls","name":"Loki"'
'}]}'
),
),
tool_id='call_cb69k61m',
),
],
timestamp=datetime.datetime(2024, 12, 23, 15, 10, 41, tzinfo=datetime.timezone.utc),
role='model-structured-response',
),
ToolReturn(
tool_name='final_result',
content='Final result processed.',
tool_id='call_cb69k61m',
timestamp=datetime.datetime(2024, 12, 23, 15, 10, 41, 200655, tzinfo=datetime.timezone.utc),
role='tool-return',
),
] (list) len=6
```

The only main difference I see is that llama3.2 gives the `pets` value as a JSON string (`"pets": "[...]"`), while Qwen gives an actual array (`"pets": [...]`).

EDIT: I tried with changing
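Qwen's first attempt also shows a recoverable near-miss: it sent `"age": "5 years old"`, which fails `int_parsing` and costs a retry. If you would rather accept such strings up front, a hedged option (again a sketch, assuming Pydantic v2) is a `before` validator on `age` that extracts the integer:

```python
import re

from pydantic import BaseModel, field_validator


class Pet(BaseModel):
    name: str | None
    animal_type: str | None
    age: int | None
    color: str | None
    favorite_toy: str | None

    @field_validator('age', mode='before')
    @classmethod
    def coerce_age(cls, v):
        # Accept strings like "5 years old" by extracting the first integer.
        if isinstance(v, str):
            match = re.search(r'\d+', v)
            return int(match.group()) if match else None
        return v
```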
@IsaaacD very nice investigation. Thanks! I'm new to the use of LLMs in functions and workflow approaches. Is it true that models work best in these cases when trained or fine-tuned from the start to work in this manner? Is combining that with good prompt engineering what is needed to return results that parse into pydantic data structures? I'll dig around, but I assume there is a way in pydantic to catch that a data struct wasn't fully populated after n tries. Or maybe it's just a try / except. Either way, appreciate your CSI skills here. I'll use those to interrogate future events for sure!
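On catching a result that never validates: pydantic-ai raises once `retries` is exhausted, so a plain try / except does work. A minimal sketch, assuming the exception surfaced is `pydantic_ai.exceptions.UnexpectedModelBehavior` (check your installed version's docs for the exact class):

```python
from pydantic_ai.exceptions import UnexpectedModelBehavior

try:
    result3 = agent3.run_sync('I have two pets. A cat named Luna ...')
    print(result3.data)
except UnexpectedModelBehavior as exc:
    # Raised when the agent gives up, e.g. the model never produced a
    # valid PetList within `retries`. agent3.last_run_messages shows why.
    print(f'No valid PetList after retries: {exc}')
```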
@fils I'm not likely the person to ask about LLMs, just a humble software developer. I do believe the training set matters, and Qwen was made to read and write code, so it's what I use for my tasks.

It's very weird though. I found a Jupyter notebook (can't find the source now, I'm on my phone, but I'll share it later) that was trying to use Grok for natural-language web scraping. I rewrote it with Qwen 2.5 and Python (outside Jupyter) and was having issues with it in code. I had one scenario where I fed the data to Qwen in a terminal with ollama and it spit out the results 100% accurate with one prompt, but that happened only once. I tried recreating that in code and then in the terminal afterwards and didn't have any success. That's where I came here to see about wrapping it in Pydantic AI and saw you were experiencing something similar. I didn't know if it was a downstream dependency, but it doesn't appear so.

So, long story short: training data matters. You won't generate PNGs with most LLMs, and it makes sense because they weren't trained on them. I think code is of a similar vein; if llama3.2 didn't have much code or JSON in its training data, then it makes sense that it struggles with it.

EDIT: Here's the notebook link; it's the last section, where it was trying to parse the cars into a table: https://github.com/curiousily/AI-Bootcamp/blob/master/20.scraping-with-llm.ipynb
I was just reading the issues on the Pydantic AI repo and ran into the issue that @fils listed in his initial post; I didn't see it until I reached the bottom of the thread and noticed his name. I think this is likely what's going on, so I'm not sure there's much to expect for all use cases until something is implemented for Ollama models.
I don't know if this is related to #242 or not.

I am trying to replicate this from the Ollama examples (ref: https://ollama.com/blog/structured-outputs).

In pydantic-ai I try passing `PetList` to the `result_type`. This will fail. If I just pass `Pet`, i.e. `result_type=Pet`, it will sometimes work (getting only 1 cat, of course) but also fail sometimes. Any guidance on how to address this would be appreciated.
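For comparison, the plain-Ollama version from that blog post looks roughly like the sketch below (hedged: adapted to the `Pet`/`PetList` models in this thread; `ollama.chat` accepts a JSON schema via `format` in ollama-python 0.4+, and response attributes may differ by version). Ollama constrains the model's decoding to the schema, rather than relying on a tool call as pydantic-ai does:

```python
from ollama import chat
from pydantic import BaseModel


class Pet(BaseModel):
    name: str | None
    animal_type: str | None
    age: int | None
    color: str | None
    favorite_toy: str | None


class PetList(BaseModel):
    pets: list[Pet]


response = chat(
    model='llama3.2:3b',
    messages=[{
        'role': 'user',
        'content': 'I have two pets. A cat named Luna who is 5 years old and '
                   'loves playing with yarn. She has grey fur. I also have a '
                   '2 year old black cat named Loki who loves tennis balls.',
    }],
    format=PetList.model_json_schema(),  # constrain decoding to this schema
)
pets = PetList.model_validate_json(response.message.content)
print(pets)
```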