Validation with nested Pydantic models (ollama, llama3.1) #607

Closed
Ynn opened this issue Jan 3, 2025 · 8 comments

@Ynn

Ynn commented Jan 3, 2025

When using the pydantic_ai library with a nested BaseModel, an UnexpectedModelBehavior error occurs, despite the underlying model (e.g., ollama:llama3.1) being capable of handling the requested structure and providing valid output.

The example at https://ai.pydantic.dev/examples/pydantic-model/#running-the-example works as expected with ollama:llama3.1, but this slight modification, which adds a nested model, fails:

import os
from typing import cast

import logfire
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models import KnownModelName

logfire.configure(send_to_logfire='if-token-present')

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'ollama:llama3.1'))
print(f'Using model: {model}')
agent = Agent(model, result_type=MyModel)

if __name__ == '__main__':
    result = agent.run_sync('The windy city in the US of A.')
    print(result.data)
    print(result.usage())

I get this error:

UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation

It seems that the model fails to generate the expected nested structure.
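
A minimal sketch of what "the expected nested structure" means here, assuming pydantic v2 schema generation: the agent asks the model to fill MyModel's JSON schema, which references Country under $defs, so a valid response must return country as a JSON object rather than a string.

from pydantic import BaseModel

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

# Pydantic v2 places the nested Country model under "$defs" and references it
# from the "country" property; the model is expected to fill it as an object.
print(MyModel.model_json_schema())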

Validation of the Model's Capability:

To confirm that the underlying model (ollama:llama3.1) supports this functionality, the following custom implementation was tested:

import os
from typing import Type, TypeVar, cast

from ollama import Options, chat
from pydantic import BaseModel

from pydantic_ai.models import KnownModelName

T = TypeVar('T', bound=BaseModel)

class Agent:
    def __init__(self, model: str, result_type: Type[T], options: Options | None = None):
        # Strip the 'ollama:' prefix; the ollama client expects just the model name.
        self.model = model.split(":")[1]
        self.result_type = result_type
        self.options = options

    def run_sync(self, prompt: str) -> T:
        response = chat(
            messages=[
                {
                    'role': 'user',
                    'content': prompt
                }
            ],
            model=self.model,
            format=self.result_type.model_json_schema(),  # structured output: pass the model's JSON schema
            options=self.options
        )
        if response.message.content is None:
            raise Exception("No response from the model")
        else:
            o_data: str = response.message.content
            o: T = self.result_type.model_validate_json(o_data)
            return o

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'ollama:llama3.1'))
print(f'Using model: {model}')
agent = Agent(model, result_type=MyModel)

if __name__ == '__main__':
    result = agent.run_sync('The windy city in the US of A.')
    print(result)

This implementation returned the expected result without any errors:

city='Chicago' country=Country(name='United States')

Environment:

  • Python version: 3.10+
  • pydantic_ai[examples]>=0.0.17

Could I be misunderstanding or misusing the library?

@andrewdmalone
Contributor

andrewdmalone commented Jan 5, 2025

Your example works fine with Gemini 1.5 Flash for me but fails with llama3.2 via Ollama. Might be specific to Ollama.

@Ynn
Author

Ynn commented Jan 6, 2025

This is also my diagnosis. However, I am surprised that it works with Ollama's library but not with Pydantic AI. Nested structures are quite practical, and I must admit that this is what holds me back from delving deeper into Pydantic AI.

Should an issue be opened in Ollama, or is this something that can be resolved here?

@andrewdmalone
Contributor

andrewdmalone commented Jan 7, 2025

So I think I've figured it out - it looks like it has to do with how the Ollama API formats JSON responses. Here's my testing example:

model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'ollama:llama3.2'))
agent = Agent(model, result_type=MyModel, system_prompt="Return what city, state, and country are provided to you.")

with capture_run_messages() as messages:
    result = agent.run_sync(
        'Chicago, Illinois, USA'
    )
    print(result.data)
    print(messages)

Place a breakpoint here and look at the value of c.function.arguments in the debugger when running the example above, then compare the OpenAI value with the Ollama value. I see the following:

Running gpt-4o, I see:

'{"city_metadata":{"city":"Chicago","country":"USA","state":"Illinois"}}'

Running llama3.2 via Ollama, I see:

'{"city_metadata":"{\\"city\\": \\"Chicago\\", \\"state\\": \\"Illinois\\", \\"country\\": \\"USA\\"}"}'

The inner quotation marks (around the inner JSON object) and the double-escaped quotation marks appear only for nested models; single-layer models look the same between the two providers. Add a third layer of nesting and it gets even messier with Ollama, while OpenAI is still fine.
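
To see why that trips up validation, here is a small sketch using the simpler two-level MyModel from the first example (not the city_metadata model above): pydantic will not coerce a JSON-encoded string into a nested model field, so the Ollama-style arguments fail validation while the OpenAI-style arguments pass.

import json

from pydantic import BaseModel, ValidationError

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

# OpenAI-style tool arguments: the nested value is a JSON object.
ok_args = '{"city": "Chicago", "country": {"name": "USA"}}'
print(MyModel.model_validate_json(ok_args))

# Ollama-style tool arguments: the nested value is itself a JSON-encoded string.
bad_args = json.dumps({'city': 'Chicago', 'country': json.dumps({'name': 'USA'})})
try:
    MyModel.model_validate_json(bad_args)
except ValidationError as exc:
    print(exc)  # country: Input should be a valid dictionary or instance of Country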

Under the hood, Pydantic-AI uses the OpenAI Python library to interact with the Ollama model, so I suspect this is related to the OpenAI library not working perfectly with Ollama. I'm fiddling with some fixes; I think it would be reasonable to make an OllamaAgentModel as a subclass of OpenAIAgentModel and override the _process_response method to change how the JSON responses are parsed. It's also possible the Ollama API provides a solution for this, which would maybe justify writing an AgentModel and Model specifically for Ollama rather than reusing the OpenAI one.
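
A rough sketch of that idea (the helper name here is hypothetical, and the actual fix may look different): walk the decoded tool arguments and re-parse any values that arrive as JSON-encoded strings before validating them.

import json
from typing import Any

def unwrap_stringified_json(value: Any) -> Any:
    # Recursively re-parse values that arrive as JSON-encoded strings
    # instead of nested objects (hypothetical helper, for illustration only).
    if isinstance(value, str):
        stripped = value.strip()
        if stripped.startswith(('{', '[')):
            try:
                return unwrap_stringified_json(json.loads(stripped))
            except json.JSONDecodeError:
                return value
        return value
    if isinstance(value, dict):
        return {k: unwrap_stringified_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_stringified_json(v) for v in value]
    return value

# The double-encoded Ollama arguments become an ordinary nested dict:
raw = '{"city_metadata": "{\\"city\\": \\"Chicago\\", \\"state\\": \\"Illinois\\", \\"country\\": \\"USA\\"}"}'
print(unwrap_stringified_json(json.loads(raw)))
# -> {'city_metadata': {'city': 'Chicago', 'state': 'Illinois', 'country': 'USA'}}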

@andrewdmalone
Contributor

andrewdmalone commented Jan 7, 2025

Opened a PR (#621) with a fix. Would love some feedback.

@samuelcolvin
Member

PLEASE try to look for similar issues before creating new issues!

This is a duplicate of #242 as far as I can tell.

@andrewdmalone
Contributor

andrewdmalone commented Jan 7, 2025

I think it's a partial duplicate. #242 seems more focused on the fact that Ollama doesn't always decide to use the tool when provided a response model, whereas this issue is about Ollama returning a response that adheres to the provided response model but the OpenAI-based model class not being set up to decode it properly.

@Ynn
Author

Ynn commented Jan 7, 2025

@samuelcolvin Apologies, I guess with the number of open issues, I focused more on the nested models part since everything else was working fine in my case. I didn’t really think this might tie into deeper issues with Ollama. I’ll take a closer look at #242 and see how it connects.

@chrisaww

chrisaww commented Jan 8, 2025

Seems similar to my issue #639, which was incorrectly closed.
I notice an equivalent async issue in LangChain: it has similar problems and recommends updating its Ollama class, as discussed in langchain-ai/langchain#13306.
