Validation with nested Pydantic models (ollama, llama3.1) #607

Closed
Ynn opened this issue Jan 3, 2025 · 8 comments

@Ynn

Ynn commented Jan 3, 2025

When using the pydantic_ai library with a nested BaseModel, an UnexpectedModelBehavior error occurs, despite the underlying model (e.g., ollama:llama3.1) being capable of handling the requested structure and providing valid output.

The example at https://ai.pydantic.dev/examples/pydantic-model/#running-the-example works as expected with ollama:llama3.1, but this slight modification, which adds a nested model, fails:

import os
from typing import cast

import logfire
from pydantic import BaseModel

from pydantic_ai import Agent
from pydantic_ai.models import KnownModelName

logfire.configure(send_to_logfire='if-token-present')

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'ollama:llama3.1'))
print(f'Using model: {model}')
agent = Agent(model, result_type=MyModel)

if __name__ == '__main__':
    result = agent.run_sync('The windy city in the US of A.')
    print(result.data)
    print(result.usage())

I get this error:

UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation

It seems that the model fails to generate the expected nested structure.
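
A minimal sketch of what "the expected nested structure" means here, assuming pydantic v2 schema generation: the agent asks the model to fill MyModel's JSON schema, which references Country under $defs, so a valid response must return country as a JSON object rather than a string.

from pydantic import BaseModel

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

# Pydantic v2 places the nested Country model under "$defs" and references it
# from the "country" property; the model is expected to fill it as an object.
print(MyModel.model_json_schema())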

Validation of the Model's Capability:

To confirm that the underlying model (ollama:llama3.1) supports this functionality, the following custom implementation was tested:

import os
from typing import Type, TypeVar, cast

from ollama import Options, chat
from pydantic import BaseModel

from pydantic_ai.models import KnownModelName

T = TypeVar('T', bound=BaseModel)

class Agent:
    def __init__(self, model: str, result_type: Type[T], options: Options | None = None):
        # Strip the 'ollama:' prefix; the ollama client expects just the model name.
        self.model = model.split(":")[1]
        self.result_type = result_type
        self.options = options

    def run_sync(self, prompt: str) -> T:
        response = chat(
            messages=[
                {
                    'role': 'user',
                    'content': prompt
                }
            ],
            model=self.model,
            format=self.result_type.model_json_schema(),  # structured output: pass the model's JSON schema
            options=self.options
        )
        if response.message.content is None:
            raise Exception("No response from the model")
        else:
            o_data: str = response.message.content
            o: T = self.result_type.model_validate_json(o_data)
            return o

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'ollama:llama3.1'))
print(f'Using model: {model}')
agent = Agent(model, result_type=MyModel)

if __name__ == '__main__':
    result = agent.run_sync('The windy city in the US of A.')
    print(result)

This implementation returned the expected result without any errors:

city='Chicago' country=Country(name='United States')

Environment:

  • Python version: 3.10+
  • pydantic_ai[examples]>=0.0.17

Could I be misunderstanding or misusing the library?

@andrewdmalone
Contributor

andrewdmalone commented Jan 5, 2025

Your example works fine with Gemini 1.5 Flash for me but fails with llama3.2 via Ollama. Might be specific to Ollama.

@Ynn
Author

Ynn commented Jan 6, 2025

This is also my diagnosis. However, I am surprised that it works with Ollama's library but not with Pydantic AI. Nested structures are quite practical, and I must admit that this is what holds me back from delving deeper into Pydantic AI.

Should an issue be opened in Ollama, or is this something that can be resolved here?

@andrewdmalone
Contributor

andrewdmalone commented Jan 7, 2025

So I think I've figured it out - it looks like it has to do with how the Ollama API formats JSON responses. Here's my testing example:

model = cast(KnownModelName, os.getenv('PYDANTIC_AI_MODEL', 'ollama:llama3.2'))
agent = Agent(model, result_type=MyModel, system_prompt="Return what city, state, and country are provided to you.")

with capture_run_messages() as messages:
    result = agent.run_sync(
        'Chicago, Illinois, USA'
    )
    print(result.data)
    print(messages)

Place a breakpoint here and look at the value of c.function.arguments in the debugger when running the example above, then compare the OpenAI value with the Ollama value. I see the following:

Running gpt-4o, I see:

'{"city_metadata":{"city":"Chicago","country":"USA","state":"Illinois"}}'

Running llama3.2 via Ollama, I see:

'{"city_metadata":"{\\"city\\": \\"Chicago\\", \\"state\\": \\"Illinois\\", \\"country\\": \\"USA\\"}"}'

The inner quotation marks (around the inner JSON object) and the double-escaped quotation marks appear only for nested models; single-layer models look the same between the two providers. Add a third layer of nesting and it gets even messier with Ollama, while OpenAI is still fine.
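
To see why that trips up validation, here is a small sketch using the simpler two-level MyModel from the first example (not the city_metadata model above): pydantic will not coerce a JSON-encoded string into a nested model field, so the Ollama-style arguments fail validation while the OpenAI-style arguments pass.

import json

from pydantic import BaseModel, ValidationError

class Country(BaseModel):
    name: str

class MyModel(BaseModel):
    city: str
    country: Country

# OpenAI-style tool arguments: the nested value is a JSON object.
ok_args = '{"city": "Chicago", "country": {"name": "USA"}}'
print(MyModel.model_validate_json(ok_args))

# Ollama-style tool arguments: the nested value is itself a JSON-encoded string.
bad_args = json.dumps({'city': 'Chicago', 'country': json.dumps({'name': 'USA'})})
try:
    MyModel.model_validate_json(bad_args)
except ValidationError as exc:
    print(exc)  # country: Input should be a valid dictionary or instance of Country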

Under the hood, Pydantic-AI uses the OpenAI Python library to interact with the Ollama model, so I suspect this is related to the OpenAI library not working perfectly with Ollama. I'm fiddling with some fixes; I think it would be reasonable to make an OllamaAgentModel as a subclass of OpenAIAgentModel and override the _process_response method to change how the JSON responses are parsed. It's also possible the Ollama API provides a solution for this, which would maybe justify writing an AgentModel and Model specifically for Ollama rather than reusing the OpenAI one.
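
A rough sketch of that idea (the helper name here is hypothetical, and the actual fix may look different): walk the decoded tool arguments and re-parse any values that arrive as JSON-encoded strings before validating them.

import json
from typing import Any

def unwrap_stringified_json(value: Any) -> Any:
    # Recursively re-parse values that arrive as JSON-encoded strings
    # instead of nested objects (hypothetical helper, for illustration only).
    if isinstance(value, str):
        stripped = value.strip()
        if stripped.startswith(('{', '[')):
            try:
                return unwrap_stringified_json(json.loads(stripped))
            except json.JSONDecodeError:
                return value
        return value
    if isinstance(value, dict):
        return {k: unwrap_stringified_json(v) for k, v in value.items()}
    if isinstance(value, list):
        return [unwrap_stringified_json(v) for v in value]
    return value

# The double-encoded Ollama arguments become an ordinary nested dict:
raw = '{"city_metadata": "{\\"city\\": \\"Chicago\\", \\"state\\": \\"Illinois\\", \\"country\\": \\"USA\\"}"}'
print(unwrap_stringified_json(json.loads(raw)))
# -> {'city_metadata': {'city': 'Chicago', 'state': 'Illinois', 'country': 'USA'}}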

@andrewdmalone
Contributor

andrewdmalone commented Jan 7, 2025

Opened a PR (#621) with a fix. Would love some feedback.

@samuelcolvin
Member

PLEASE try to look for similar issues before creating new issues!

This is a duplicate of #242 as far as I can tell.

@andrewdmalone
Contributor

andrewdmalone commented Jan 7, 2025

I think it's a partial duplicate. #242 seems more focused on the fact that Ollama doesn't always decide to use the tool when provided a response model, whereas this issue is about Ollama returning a response that adheres to the provided response model but the OpenAI-based model class not being set up to decode it properly.

@Ynn
Author

Ynn commented Jan 7, 2025

@samuelcolvin Apologies, I guess with the number of open issues, I focused more on the nested models part since everything else was working fine in my case. I didn’t really think this might tie into deeper issues with Ollama. I’ll take a closer look at #242 and see how it connects.

@chrisaww

chrisaww commented Jan 8, 2025

Seems similar to my issue #639, which was incorrectly closed.
I notice an equivalent async issue in LangChain: it has similar problems and recommends updating its Ollama class, as discussed in langchain-ai/langchain#13306.
