
Ollama: Stream Always Fails #667

Closed
YanSte opened this issue Jan 13, 2025 · 9 comments
YanSte commented Jan 13, 2025

Hi,

Streaming a structured result with Ollama always fails.

This happens with any model that supports tools.

Code:

from datetime import date
from typing_extensions import TypedDict
from pydantic_ai import Agent


class UserProfile(TypedDict, total=False):
    name: str
    dob: date
    bio: str


# Any model that supports tools
agent = Agent('ollama:llama3.2', result_type=UserProfile)


async def main():
    user_input = 'My name is Ben, I was born on January 28th 1990, I like the chain the dog and the pyramid.'
    async with agent.run_stream(user_input) as result:
        async for message, last in result.stream():
            ...

Error:

09:22:36.733   preparing model and tools run_step=1
09:22:36.734   model request run_step=1
09:22:38.064   handle model response
09:22:38.066   preparing model and tools run_step=2
09:22:38.067   model request run_step=2
09:22:39.605   handle model response
---> 21 async with agent.run_stream(user_input) as result:
     22     async for message in result.stream():
     23         print(message)

File ~/.pyenv/versions/3.12.0/lib/python3.12/contextlib.py:204, in _AsyncGeneratorContextManager.__aenter__(self)
    202 del self.args, self.kwds, self.func
    203 try:
--> 204     return await anext(self.gen)
    205 except StopAsyncIteration:
    206     raise RuntimeError("generator didn't yield") from None

File ~:538, in Agent.run_stream(self, user_prompt, result_type, message_history, model, deps, model_settings, usage_limits, usage, infer_name)
    535 model_req_span.__exit__(None, None, None)
    537 with _logfire.span('handle model response') as handle_span:
--> 538     maybe_final_result = await self._handle_streamed_model_response(
    539         model_response, run_context, result_schema
    540     )
    542     # Check if we got a final result
    543     if isinstance(maybe_final_result, _MarkFinalResult):

File ~:1202, in Agent._handle_streamed_model_response(self, model_response, run_context, result_schema)
   1200     return _MarkFinalResult(model_response, None)
   1201 else:
-> 1202     self._incr_result_retry(run_context)
   1203     response = _messages.RetryPromptPart(
   1204         content='Plain text responses are not permitted, please call one of the functions instead.',
   1205     )
   1206     # stream the response, so usage is correct

File ~:1270, in Agent._incr_result_retry(self, run_context)
   1268 run_context.retry += 1
   1269 if run_context.retry > self._max_result_retries:
-> 1270     raise exceptions.UnexpectedModelBehavior(
   1271         f'Exceeded maximum retries ({self._max_result_retries}) for result validation'
   1272     )
@YanSte YanSte changed the title Ollama: Impossible to stream Ollama: Stream with Ollama Always Fails Jan 13, 2025
@YanSte YanSte changed the title Ollama: Stream with Ollama Always Fails Ollama: Stream Always Fails Jan 13, 2025

SiddarthNarayanan01 commented Jan 15, 2025

The issue doesn't seem to be with the stream itself, but rather that the model is not outputting the correct UserProfile result type, as shown by the error message: "Exceeded maximum retries for result validation".

A couple of things to note:

  1. Your dob field should be JSON-serializable and deserializable, so make it a separate class.
  2. Try giving a system prompt that tells the model it should extract the data from the input query.
  3. Try annotating your result type with descriptions (see below).

To annotate the result type, you can do the following:

from pydantic import BaseModel, Field

class DOB(BaseModel):
    year: int
    month: int = Field(ge=1, le=12)
    day: int = Field(ge=1, le=31) # Of course would still allow some invalid dates but that can be validated later

class UserProfile(BaseModel):
    name: str = Field(description="user's name")
    dob: DOB = Field(description="user's date of birth, output json format {year: number, month: number, day: number}")
    bio: str = Field(description="Miscellaneous user bio information")

Hope this helps!


SiddarthNarayanan01 commented Jan 15, 2025

Ahh, I was looking at the documentation and I think I realize what's going on. The reason the example you shared would work is that you can construct a datetime.date object from an ISO-formatted string, which is likely what Pydantic is doing under the hood. If you don't mention that the date should be in ISO format, the model will likely try to output a locale-specific date, and the parsing fails.

So the best solution is likely to mention in the annotation of your dob field that it should output "ISO formatted dates":

class UserProfile(BaseModel):
    name: str = Field(description="user's name")
    dob: date = Field(description="user's date of birth, ISO format")
    bio: str = Field(description="Miscellaneous user bio information")
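As a quick sanity check, here is a stdlib-only sketch (independent of pydantic and pydantic_ai) of why the ISO hint matters: `date.fromisoformat` accepts ISO 8601 strings but rejects locale-specific ones, mirroring the coercion pydantic applies to a `date` field.

```python
from datetime import date

# ISO 8601 strings parse cleanly, which is the coercion pydantic
# relies on when validating a `date` field.
parsed = date.fromisoformat("1990-01-28")
print(parsed)  # 1990-01-28

# A locale-specific string like the one in the prompt is rejected,
# which is why the model must be told to emit ISO-formatted dates.
try:
    date.fromisoformat("January 28th 1990")
    iso_required = False
except ValueError:
    iso_required = True
```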


YanSte commented Jan 15, 2025

Thanks, @SiddarthNarayanan01, for getting back to me.

That was just an example.

Here's another one with a similar issue. Neither BaseModel nor TypedDict works:

class UserProfile(BaseModel):
    name: str
    subname: str

async def main():
    user_input = '...'
    async with agent.run_stream(user_input) as result:
        async for message, last in result.stream():
            ...

I've tried everything, but streaming with Ollama doesn't seem to work, even with the examples from Pydantic AI.

The run method, however, works well.


YanSte commented Jan 15, 2025

The error comes from _handle_streamed_model_response, which receives a models.StreamTextResponse while self._allow_text_result returns False.
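A stripped-down sketch of that control flow (illustrative only, not the actual pydantic_ai source — the function name and parameters here are made up for the illustration): when a structured result_type forbids plain text, every text response increments the retry counter until the "Exceeded maximum retries" error is raised.

```python
class UnexpectedModelBehavior(Exception):
    """Stand-in for pydantic_ai's exceptions.UnexpectedModelBehavior."""

def handle_streamed_response(is_text: bool, allow_text_result: bool,
                             retry: int, max_retries: int) -> int:
    """Return the updated retry count; raise once retries are exhausted."""
    if is_text and not allow_text_result:
        retry += 1
        if retry > max_retries:
            raise UnexpectedModelBehavior(
                f'Exceeded maximum retries ({max_retries}) for result validation'
            )
    return retry

# With a structured result_type, allow_text_result is False, so if Ollama
# keeps answering with plain text, each run step consumes one retry.
retry = handle_streamed_response(True, False, 0, 1)  # first text reply: retry -> 1
```

This matches the traceback above: run_step=1 and run_step=2 each hit the text-response branch, and the second one pushes the counter past the limit.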


YanSte commented Jan 15, 2025

Only streaming a structured response doesn't work.

@SiddarthNarayanan01

Ah, I misunderstood your question, sorry about that! Interestingly enough, I too have issues getting streaming with Ollama to work, but instead of the errors you are facing, I get the entire response in one go (no chunking).

Also, the moment the model calls a tool, I get an empty response (still no errors). Not sure why this is happening. I'll look into it further.


YanSte commented Jan 16, 2025

Can you reproduce my bug?

@SiddarthNarayanan01

Yep, I can reproduce your bug when running with structured output. Not sure if this is the source of the bug, but when a model calls a tool, the TextPart is empty. In streaming mode it seems to just stop there and return that empty string. That's probably why you get the validation error: the model returns an empty string where a JSON structure is required.
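A minimal illustration of that failure mode, using stdlib json as a stand-in for the schema validation (the helper name is hypothetical): an empty TextPart carries nothing that can be parsed into the expected structure, so validation fails immediately and a retry is triggered.

```python
import json

def parse_structured(raw: str) -> dict:
    """Stand-in for result validation: the raw model output must be JSON."""
    return json.loads(raw)

# A well-formed structured reply validates fine.
profile = parse_structured('{"name": "Ben", "subname": "X"}')

# An empty streamed TextPart fails right away, consuming a retry.
try:
    parse_structured("")
    empty_ok = True
except json.JSONDecodeError:
    empty_ok = False
```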


YanSte commented Jan 16, 2025

I’ve decided to stop using Ollama and have returned to LMStudio. I’ve also submitted a pull request: #705.
