@Kamal-Moha
Contributor

This PR fixes an issue where the Google model fails to analyze or interpret JSON file links passed via DocumentUrl.

import os

from pydantic_ai import Agent, DocumentUrl
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

provider = GoogleProvider(api_key=os.getenv('GOOGLE_API_KEY'))
model = GoogleModel('gemini-2.5-pro', provider=provider)
agent = Agent(model=model)

result = agent.run_sync(
    [
        'What is the main content of this document?',
        DocumentUrl(url='https://storage.googleapis.com/bhadala-test/transcript_playground-22xj-NR8y_20250925_091155.json'),
    ]
)
print(result.output)
#> This document is the technical report introducing Gemini 1.5, Google's latest large language model...

For example, the code above returns a 500 Server Error from Google when it is given a link to a JSON file.

This PR solves this issue. I have taken inspiration from #2851 when implementing this.

or media_type in ('application/x-yaml', 'application/yaml')
)
@staticmethod
def _inline_text_file_part(text: str, *, media_type: str, identifier: str) -> ChatCompletionContentPartTextParam:
Collaborator

We should use this method

Contributor Author

In my latest commit, I have removed this method because the Google model doesn't need it.

def __init__(self, num):
    self.num = num

def multiply(self):
    return self.num * 3
Collaborator

This seems unrelated

Contributor Author

Removed

if item.vendor_metadata:
    part_dict['video_metadata'] = cast(VideoMetadataDict, item.vendor_metadata)
content.append(part_dict)
if self._is_text_like_media_type(item.media_type):
Collaborator

We need tests for this behavior like we have in test_openai.py

Contributor Author

See the function test_google_model_json_document_url_input in test_google.py. That should cover it.

Collaborator

We need to use the same _inline_text_file_part we use in OpenAI, so that the text is properly formatted as representing a file.

I suggest moving it to a method on BinaryContent that returns the text with the fencing.

_is_text_like_media_type can become a method on BinaryContent and DocumentUrl as well.

When we check isinstance(item, DocumentUrl) and then do downloaded_text = await download_item(item, data_format='text'), we can create a BinaryContent from the result of download_item, and then call the new inline_text_file method on it.
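
A minimal sketch of what that shared helper could look like, written against a stand-in class rather than the real pydantic_ai types. The name BinaryContentSketch, the is_text_like property, and the BEGIN/END marker format are assumptions for illustration. Only the YAML media-type check is taken from the diff shown earlier in this thread; the other media-type checks are guesses.

```python
from dataclasses import dataclass


def is_text_like_media_type(media_type: str) -> bool:
    """Return True for media types that can be inlined as plain text.

    Only the YAML check appears in the PR diff; the other checks are assumptions.
    """
    return (
        media_type.startswith('text/')
        or media_type == 'application/json'
        or media_type.endswith('+json')
        or media_type in ('application/x-yaml', 'application/yaml')
    )


@dataclass
class BinaryContentSketch:
    """Hypothetical stand-in for BinaryContent; not the real pydantic_ai class."""

    data: bytes
    media_type: str
    identifier: str

    @property
    def is_text_like(self) -> bool:
        return is_text_like_media_type(self.media_type)

    def inline_text_file(self) -> str:
        """Decode the payload and wrap it in markers so it reads as a file.

        The marker format below is an assumption chosen for illustration.
        """
        if not self.is_text_like:
            raise ValueError(f'{self.media_type!r} is not a text-like media type')
        text = self.data.decode('utf-8')
        return '\n'.join(
            [
                f'-----BEGIN FILE id="{self.identifier}" type="{self.media_type}"-----',
                text,
                f'-----END FILE id="{self.identifier}"-----',
            ]
        )


# Both the Google and OpenAI code paths could then make the same call
# on whatever was downloaded from the DocumentUrl.
downloaded = BinaryContentSketch(
    data=b'{"speaker": "A"}',
    media_type='application/json',
    identifier='transcript',
)
fenced = downloaded.inline_text_file()
```

With the fencing in one place, each model class would only decide whether to inline the file or send it as binary data, and would not need its own copy of the formatting logic.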

@Kamal-Moha
Contributor Author

@DouweM As suggested, I have now used _is_text_like_media_type and _inline_text_file_part in DocumentUrl and BinaryContent. Please take another look.

@DouweM
Collaborator

DouweM commented Oct 29, 2025

@Kamal-Moha Can you also please implement the other suggestions in https://github.com/pydantic/pydantic-ai/pull/3269/files#r2471009741, so we reduce duplication between the Google and OpenAI models?

Also don't forget to run make format to satisfy the linter!

@Kamal-Moha
Contributor Author

@DouweM I have already implemented the suggestion in https://github.com/pydantic/pydantic-ai/pull/3269/files#r2471009741, which is about having the method _is_text_like_media_type in BinaryContent and DocumentUrl. Please double-check. If that is not what you meant, please clarify what you're looking for.

I have also formatted the code to satisfy the linter.

@DouweM
Collaborator

DouweM commented Oct 30, 2025

@Kamal-Moha I suggested creating new methods on BinaryContent and DocumentUrl, so that we reduce the duplication between openai.py and google.py and have the "inline text fencing" logic in just one place. Can you please do that?

Also note that tests are failing.

@DouweM
Collaborator

DouweM commented Nov 3, 2025

@Kamal-Moha Please have a look at the failing CI jobs.
