GoogleModel generates empty text parts that Vertex AI rejects with 400 error #2032

@Infinnerty

Description

When using GoogleModel with Vertex AI, requests fail with a 400 error because pydantic_ai generates empty text parts {'text': ''} that Google's API rejects.

Error Message

400 Bad Request. {
  "error": {
    "code": 400,
    "message": "Unable to submit request because it must have a text parameter. Add a text parameter and try again. Learn more: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/gemini",
    "status": "INVALID_ARGUMENT"
  }
}

Root Cause

In pydantic_ai/models/google.py, the _map_messages() method at lines 362-364 adds empty text parts as a defensive measure:

# Google GenAI requires at least one part in the message.
if not message_parts:
    message_parts = [{'text': ''}]

However, the Vertex AI API validates the text parameter and rejects empty strings, so this fallback part turns an otherwise valid request into a 400 INVALID_ARGUMENT response.

Reproduction

This occurs when:

  • Messages have no content after processing
  • Message parts arrays become empty during conversion
  • Certain edge cases in message handling
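
To check whether a given payload hits this case before it is sent, a small pre-flight scan over the mapped contents can help. This is a minimal sketch, assuming the contents are plain dicts shaped like {'role': ..., 'parts': [...]} as in the snippet above; find_empty_text_parts is a hypothetical helper and not part of pydantic_ai.

```python
from typing import Any


def find_empty_text_parts(contents: list[dict[str, Any]]) -> list[tuple[int, int]]:
    """Return (content_index, part_index) for every part that is exactly {'text': ''}."""
    hits: list[tuple[int, int]] = []
    for i, content in enumerate(contents):
        # Treat a missing or None 'parts' value as an empty list.
        for j, part in enumerate(content.get('parts') or []):
            if isinstance(part, dict) and part == {'text': ''}:
                hits.append((i, j))
    return hits


contents = [
    {'role': 'user', 'parts': [{'text': 'hello'}]},
    {'role': 'model', 'parts': [{'text': ''}]},  # the fallback part described above
]
print(find_empty_text_parts(contents))  # → [(1, 0)]
```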

Impact

  • Affects any GoogleModel user on Vertex AI whose request contains a message that maps to no parts
  • Causes complete request failures with 400 errors
  • Blocks streaming and non-streaming requests

Suggested Fix

Replace the empty string with a minimal non-empty string:

# Instead of:
if not message_parts:
    message_parts = [{'text': ''}]

# Use:
if not message_parts:
    message_parts = [{'text': ' '}]  # Single space

Or add provider-specific handling for Google's stricter validation requirements.
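
The provider-specific option could look roughly like the sketch below. It assumes that only Vertex AI rejects the empty string and that the caller already knows whether it is talking to Vertex AI; fallback_parts, ensure_parts and the vertexai flag are hypothetical names for illustration, not existing pydantic_ai APIs.

```python
def fallback_parts(vertexai: bool) -> list[dict[str, str]]:
    """Placeholder parts for a message that ended up with no parts.

    Assumption: Vertex AI needs non-empty text, while the existing
    empty-string fallback is kept for other backends.
    """
    return [{'text': ' '}] if vertexai else [{'text': ''}]


def ensure_parts(message_parts: list[dict], *, vertexai: bool) -> list[dict]:
    """Return message_parts unchanged, or a backend-appropriate placeholder if empty."""
    return message_parts or fallback_parts(vertexai)


print(ensure_parts([], vertexai=True))
print(ensure_parts([{'text': 'hi'}], vertexai=True))
```

Keeping the choice in one helper means the mapping code does not need to know why the placeholder differs between backends.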

Temporary Workaround

We've implemented a subclass that filters empty text parts:

from pydantic_ai.models.google import GoogleModel


class FixedGoogleModel(GoogleModel):
    """GoogleModel that never sends an empty text part to Vertex AI."""

    async def _map_messages(self, messages):
        system_instruction, contents = await super()._map_messages(messages)

        # Drop parts that are exactly {"text": ""}; leave every other part untouched.
        cleaned_contents = []
        for content_item in contents:
            if "parts" not in content_item:
                cleaned_contents.append(content_item)
                continue

            original_parts = content_item.get("parts") or []
            cleaned_parts = [
                part for part in original_parts
                if not (isinstance(part, dict) and part.get("text") == "" and len(part) == 1)
            ]

            # A message still needs at least one part, so fall back to a single space.
            if not cleaned_parts:
                cleaned_parts = [{"text": " "}]

            content_item["parts"] = cleaned_parts
            cleaned_contents.append(content_item)

        # Likewise, never send an empty contents list.
        if not cleaned_contents:
            cleaned_contents = [{"role": "user", "parts": [{"text": " "}]}]

        return system_instruction, cleaned_contents

This workaround successfully resolves the 400 errors while maintaining API compatibility.
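
The filtering step can also be exercised offline, without credentials, by pulling it into a pure function. The following is a sketch under the same assumption that contents are plain dicts; clean_contents and PLACEHOLDER are hypothetical names, and unlike the subclass above this version returns new dicts instead of mutating its input.

```python
from typing import Any

PLACEHOLDER = {'text': ' '}  # single space, as in the suggested fix


def clean_contents(contents: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Remove parts equal to {'text': ''} and guarantee at least one part per message."""
    cleaned: list[dict[str, Any]] = []
    for item in contents:
        if 'parts' not in item:
            cleaned.append(item)
            continue
        parts = [p for p in (item['parts'] or []) if p != {'text': ''}]
        # Copy the item so the caller's data is left unchanged.
        cleaned.append({**item, 'parts': parts or [dict(PLACEHOLDER)]})
    # An empty contents list is replaced by a single placeholder user message.
    return cleaned or [{'role': 'user', 'parts': [dict(PLACEHOLDER)]}]


sample = [
    {'role': 'user', 'parts': [{'text': ''}, {'text': 'hi'}]},
    {'role': 'model', 'parts': [{'text': ''}]},
]
print(clean_contents(sample))
```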

Example Code

import asyncio
from pydantic_ai import Agent
from pydantic_ai.models.google import GoogleModel
from pydantic_ai.providers.google import GoogleProvider

async def reproduce_bug():
    # Set up GoogleModel with Vertex AI
    provider = GoogleProvider(
        vertexai=True,
        location="us-central1"  # or your preferred location
    )
    model = GoogleModel("gemini-2.5-pro", provider=provider)
    
    # Create an agent
    agent = Agent(model=model)
    
    # This scenario can trigger empty message parts
    # Case 1: Empty string message
    try:
        result = await agent.run("")
        print("Empty string succeeded:", result)
    except Exception as e:
        print("Empty string failed:", str(e))
    
    # Case 2: Message that becomes empty after processing
    try:
        result = await agent.run("   ")  # Only whitespace
        print("Whitespace succeeded:", result)
    except Exception as e:
        print("Whitespace failed:", str(e))

# Run the reproduction
if __name__ == "__main__":
    asyncio.run(reproduce_bug())

Python, Pydantic AI & LLM client version

Pydantic AI version: v0.3.1
Python version: 3.12
Provider: Google Vertex AI
Model: gemini-2.5-pro (via GoogleModel/Vertex)
