
_to_litellm_response_format produces invalid strict schema for OpenAI models with nested Pydantic models #4573

@karthikhic

Description

🔴 Required Information

Describe the Bug:

_to_litellm_response_format() generates JSON schemas that OpenAI's strict structured outputs API rejects when the output_schema Pydantic model has nested models, optional fields with defaults, or $ref references.

The function calls model_json_schema() and sends the result with "strict": True, but only patches additionalProperties: false at the root level. OpenAI strict mode requires three transformations that are not performed on the full schema tree:

  1. additionalProperties: false on all objects (including nested models in $defs)
  2. All properties listed in required (Pydantic omits fields with defaults like Optional[X] = None or default="end")
  3. $ref nodes must have no sibling keywords (Pydantic generates {"$ref": "...", "description": "..."})
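The mismatch can be inspected directly in what model_json_schema() emits for a small nested model. This is a sketch (model names are illustrative, and the exact layout of the reference node varies by Pydantic version):

```python
# Sketch: inspecting Pydantic v2 output to see the strict-mode violations.
from typing import Optional
from pydantic import BaseModel, Field

class Inner(BaseModel):
    value: str
    note: Optional[str] = Field(default=None, description="Optional note")

class Outer(BaseModel):
    inner: Inner = Field(description="Nested model")

schema = Outer.model_json_schema()
inner_def = schema["$defs"]["Inner"]

# 1. No additionalProperties: false on the nested object.
print("additionalProperties" in inner_def)   # False
# 2. The field with a default is missing from required.
print(inner_def["required"])                 # ['value']
# 3. The reference to Inner carries extra keywords (e.g. "description").
print(schema["properties"]["inner"])
```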

Steps to Reproduce:

  1. Install google-adk==1.25.0
  2. Create a Pydantic model with nested sub-models and optional fields
  3. Use it as output_schema on an LlmAgent with a LiteLLM OpenAI model
  4. Run the agent — OpenAI returns 400 Bad Request

from pydantic import BaseModel, Field
from typing import Optional
from google.adk.agents.llm_agent import LlmAgent
from google.adk.models.lite_llm import LiteLlm

class Inner(BaseModel):
    value: str = Field(description="A value")
    optional_field: Optional[str] = Field(default=None, description="Optional")

class Outer(BaseModel):
    inner: Inner = Field(description="Nested model")
    name: str

agent = LlmAgent(
    name="test",
    model=LiteLlm(model="openai/gpt-4.1"),
    instruction="Return structured output",
    output_schema=Outer,
)

Error sequence (each surfaces after fixing the previous):

BadRequestError: Invalid schema for response_format 'Outer':
In context=(), 'additionalProperties' is required to be supplied and to be false.

BadRequestError: Invalid schema for response_format 'Outer':
In context=(), 'required' is required to be supplied and to be an array
including every key in properties. Missing 'optional_field'.

BadRequestError: Invalid schema for response_format 'Outer':
context=('properties', 'inner'), $ref cannot have keywords {'description'}.

Expected Behavior:

_to_litellm_response_format() should produce a schema that satisfies all OpenAI strict-mode requirements, so that output_schema with nested Pydantic models works out of the box with OpenAI models via LiteLLM.

Observed Behavior:

OpenAI returns 400 Bad Request. The schema sent to OpenAI has "strict": True but is missing required transformations on nested objects. Only the root-level additionalProperties is patched (lines 1516-1523 of lite_llm.py).

Environment Details:

  • ADK Library Version: 1.25.0
  • Desktop OS: macOS (also reproduces on Linux in production ECS containers)
  • Python Version: 3.12.7

Model Information:

  • Are you using LiteLLM: Yes
  • Which model is being used: gpt-4.1 via LiteLLM with custom api_base

🟡 Optional Information

Regression:

Yes. This worked in ADK 1.17.0. In that version, _get_completion_inputs() passed the raw Pydantic class directly to LiteLLM:

# ADK 1.17.0 — worked
response_format = llm_request.config.response_schema  # Pydantic class passed through

LiteLLM received the Pydantic class and handled the full conversion internally (using OpenAI's to_strict_json_schema() under the hood). This correctly produced a valid strict schema.

The regression was introduced across two commits.

Logs:

litellm.BadRequestError: OpenAIException - Invalid schema for response_format
'Outer': In context=(), 'additionalProperties' is required to be supplied
and to be false.

Minimal Reproduction Code:

import asyncio
from pydantic import BaseModel, Field
from typing import Optional, List
from google.adk.agents.llm_agent import LlmAgent
from google.adk.models.lite_llm import LiteLlm
from google.adk.runners import InMemoryRunner
from google.genai import types

class InnerTargets(BaseModel):
    services: Optional[List[str]] = Field(default=None, description="Service names")

class InnerGoal(BaseModel):
    theory: str = Field(description="Hypothesis")
    targets: InnerTargets = Field(description="Investigation targets")

class Rationale(BaseModel):
    goals: List[str] = Field(description="Goals")
    reasoning: str = Field(description="Why this approach")

class OutputSchema(BaseModel):
    goal: InnerGoal = Field(description="The goal")
    pattern: str = Field(description="Generated pattern")
    rationale: Rationale = Field(description="Rationale")
    notes: Optional[str] = Field(default=None, description="Optional notes")

agent = LlmAgent(
    name="test_agent",
    model=LiteLlm(model="openai/gpt-4.1"),
    instruction="Generate a structured investigation plan.",
    output_schema=OutputSchema,
)

async def main():
    runner = InMemoryRunner(app_name="test", agent=agent)
    await runner.session_service.create_session(
        app_name="test", user_id="user1", session_id="s1"
    )
    user_msg = types.Content(role="user", parts=[types.Part(text="Plan an investigation")])
    async for event in runner.run_async(user_id="user1", session_id="s1", new_message=user_msg):
        print(event)

asyncio.run(main())

Suggested Fix:

The _to_litellm_response_format() function should recursively transform the entire schema tree for OpenAI strict mode. A minimal fix would replace the current root-only patch:

# Current (incomplete — only patches root):
if schema_dict.get("type") == "object" and "additionalProperties" not in schema_dict:
    schema_dict["additionalProperties"] = False

With a recursive transformation:

def _enforce_strict_openai_schema(schema: dict) -> None:
    """Recursively make a schema compatible with OpenAI strict structured outputs."""
    if not isinstance(schema, dict):
        return
    # $ref nodes must stand alone: strip sibling keywords like "description".
    if "$ref" in schema:
        for key in list(schema.keys()):
            if key != "$ref":
                del schema[key]
        return
    # Every object must forbid extra keys and list every property as required.
    if schema.get("type") == "object" and "properties" in schema:
        schema["additionalProperties"] = False
        schema["required"] = sorted(schema["properties"].keys())
    for defn in schema.get("$defs", {}).values():
        _enforce_strict_openai_schema(defn)
    for prop in schema.get("properties", {}).values():
        _enforce_strict_openai_schema(prop)
    for key in ("anyOf", "oneOf", "allOf"):
        for item in schema.get(key, []):
            _enforce_strict_openai_schema(item)
    if "items" in schema:
        _enforce_strict_openai_schema(schema["items"])
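Applied to a schema shaped like Pydantic's output for the Outer model in the reproduction above (hand-written here so the example is self-contained), the transformation produces a strict-compatible result:

```python
# Sketch: exercising the recursive fix on a hand-written schema dict that
# mimics Pydantic's model_json_schema() output for a nested model.

def _enforce_strict_openai_schema(schema: dict) -> None:
    """Recursively make a schema compatible with OpenAI strict structured outputs."""
    if not isinstance(schema, dict):
        return
    if "$ref" in schema:
        for key in list(schema.keys()):
            if key != "$ref":
                del schema[key]
        return
    if schema.get("type") == "object" and "properties" in schema:
        schema["additionalProperties"] = False
        schema["required"] = sorted(schema["properties"].keys())
    for defn in schema.get("$defs", {}).values():
        _enforce_strict_openai_schema(defn)
    for prop in schema.get("properties", {}).values():
        _enforce_strict_openai_schema(prop)
    for key in ("anyOf", "oneOf", "allOf"):
        for item in schema.get(key, []):
            _enforce_strict_openai_schema(item)
    if "items" in schema:
        _enforce_strict_openai_schema(schema["items"])

schema = {
    "type": "object",
    "properties": {
        "inner": {"$ref": "#/$defs/Inner", "description": "Nested model"},
        "name": {"type": "string"},
    },
    "required": ["inner", "name"],
    "$defs": {
        "Inner": {
            "type": "object",
            "properties": {
                "value": {"type": "string"},
                "optional_field": {
                    "anyOf": [{"type": "string"}, {"type": "null"}],
                    "default": None,
                },
            },
            "required": ["value"],  # Pydantic omits the field with a default
        }
    },
}

_enforce_strict_openai_schema(schema)

print(schema["properties"]["inner"])         # {'$ref': '#/$defs/Inner'}
print(schema["$defs"]["Inner"]["required"])  # ['optional_field', 'value']
```

All three error conditions from the sequence above are resolved: every object gets additionalProperties: false, every property appears in required, and the $ref node loses its sibling description.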

Alternatively, for OpenAI models, the function could delegate to OpenAI's own openai.lib._pydantic.to_strict_json_schema() when the input is a Pydantic class, which handles all of these cases.

How often has this issue occurred?:

  • Always (100%) — any output_schema with nested Pydantic models fails with OpenAI models via LiteLLM.

Additional Context:

  • The issue only affects OpenAI-compatible models (the Gemini code path uses "type": "json_object" without strict and is unaffected)
  • Simple flat models (no nested BaseModel, no optional fields with defaults) may work because the root-level patch is sufficient
  • Our workaround is overriding model_json_schema() on a custom base class to post-process the schema before the ADK sees it
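The workaround mentioned in the last bullet can be sketched roughly as follows (StrictBase and the helper are illustrative names, not part of the ADK; the transformation mirrors the suggested fix above):

```python
# Sketch of the workaround: a base class whose model_json_schema() output is
# post-processed before the ADK ever serializes it. Names are illustrative.
from pydantic import BaseModel

def _enforce_strict_openai_schema(schema: dict) -> None:
    """Same recursive strict-mode transformation as in the suggested fix."""
    if not isinstance(schema, dict):
        return
    if "$ref" in schema:
        for key in list(schema.keys()):
            if key != "$ref":
                del schema[key]
        return
    if schema.get("type") == "object" and "properties" in schema:
        schema["additionalProperties"] = False
        schema["required"] = sorted(schema["properties"].keys())
    for sub in (*schema.get("$defs", {}).values(),
                *schema.get("properties", {}).values()):
        _enforce_strict_openai_schema(sub)
    for key in ("anyOf", "oneOf", "allOf"):
        for item in schema.get(key, []):
            _enforce_strict_openai_schema(item)
    if "items" in schema:
        _enforce_strict_openai_schema(schema["items"])

class StrictBase(BaseModel):
    """Base class whose JSON schema is patched for OpenAI strict mode."""

    @classmethod
    def model_json_schema(cls, *args, **kwargs) -> dict:
        schema = super().model_json_schema(*args, **kwargs)
        _enforce_strict_openai_schema(schema)
        return schema

class Demo(StrictBase):
    name: str

schema = Demo.model_json_schema()
print(schema["additionalProperties"])  # False
```

Agent output schemas then inherit from StrictBase instead of BaseModel, so the ADK's unmodified code path receives an already-strict schema.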

Metadata

Labels: models [Component] Issues related to model support