Add JSON Schema Support for Structured AI Outputs

**What you'd like to see**

**Proposal:** Add JSON Schema support for structured AI outputs. This would allow agents to define output schemas that get validated at the API level for supported models (OpenAI, Grok), ensuring consistent and reliable structured responses from AI models.

Key components:

- New `schemas` section in YAML configuration files with native YAML syntax
- Schema references using `$ref` for composition and reuse
- Model-specific schema configuration in agent definitions
- Support for both inline schemas and external schema files
- Configuration validation that fails when schemas are specified for unsupported models

Example configuration:

```yaml
schemas:
  event:
    type: object
    required: [title]
    properties:
      title: 
        type: string
        description: "Event title"
  response:
    type: object
    required: [events]
    properties:
      events: 
        type: array
        items: 
          $ref: "#/schemas/event"

agents:
  root:
    model: openai
    description: "Event extractor"
    instruction: "Extract events from text"
    schema:
      output: event
```

**Why you'd like to see it**

Structured outputs would help me build more reliable AI agents that can consistently return data in expected formats. Currently, I have to implement post-processing validation and retry logic when AI responses don't match expected structures, which adds complexity and reduces reliability. This feature would eliminate the need for manual validation and retry mechanisms, making agent responses more predictable and reducing development overhead.

The feature would be particularly valuable for:

- Data extraction agents that need consistent JSON output
- API integration scenarios requiring specific response formats
- Multi-agent workflows where downstream agents depend on structured data from upstream agents

**Workarounds?**

Currently using manual JSON parsing and validation in post-processing steps, along with retry logic when responses don't match expected formats. This involves:

- Custom validation functions for each expected output structure
- Retry mechanisms with different prompts when validation fails
- Error handling for malformed JSON responses
- Manual schema enforcement in agent instructions

**Additional context**

- The approach is inspired by `jsonSchema` in [GitHub's gh-models project](https://github.com/github/gh-models/blob/f38cd97dc0417a116879b5a6dedec0c8753cad4b/examples/json_schema_prompt.yml#L4)
- OpenAI ([structured output](https://platform.openai.com/docs/guides/structured-outputs) using `response_format`)
- Grok [$structured outputs$](https://docs.x.ai/docs/guides/structured-outputs)

Implementation would use `github.com/xeipuuv/gojsonschema` for full JSON Schema Draft 7 support, with schema resolution and `$ref` handling for composition. The feature would be backwards compatible - existing agent configurations without schema definitions would continue to work unchanged.

Future considerations include extending support to input validation.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add JSON Schema Support for Structured AI Outputs #430

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Add JSON Schema Support for Structured AI Outputs #430

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions