Fix: ConversationMessageInput character limit too small for large responses #460

@manavgup

Problem

When using max_new_tokens=800, responses exceed the 10,000 character validation limit:

pydantic_core._pydantic_core.ValidationError: 1 validation error for ConversationMessageInput
content
  String should have at most 10000 characters
  File: rag_solution/schemas/conversation_schema.py:237
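The failure can be reproduced in isolation. This is a minimal sketch, assuming Pydantic is installed; `MessageSketch` and `fits` are hypothetical stand-ins for illustration, not the project's actual `ConversationMessageInput`.

```python
from pydantic import BaseModel, Field


class MessageSketch(BaseModel):
    # Same constraints as the field quoted in this issue.
    content: str = Field(..., min_length=1, max_length=10000, description="Message content")


def fits(text: str) -> bool:
    """Return True if the text passes validation, False otherwise."""
    try:
        MessageSketch(content=text)
        return True
    except ValueError:
        # pydantic.ValidationError is a subclass of ValueError.
        return False


print(fits("x" * 10000))  # exactly at the limit
print(fits("x" * 10001))  # one character over the limit
```

A response one character over the limit is rejected outright, so the message never reaches storage.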

Root Cause

  1. ConversationMessageInput.content has max_length=10000 (conversation_schema.py:237)
  2. This limit was set when max_new_tokens=100 (producing ~400-600 char responses)
  3. With max_new_tokens=800, responses can be 8x longer (~3,200-5,600 chars at 4-7 chars per token)
  4. Token-to-character ratio varies: 1 token ≈ 4-7 characters (depends on content)
  5. The observed response was 2197 tokens, or ~11,000-15,000 characters, which exceeds the 10,000 limit
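The token-to-character ratio in point 4 can be turned into a quick range estimate. This is a rough sketch, assuming the 4-7 chars per token range holds; `estimate_chars` and `may_exceed` are hypothetical helper names.

```python
def estimate_chars(tokens: int, low: int = 4, high: int = 7) -> tuple[int, int]:
    """Estimate the (min, max) character count for a given token count."""
    return tokens * low, tokens * high


def may_exceed(tokens: int, limit: int) -> bool:
    """True if the upper-bound estimate is larger than the character limit."""
    return estimate_chars(tokens)[1] > limit


print(estimate_chars(800))      # (3200, 5600)
print(may_exceed(2197, 10000))  # True
```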

Impact

  • Severity: HIGH - Blocks conversation storage for any response > 10k chars
  • Affects: Search results, chat responses, conversation history
  • User Experience: Users cannot see results when a response fails validation

Proposed Solution

Increase character limit to accommodate larger responses:

# Before
content: str = Field(..., min_length=1, max_length=10000, description="Message content")

# After
content: str = Field(..., min_length=1, max_length=50000, description="Message content")

Why 50,000 Characters?

  • max_new_tokens=800 → ~3,200-5,600 chars typically
  • Markdown formatting adds overhead (~20-30%)
  • Source citations add content
  • Safety margin for complex responses
  • 50,000 chars ≈ ~7,000-12,000 tokens (safe upper bound)
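The last bullet is the same ratio applied in reverse. This is a small sketch, assuming 4-7 chars per token; `chars_to_tokens` is a hypothetical helper name.

```python
def chars_to_tokens(chars: int, low: int = 4, high: int = 7) -> tuple[int, int]:
    """Estimate the (min, max) token count that fits in a character budget."""
    # Fewer tokens fit when each token is long, more when each is short.
    return chars // high, chars // low


print(chars_to_tokens(50000))  # (7142, 12500)
```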

Alternative Solution

Make limit configurable based on max_new_tokens:

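# Dynamic limit based on token settings
# Note: 800 * 7 = 5,600, which is below the current 10,000 limit, so a floor or a larger multiplier would likely be needed
max_chars = settings.max_new_tokens * 7  # Upper end of the 4-7 chars/token range
content: str = Field(..., min_length=1, max_length=max_chars, description="Message content")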
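One way the configurable variant might look is sketched below. It assumes Pydantic is installed and uses `pydantic.create_model` to build the model once the limit is known. The names `max_chars_for` and `build_message_model`, and the 10,000 floor, are assumptions added for illustration and are not part of the project.

```python
from pydantic import Field, create_model


def max_chars_for(max_new_tokens: int, chars_per_token: int = 7, floor: int = 10000) -> int:
    """Derive a character limit from the token setting, never below the floor."""
    # The floor is an assumption: it keeps the limit from dropping below the old value.
    return max(floor, max_new_tokens * chars_per_token)


def build_message_model(max_new_tokens: int):
    """Create a model whose content limit follows the token setting."""
    limit = max_chars_for(max_new_tokens)
    return create_model(
        "MessageSketch",
        content=(str, Field(..., min_length=1, max_length=limit, description="Message content")),
    )


print(max_chars_for(800))   # 10000 (the floor applies, since 800 * 7 = 5600)
print(max_chars_for(8000))  # 56000
```

Building the model through a factory means the limit is read when the factory is called, not when the module is first imported.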

Files Involved

  • backend/rag_solution/schemas/conversation_schema.py:237

Temporary Fix Applied

Increased limit to max_length=50000 in conversation_schema.py:237

    Labels

    bug: Something isn't working
