Implement adaptive chunked compaction #746

@bug-ops

Description

Parent: #740 (P1)

Problem

A single summarization request over a large context may exceed the model's token limit or produce a lower-quality summary.

Solution

  • Split compaction input into token-budgeted chunks (max ~4K tokens each)
  • Summarize chunks in parallel via tokio::join!
  • Merge partial summaries with a final consolidation prompt
  • Add 1.2x safety margin to token budget calculations
  • Detect oversized single messages (>50% of chunk budget) and handle separately
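The chunking step above can be sketched as follows. This is a minimal sketch, not the actual `agent/context.rs` code: the constants, the `estimate_tokens` heuristic (~4 chars per token), and the function names are all hypothetical; a real implementation would use the model's tokenizer.

```rust
/// Hypothetical constants mirroring the issue's parameters.
const CHUNK_BUDGET: usize = 4_000; // max ~4K tokens per chunk
const SAFETY_MARGIN: f64 = 1.2;    // 1.2x margin on token estimates
const OVERSIZED_RATIO: f64 = 0.5;  // single message > 50% of chunk budget

/// Crude token estimate (~4 chars per token) with the safety margin
/// baked in; a real tokenizer would replace this.
fn estimate_tokens(text: &str) -> usize {
    (text.len() as f64 / 4.0 * SAFETY_MARGIN).ceil() as usize
}

/// Split messages into budget-sized chunks. An oversized message is
/// emitted as its own single-element chunk so it can be handled
/// separately, as the issue requires.
fn split_into_chunks(messages: &[String]) -> Vec<Vec<String>> {
    let oversized_limit = (CHUNK_BUDGET as f64 * OVERSIZED_RATIO) as usize;
    let mut chunks: Vec<Vec<String>> = Vec::new();
    let mut current: Vec<String> = Vec::new();
    let mut current_tokens = 0;

    for msg in messages {
        let tokens = estimate_tokens(msg);
        if tokens > oversized_limit {
            // Oversized message: flush the open chunk, emit it alone.
            if !current.is_empty() {
                chunks.push(std::mem::take(&mut current));
                current_tokens = 0;
            }
            chunks.push(vec![msg.clone()]);
        } else if current_tokens + tokens > CHUNK_BUDGET {
            // Budget exceeded: close the chunk, start a new one.
            chunks.push(std::mem::take(&mut current));
            current = vec![msg.clone()];
            current_tokens = tokens;
        } else {
            current.push(msg.clone());
            current_tokens += tokens;
        }
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Folding the safety margin into the estimator keeps the budget comparison simple; the trade-off is slightly smaller effective chunks rather than occasional overflows.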

Affected crates

  • zeph-core (compaction logic in agent/context.rs)

Acceptance criteria

  • Compaction works correctly with >32K token contexts
  • Parallel chunk summarization
  • Final merge produces coherent summary
  • Graceful handling of oversized messages
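The summarize-in-parallel-then-merge flow the criteria describe can be sketched with plain threads standing in for the `tokio::join!` concurrency (the async plumbing is incidental to the shape of the flow). `summarize` is a hypothetical stand-in for the model call, and the prompt strings are placeholders:

```rust
use std::thread;

// Hypothetical stand-in for the per-chunk LLM summarization call.
fn summarize(chunk: &[String]) -> String {
    format!("summary({} msgs)", chunk.len())
}

/// Summarize each chunk concurrently, then merge the partial
/// summaries with a final consolidation step.
fn compact(chunks: Vec<Vec<String>>) -> String {
    // One worker per chunk; in the real code this would be
    // concurrent async futures joined together, not OS threads.
    let handles: Vec<_> = chunks
        .into_iter()
        .map(|c| thread::spawn(move || summarize(&c)))
        .collect();
    // Joining in spawn order keeps the merged output deterministic.
    let partials: Vec<String> =
        handles.into_iter().map(|h| h.join().unwrap()).collect();
    // Final consolidation prompt over the partial summaries.
    format!("consolidate: {}", partials.join(" | "))
}
```

Collecting partials in chunk order matters for the "coherent summary" criterion: the consolidation prompt sees the conversation's original sequence regardless of which chunk finished first.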
