A novel approach to improving constraint satisfaction in language model outputs by externalizing generation state into a visual margin that vision-language models can perceive during generation.
The Visual Margin System transforms the problem of tracking discrete units (lines, paragraphs, word counts) from an architectural limitation of transformers into a perception task. By rendering structural metadata as visual margins and feeding them to the model's vision layer during generation, we enable language models to maintain accurate awareness of their generation state and better satisfy structural constraints.
- Real-time State Tracking: Maintains comprehensive metadata during generation (lines, paragraphs, words, characters)
- Visual Margin Rendering: Converts state into clear visual representations with color-coded constraint status
- Natural Language Constraint Parsing: Extracts structural requirements from prompts automatically
- Multimodal Integration: Seamlessly combines text and visual state for vision-language models
- High Accuracy: Achieves >90% constraint satisfaction on structural requirements
The system consists of four main components:
- GenerationStateTracker: Tracks structural metadata in real-time
- MarginRenderer: Converts state to visual representations
- ConstraintParser: Extracts constraints from natural language
- VisualMarginGenerator: Orchestrates the generation loop with margin awareness
- Python 3.8 or higher
- Anthropic API key with access to vision-capable models
# Clone the repository
git clone https://github.com/yourusername/marginium.git
cd marginium
# Install dependencies
pip install -r requirements.txt
# Set your API key
export ANTHROPIC_API_KEY='your-api-key-here'pillow- Image renderinganthropic- API client for Claude modelspytest- Testing frameworkpython-dotenv- Environment variable management
from visual_margin_system import VisualMarginGenerator
# Initialize generator
generator = VisualMarginGenerator()
# Generate with automatic constraint parsing
result = generator.generate_with_margin(
prompt="Give me 4 lines of creative text about the ocean"
)
print(result)# Specify constraints manually
constraints = {
'lines': {'target': 3, 'tolerance': 0},
'words': {'max': 100}
}
result = generator.generate_with_margin(
prompt="Write a haiku about technology",
constraints=constraints,
parse_constraints=False
)import asyncio
async def generate():
result = await generator.generate_with_margin_async(
prompt="Write 3 paragraphs about AI",
chunk_size=30
)
return result
result = asyncio.run(generate())"4 lines of text"→ Exactly 4 lines"at least 3 lines"→ Minimum 3 lines"under 10 lines"→ Maximum 10 lines
"3 paragraphs"→ Exactly 3 paragraphs"two paragraphs"→ Word-form numbers supported
"100 words"→ Exactly 100 words (with tolerance)"between 50-75 words"→ Range constraint"under 100 words"→ Maximum constraint"at least 50 words"→ Minimum constraint"about 100 words"→ Approximate (with 10% tolerance)
"between 500-1000 characters"→ Range constraint"under 500 characters"→ Maximum constraint
"each line is under 10 words"→ Words per line"each paragraph under 50 words"→ Words per paragraph
The visual_margin_system/examples/ directory contains several demos:
python visual_margin_system/examples/demo_simple_constraints.pyDemonstrates:
- Exact line count generation
- Paragraph count constraints
- Word count ranges
- Manual constraint specification
python visual_margin_system/examples/demo_visualization.pyShows margin rendering without API calls:
- Empty state rendering
- Partial progress visualization
- Constraint satisfaction states
- Light/dark themes
python visual_margin_system/examples/demo_interactive.pyInteractive CLI tool for testing:
- Automatic constraint parsing
- Manual constraint specification
- Parser testing
- State inspection
- Margin visualization export
Run the comprehensive test suite:
# Run all tests
pytest visual_margin_system/tests/
# Run specific test file
pytest visual_margin_system/tests/test_generation_state_tracker.py
# Run with coverage
pytest --cov=visual_margin_system visual_margin_system/tests/Test coverage includes:
- State tracker accuracy
- Constraint parsing
- Margin rendering
- Multi-constraint scenarios
- Initialize: Parse constraints from prompt or accept manual specification
- Render Margin: Create visual representation of current state
- Generate Chunk: Call vision-language model with text context + margin image
- Update State: Process generated tokens, update counters
- Check Constraints: Evaluate if all constraints are satisfied
- Repeat: Continue until constraints met or max tokens reached
The margin displays:
- Constraint Checklist: Active constraints with visual status indicators
- ✓ Green: Satisfied
- ○ Yellow: Close to target (within 10%)
- ✗ Red: Not satisfied
- Current Metrics: Real-time counts (lines, paragraphs, words, chars)
- Progress Indicators: Progress toward range-based constraints
- Completion Status: Overall generation state
The GenerationStateTracker maintains:
- Token-level metrics (total tokens, characters, words)
- Structural boundaries (lines, paragraphs)
- Constraint satisfaction status
- Completion flags
Updates occur after each generated chunk, ensuring the margin always reflects current state.
- Constraint Satisfaction: >90% accuracy on simple constraints (lines, paragraphs)
- Range Constraints: 85-90% accuracy on word/character ranges
- Latency Overhead: ~30% vs baseline (due to chunked generation and rendering)
- Chunk Size: Default 50 tokens balances update frequency and API efficiency
from visual_margin_system import MarginRenderer
renderer = MarginRenderer(
width=300, # Margin width in pixels
height=600, # Margin height in pixels
theme='dark' # 'dark' or 'light'
)generator = VisualMarginGenerator(
api_key='your-key',
model='claude-sonnet-4-20250514', # Or other vision-capable model
renderer=renderer,
tracker=tracker,
parser=parser
)- Chunked Generation: Requires multiple API calls, increasing latency
- Vision Dependency: Requires vision-capable models
- Constraint Complexity: Best for structural constraints; semantic constraints not yet supported
- Model Training: Models not specifically trained to attend to margins
- Very short constraints (<5 words) may be challenging due to minimum chunk sizes
- Extremely long constraints (>10 simultaneous) may clutter margin
- Per-line/per-paragraph constraints checked only on completed units
- Streaming support for real-time margin updates
- Sentence-level constraints
- Nested constraints (sections with paragraphs)
- Custom constraint types
- Native margin integration in model architecture
- Self-correcting generation (backtracking on violations)
- Learned visual encodings optimized for model perception
- Fine-tuning on margin-aware generation
- Architecture Guide - Technical architecture details
- API Reference - Complete API documentation
- Constraint Syntax - Supported constraint patterns
- Original Specification - Full system specification
Contributions welcome! Areas of interest:
- New constraint types
- Improved constraint parsing
- Performance optimizations
- Visual margin design improvements
- Test coverage expansion
MIT License - see LICENSE file for details
If you use this system in research, please cite:
@software{visual_margin_system,
title={Visual Margin System for LLM Generation State Tracking},
author={Your Name},
year={2025},
url={https://github.com/yourusername/marginium}
}- Issues: https://github.com/yourusername/marginium/issues
- Discussions: https://github.com/yourusername/marginium/discussions
Note: This is a research prototype demonstrating the visual margin concept. Production use may require optimization and additional error handling.