# Mode-Based Routing Usage Guide
AI Runner now features intelligent mode-based routing that automatically directs your queries to specialized agents based on intent. This results in:
- ✅ Faster responses (4-6 focused tools vs 37 global tools)
- ✅ Better accuracy (specialized agents with domain context)
- ✅ Clearer reasoning (focused tool sets reduce confusion)
- ✅ Trajectory tracking (monitor agent decision paths)
## Quick Start

```python
from airunner.components.llm.managers.workflow_manager import WorkflowManager

# Enable mode-based routing
manager = WorkflowManager(
    use_mode_routing=True,  # Enable intelligent routing
)
workflow = manager.build_workflow()

result = workflow.invoke({
    "messages": [{"role": "user", "content": "Write a short story about robots"}]
})
# Automatically routed to Author Mode → uses writing tools
```

To skip intent classification entirely, force a mode:

```python
# Force code mode (skip intent classification)
manager = WorkflowManager(
    use_mode_routing=True,
    mode_override="code",  # Always use code mode
)
```

## Author Mode

When to Use: Creative writing, editing, style improvement
Example Queries:
- "Write a short story about space exploration"
- "Improve this essay's readability"
- "Check this article for grammar errors"
- "Find better words for 'said' in my dialogue"
Available Tools (4):
- `improve_writing()` - Style and clarity suggestions
- `check_grammar()` - Grammar and spelling checks
- `find_synonyms()` - Thesaurus lookup
- `analyze_writing_style()` - Tone and readability analysis
Example:

```python
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "Write a poem about autumn"
    }]
})
# Mode: author
# Tools used: analyze_writing_style, improve_writing
# Output: Creative poem with refined language
```

## Code Mode

When to Use: Programming, debugging, code review
Example Queries:
- "Debug this Python function"
- "Format this code with Black"
- "What's the complexity of this algorithm?"
- "Create a Python file for a web scraper"
Available Tools (6):
- `execute_python()` - Safe code execution
- `format_code()` - Code formatting (Black)
- `lint_code()` - Code linting (Pylint)
- `create_code_file()` - Create code files
- `read_code_file()` - Read code files
- `analyze_code_complexity()` - Complexity metrics
Example:

```python
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "Debug this code:\ndef sum(a, b):\n    return a + c"
    }]
})
# Mode: code
# Tools used: lint_code, execute_python
# Output: Identifies undefined variable 'c', suggests fix
```

## Research Mode

When to Use: Information gathering, source synthesis, citations
Example Queries:
- "Research the impacts of climate change"
- "Compare sources on AI ethics"
- "Cite these sources in APA format"
- "Extract key points from these articles"
Available Tools (5):
- `synthesize_sources()` - Combine multiple sources
- `cite_sources()` - Format citations (APA/MLA/Chicago)
- `organize_research()` - Structure findings by theme
- `extract_key_points()` - Extract main ideas
- `compare_sources()` - Compare perspectives
Example:

```python
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "Research quantum computing applications and cite sources"
    }]
})
# Mode: research
# Tools used: synthesize_sources, cite_sources
# Output: Structured research summary with citations
```

## QA Mode

When to Use: Fact-based questions, answer verification
Example Queries:
- "What causes rainbows?"
- "Who invented the telephone?"
- "When did World War II end?"
- "Is this answer accurate: [claim]?"
Available Tools (6):
- `verify_answer()` - Fact-check against sources
- `score_answer_confidence()` - Confidence scoring
- `extract_answer_from_context()` - Reading comprehension
- `generate_clarifying_questions()` - Ask for more context
- `rank_answer_candidates()` - Rank possible answers
- `identify_answer_type()` - Question classification
Example:

```python
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "What causes the northern lights?"
    }]
})
# Mode: qa
# Tools used: identify_answer_type, verify_answer, score_answer_confidence
# Output: Verified factual answer with confidence score
```

## General Mode

When to Use: Ambiguous queries, multi-mode requests, fallback
Activated When:
- Intent confidence < 0.6 (configurable threshold)
- Query spans multiple modes ("Write code and explain the algorithm")
- User explicitly requests general mode
- Intent classification unclear
Available Tools: All cross-mode tools (SYSTEM, IMAGE, MATH, CONVERSATION, etc.)
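The fallback behavior can be illustrated with a toy classifier. The keyword scoring below is purely illustrative (AI Runner's real classifier in `mode_router` is model-based), but it shows the shape of the rule: pick the best-scoring mode, and fall back to `general` whenever confidence is below the threshold.

```python
# Toy intent classifier: score a query against per-mode keyword lists and
# fall back to "general" when the best score is below the threshold.
MODE_KEYWORDS = {
    "author": ["write", "story", "poem", "essay", "grammar"],
    "code": ["debug", "python", "code", "function", "lint", "format"],
    "research": ["research", "sources", "cite", "compare"],
    "qa": ["what", "who", "when", "why"],
}

def classify_intent(query, threshold=0.6):
    words = query.lower().split()
    if not words:
        return "general", 0.0
    # Confidence = fraction of query words matching the best mode's keywords.
    scores = {
        mode: sum(w.strip("?.,!") in kws for w in words) / len(words)
        for mode, kws in MODE_KEYWORDS.items()
    }
    best_mode = max(scores, key=scores.get)
    confidence = scores[best_mode]
    if confidence < threshold:
        return "general", confidence  # low confidence -> general fallback
    return best_mode, confidence
```

With this sketch, "Debug this Python code" scores strongly for code mode, while "Help me with my project" matches nothing and falls through to general.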
Example:

```python
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "Help me with my project"  # Ambiguous
    }]
})
# Mode: general (confidence too low)
# Tools: All available
# Output: Asks clarifying questions
```

## Trajectory Tracking

```python
from airunner.components.eval.utils.tracking import track_trajectory

result, trajectory = track_trajectory(
    workflow,
    {"messages": [{"role": "user", "content": "Debug this code"}]},
)

print(f"Classified Mode: {trajectory['metadata']['mode']}")
# Output: code
print(f"Confidence: {trajectory['metadata']['confidence']}")
# Output: 0.92
print(f"Reasoning: {trajectory['metadata']['reasoning']}")
# Output: "User explicitly requested debugging, a code-related task"
```

### Tuning the Confidence Threshold

Lower threshold = more aggressive routing (might misclassify). Higher threshold = more conservative (uses general mode more often).
```python
from airunner.components.llm.managers.mode_router import classify_intent

# Default: 0.6
intent = classify_intent(query, model, threshold=0.5)  # More aggressive
intent = classify_intent(query, model, threshold=0.8)  # More conservative
```

### Inspecting the Tool Trajectory

```python
result, trajectory = track_trajectory(workflow, input_data)

print(f"Tools Called: {trajectory['tools']}")
# Output: ['lint_code', 'execute_python']
print(f"Node Path: {trajectory['nodes']}")
# Output: ['classify_intent', 'code_node', 'analyze_code', 'call_model', 'lint_code']
```

## Mode Persistence

Mode persists across turns if the context is unchanged:
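A minimal sketch of how such a persistence rule might look (hypothetical logic, not AI Runner's actual implementation): a clear new intent wins, a weak signal keeps the prior mode, and with no history the query falls back to general.

```python
# Toy persistence rule: keep the previous mode when the new turn's intent
# confidence is weak, otherwise switch to the newly classified mode.
def route_turn(new_mode, confidence, prev_mode=None, threshold=0.6):
    if confidence >= threshold:
        return new_mode   # clear new intent wins (explicit mode switch)
    if prev_mode is not None:
        return prev_mode  # weak signal: continue in the prior mode
    return "general"      # no history and no clear intent: fall back

# Turn 1: clear author intent
mode1 = route_turn("author", 0.9, prev_mode=None)
# Turn 2: "Make it more dramatic" classifies weakly, so author mode persists
mode2 = route_turn("author", 0.3, prev_mode=mode1)
# Turn 3: explicit switch to code
mode3 = route_turn("code", 0.92, prev_mode=mode2)
```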
```python
# Turn 1
result1 = workflow.invoke({
    "messages": [{"role": "user", "content": "Write a story about cats"}]
})
# Mode: author

# Turn 2 (continues in author mode)
result2 = workflow.invoke({
    "messages": [
        {"role": "user", "content": "Write a story about cats"},
        {"role": "assistant", "content": result1["messages"][-1]},
        {"role": "user", "content": "Make it more dramatic"}
    ]
})
# Mode: author (context indicates continuation)
```

### Switching Modes Mid-Conversation

```python
# Turn 1: Author mode
result1 = workflow.invoke({
    "messages": [{"role": "user", "content": "Write a poem"}]
})

# Turn 2: Explicitly switch to code mode
result2 = workflow.invoke({
    "messages": [
        *result1["messages"],
        {"role": "user", "content": "Now debug this Python code: ..."}
    ]
})
# Mode: code (detected mode switch)
```

## Configuration

```python
manager = WorkflowManager(
    use_mode_routing=True,    # Enable/disable routing
    mode_override="code",     # Force specific mode (optional)
    chat_model=custom_model,  # Custom LLM model
)
```

```python
# In GUI settings
settings = {
    "llm": {
        "use_mode_routing": True,
        "mode_confidence_threshold": 0.6,
        "default_mode": "general",
    }
}
```

## Troubleshooting

### Wrong Mode Classification

Problem: Query classified to the incorrect mode
Solutions:

- Use `mode_override` to force the correct mode:

  ```python
  manager = WorkflowManager(use_mode_routing=True, mode_override="code")
  ```

- Lower the confidence threshold for more aggressive routing:

  ```python
  intent = classify_intent(query, model, threshold=0.5)
  ```

- Rephrase the query with mode keywords:

  ❌ "Fix this" → Ambiguous
  ✅ "Debug this Python code" → Clear code intent
### Too Many General-Mode Fallbacks

Problem: Most queries routed to the general fallback

Causes:
- Threshold too high (default 0.6)
- Ambiguous phrasing
- Multi-mode requests

Solutions:

- Lower the threshold:

  ```python
  intent = classify_intent(query, model, threshold=0.5)
  ```

- Add mode-specific keywords:

  ❌ "Help with my work" → Ambiguous
  ✅ "Help me write this essay" → Author mode

- Use a mode override if you know the mode:

  ```python
  manager = WorkflowManager(use_mode_routing=True, mode_override="research")
  ```
### Missing Tool in a Mode

Problem: Agent can't find an expected tool

Check:

- The tool is registered with the correct category:

  ```python
  @tool(category=ToolCategory.CODE)  # Correct category
  ```

- The tool is imported in `tools/__init__.py`:

  ```python
  from .code_tools import execute_python
  ```

- Debug the available tools:

  ```python
  from airunner.components.llm.core.tool_registry import ToolRegistry

  registry = ToolRegistry()
  code_tools = registry.get_tools_by_category(ToolCategory.CODE)
  print([t.name for t in code_tools])
  ```
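For intuition, the category lookup behaves roughly like this self-contained sketch (a toy stand-in, not the real `ToolRegistry` from `airunner`): tools are registered with a category, and mode setup filters on it.

```python
from enum import Enum, auto

class ToolCategory(Enum):
    CODE = auto()
    AUTHOR = auto()

class ToolRegistry:
    """Minimal sketch: tools registered with a category, filtered on lookup."""
    def __init__(self):
        self._tools = []  # list of (name, category) pairs

    def register(self, name, category):
        self._tools.append((name, category))

    def get_tools_by_category(self, category):
        # Return only the tool names registered under the given category.
        return [name for name, cat in self._tools if cat == category]

registry = ToolRegistry()
registry.register("execute_python", ToolCategory.CODE)
registry.register("lint_code", ToolCategory.CODE)
registry.register("improve_writing", ToolCategory.AUTHOR)
code_tools = registry.get_tools_by_category(ToolCategory.CODE)
```

A tool registered under the wrong category (or never imported, so its registration never runs) simply won't appear in the mode's filtered list, which is exactly the failure mode described above.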
### Routing Latency

Problem: Mode routing adds latency

Optimizations:

- Use a faster model for intent classification:

  ```python
  from langchain_openai import ChatOpenAI

  fast_model = ChatOpenAI(model="gpt-3.5-turbo")  # For classification
  powerful_model = ChatOpenAI(model="gpt-4")      # For tasks
  ```

- Skip classification with `mode_override` (if the mode is known):

  ```python
  manager = WorkflowManager(use_mode_routing=True, mode_override="code")
  # Skips the intent classification call
  ```

- Cache common patterns (future enhancement)
## Performance

Before mode routing:
- 37 tools available on every query
- Tool selection slow (large search space)
- Model confused by irrelevant tools
- Average response time: ~3-5 seconds

With mode routing:
- 4-6 tools per mode
- Tool selection fast (focused search)
- Model focused on relevant tools
- Average response time: ~2-3 seconds
- Intent classification overhead: ~200-500 ms

Net improvement: 20-40% faster responses with better accuracy
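As a sanity check, the midpoints of the ranges quoted above bear out that figure:

```python
# Midpoints of the quoted latency ranges (seconds).
before = (3 + 5) / 2              # ~3-5 s without routing  -> 4.0 s
after_task = (2 + 3) / 2          # ~2-3 s with focused tools -> 2.5 s
classification = (0.2 + 0.5) / 2  # ~200-500 ms intent overhead -> 0.35 s

after_total = after_task + classification       # end-to-end latency with routing
improvement = (before - after_total) / before   # fractional speedup vs. before
# improvement lands near 29%, inside the quoted 20-40% range
```

At the edges of the ranges the result varies more (a slow routed response plus maximum classification overhead can erase the gain), so treat the 20-40% figure as typical rather than guaranteed.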
## Migration

Before:

```python
manager = WorkflowManager()
workflow = manager.build_workflow()
```

After:

```python
manager = WorkflowManager(use_mode_routing=True)
workflow = manager.build_workflow()
```

That's it! Mode routing is backward compatible - no breaking changes.
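The compatibility follows from routing being opt-in. Conceptually (a simplified stand-in, not the real `WorkflowManager` constructor), existing call sites keep the legacy behavior because the new parameter defaults to off:

```python
class WorkflowManager:
    """Sketch: routing is opt-in, so existing call sites are unaffected."""
    def __init__(self, use_mode_routing=False, mode_override=None):
        self.use_mode_routing = use_mode_routing
        self.mode_override = mode_override

    def build_workflow(self):
        # Legacy code path when routing is off; routed graph when on.
        return "routed-graph" if self.use_mode_routing else "legacy-graph"

legacy = WorkflowManager()                       # old call sites unchanged
routed = WorkflowManager(use_mode_routing=True)  # new opt-in behavior
```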
### Rolling Out with a Feature Flag

```python
# Toggle per session based on a feature flag
use_routing = settings.get("enable_mode_routing", False)
manager = WorkflowManager(use_mode_routing=use_routing)
```

### Verifying Routing with Test Queries

```python
# Test with diverse queries
test_queries = [
    "Write a poem",              # Should route to author
    "Debug this code",           # Should route to code
    "Research AI ethics",        # Should route to research
    "What is photosynthesis?",   # Should route to qa
]

for query in test_queries:
    result, trajectory = track_trajectory(
        workflow,
        {"messages": [{"role": "user", "content": query}]}
    )
    print(f"{query} → {trajectory['metadata']['mode']}")
```

## Example Workflows

### Creative Writing (Author Mode)

```python
# Enable author mode
manager = WorkflowManager(use_mode_routing=True)
workflow = manager.build_workflow()

# Write creative content
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "Write a suspenseful opening paragraph for a mystery novel"
    }]
})
# Agent uses:
# - analyze_writing_style() to determine tone
# - improve_writing() for clarity
# - find_synonyms() for variety
```

### Debugging (Code Mode)

```python
# Enable code mode
manager = WorkflowManager(use_mode_routing=True)
workflow = manager.build_workflow()

# Debug code
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": """
Debug this Python function:
def factorial(n):
    if n = 0:
        return 1
    return n * factorial(n - 1)
"""
    }]
})
# Agent uses:
# - lint_code() to find the syntax error (= vs ==)
# - execute_python() to test the fix
# - analyze_code_complexity() for optimization suggestions
```

### Research with Citations (Research Mode)

```python
# Enable research mode
manager = WorkflowManager(use_mode_routing=True)
workflow = manager.build_workflow()

# Gather research
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "Research renewable energy trends and cite sources in APA format"
    }]
})
# Agent uses:
# - synthesize_sources() to combine findings
# - organize_research() to structure by topic
# - cite_sources() for APA citations
```

### Factual Questions (QA Mode)

```python
# Enable QA mode
manager = WorkflowManager(use_mode_routing=True)
workflow = manager.build_workflow()

# Answer a factual question
result = workflow.invoke({
    "messages": [{
        "role": "user",
        "content": "What causes ocean tides?"
    }]
})
# Agent uses:
# - identify_answer_type() → "explanation"
# - verify_answer() against sources
# - score_answer_confidence() → 0.95
```

## Best Practices

### Use Mode Override When the Mode Is Known

If you know the mode in advance, skip classification:

```python
# Building a code-only chat interface
manager = WorkflowManager(use_mode_routing=True, mode_override="code")
```

### Monitor Routing in Production

Track mode classification accuracy:

```python
from airunner.components.eval.utils.tracking import track_trajectory

result, trajectory = track_trajectory(workflow, input_data)

# Log to a monitoring system
logger.info(f"Mode: {trajectory['metadata']['mode']}")
logger.info(f"Confidence: {trajectory['metadata']['confidence']}")
logger.info(f"Tools: {trajectory['tools']}")
```

### Phrase Queries with Clear Intent

Better queries = better routing:

❌ "Help me" → General mode (ambiguous)
✅ "Help me debug this code" → Code mode
✅ "Help me write this essay" → Author mode

### Build Context Across Turns

```python
# Turn 1: Research
result1 = workflow.invoke({"messages": [...]})

# Turn 2: Write (uses the research context)
result2 = workflow.invoke({"messages": [..., "Now write an article from this research"]})
```

### Test the Edge Cases

Test ambiguous and multi-mode queries:

```python
test_cases = [
    "Create a Python program and explain how it works",  # Multi-mode
    "Help me",  # Ambiguous
    "",         # Empty
]
```

## Related Documentation

- Mode-Based Agent Architecture (Planning Doc)
- Trajectory Evaluation Guide
- Evaluation Testing Strategy
- Architecture Overview
Last Updated: 2025-11-01
Version: 1.0.0
Status: Production Ready