From 536256fb70013d758af7c3da7db5b8d5573016a3 Mon Sep 17 00:00:00 2001
From: manavgup <manavg@gmail.com>
Date: Thu, 27 Nov 2025 13:05:52 -0500
Subject: [PATCH] docs: Add agentic RAG architecture documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add comprehensive architecture documentation for the Agentic RAG Platform:

- agentic-ui-architecture.md: React component hierarchy, state management,
  and API integration for agent features
- backend-architecture-diagram.md: Overall backend architecture with
  Mermaid diagrams showing service layers and data flow
- mcp-integration-architecture.md: MCP client/server integration strategy,
  PR comparison (#671 vs #684), and Context Forge integration
- rag-modulo-mcp-server-architecture.md: Exposing RAG capabilities as MCP
  server with tools (rag_search, rag_ingest, etc.) and resources
- search-agent-hooks-architecture.md: 3-stage agent pipeline (pre-search,
  post-search, response) with database schema and execution flow
- system-architecture.md: Complete system architecture overview with
  technology stack and data flows

These documents guide implementation of:
- PR #695 (SPIFFE/SPIRE agent identity)
- PR #671 (MCP Gateway client)
- Issue #697 (Agent execution hooks)
- Issue #698 (MCP Server)
- Issue #699 (Agentic UI)

Closes #696

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
---
 docs/architecture/agentic-ui-architecture.md  | 1470 +++++++++++++++++
 .../backend-architecture-diagram.md           |  517 ++++++
 .../mcp-integration-architecture.md           |  200 +++
 .../rag-modulo-mcp-server-architecture.md     |  689 ++++++++
 .../search-agent-hooks-architecture.md        |  416 +++++
 docs/architecture/system-architecture.md      |  425 +++++
 6 files changed, 3717 insertions(+)
 create mode 100644 docs/architecture/agentic-ui-architecture.md
 create mode 100644 docs/architecture/backend-architecture-diagram.md
 create mode 100644 docs/architecture/mcp-integration-architecture.md
 create mode 100644 docs/architecture/rag-modulo-mcp-server-architecture.md
 create mode 100644 docs/architecture/search-agent-hooks-architecture.md
 create mode 100644 docs/architecture/system-architecture.md

diff --git a/docs/architecture/agentic-ui-architecture.md b/docs/architecture/agentic-ui-architecture.md
new file mode 100644
index 00000000..ceabf6ec
--- /dev/null
+++ b/docs/architecture/agentic-ui-architecture.md
@@ -0,0 +1,1470 @@
+# Agentic UI Architecture
+
+**Date**: November 2025
+**Status**: Architecture Design
+**Version**: 1.0
+**Related Documents**:
+
+- [MCP Integration Architecture](./mcp-integration-architecture.md)
+- [SearchService Agent Hooks Architecture](./search-agent-hooks-architecture.md)
+- [RAG Modulo MCP Server Architecture](./rag-modulo-mcp-server-architecture.md)
+
+## Overview
+
+This document describes the frontend architecture for transforming RAG Modulo into a fully
+agentic RAG solution. It covers the React component hierarchy, state management, user
+interactions, and integration patterns needed to support:
+
+1. **Agent Configuration** - Per-collection agent assignment and configuration
+2. **Artifact Display** - Rendering and downloading agent-generated artifacts
+3. **Execution Visibility** - Real-time pipeline stage and agent status indicators
+4. **Agent Management** - Dashboard for managing a user's agents and viewing analytics
+
+## Current Frontend Architecture
+
+### Existing Components (Reference)
+
+```
+frontend/src/components/
+├── agents/
+│   └── LightweightAgentOrchestration.tsx   # Existing workflow-focused agent UI
+├── search/
+│   ├── LightweightSearchInterface.tsx      # Main search chat interface
+│   ├── ChainOfThoughtAccordion.tsx         # CoT reasoning display
+│   ├── SourcesAccordion.tsx                # Document sources
+│   ├── CitationsAccordion.tsx              # Citation display
+│   └── TokenAnalysisAccordion.tsx          # Token usage metrics
+├── collections/
+│   ├── LightweightCollections.tsx          # Collection list
+│   └── LightweightCollectionDetail.tsx     # Collection settings
+└── ui/
+    ├── Card.tsx, Button.tsx, Modal.tsx     # Reusable UI components
+    └── ...
+```
+
+### Design System
+
+- **Framework**: React 18 with TypeScript
+- **Styling**: Tailwind CSS with Carbon Design System colors
+- **Icons**: Heroicons (@heroicons/react)
+- **State**: React hooks + Context (NotificationContext)
+- **Routing**: React Router DOM
+
+## New Component Architecture
+
+### Component Hierarchy
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                          App Layout                                          │
+│  ┌───────────────────────────────────────────────────────────────────────┐  │
+│  │  LightweightLayout (existing)                                         │  │
+│  │  ┌─────────────────────────────────────────────────────────────────┐  │  │
+│  │  │  Routes                                                         │  │  │
+│  │  │                                                                 │  │  │
+│  │  │  /search                                                        │  │  │
+│  │  │  └── LightweightSearchInterface (ENHANCED)                      │  │  │
+│  │  │      ├── SearchInput                                            │  │  │
+│  │  │      ├── MessageList                                            │  │  │
+│  │  │      │   └── MessageCard                                        │  │  │
+│  │  │      │       ├── ChainOfThoughtAccordion                        │  │  │
+│  │  │      │       ├── SourcesAccordion                               │  │  │
+│  │  │      │       ├── AgentArtifactsPanel (NEW)                      │  │  │
+│  │  │      │       │   └── ArtifactCard (NEW)                         │  │  │
+│  │  │      │       └── AgentExecutionIndicator (NEW)                  │  │  │
+│  │  │      └── AgentPipelineStatus (NEW)                              │  │  │
+│  │  │                                                                 │  │  │
+│  │  │  /collections/:id/settings                                      │  │  │
+│  │  │  └── LightweightCollectionDetail (ENHANCED)                     │  │  │
+│  │  │      └── CollectionAgentsTab (NEW)                              │  │  │
+│  │  │          ├── AgentList (NEW)                                    │  │  │
+│  │  │          ├── AgentConfigModal (NEW)                             │  │  │
+│  │  │          └── AgentMarketplace (NEW)                             │  │  │
+│  │  │                                                                 │  │  │
+│  │  │  /agents                                                        │  │  │
+│  │  │  └── AgentDashboard (NEW)                                       │  │  │
+│  │  │      ├── MyAgentsPanel (NEW)                                    │  │  │
+│  │  │      ├── AgentAnalytics (NEW)                                   │  │  │
+│  │  │      └── AgentAuditLog (NEW)                                    │  │  │
+│  │  │                                                                 │  │  │
+│  │  │  /agents/marketplace                                            │  │  │
+│  │  │  └── AgentMarketplacePage (NEW)                                 │  │  │
+│  │  │      ├── AgentCatalog (NEW)                                     │  │  │
+│  │  │      └── AgentDetailModal (NEW)                                 │  │  │
+│  │  └─────────────────────────────────────────────────────────────────┘  │  │
+│  └───────────────────────────────────────────────────────────────────────┘  │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+### File Structure
+
+```
+frontend/src/
+├── components/
+│   ├── agents/
+│   │   ├── LightweightAgentOrchestration.tsx  # Existing (keep for workflows)
+│   │   ├── AgentDashboard.tsx                  # NEW: Main agent management page
+│   │   ├── MyAgentsPanel.tsx                   # NEW: User's configured agents
+│   │   ├── AgentAnalytics.tsx                  # NEW: Agent usage stats
+│   │   ├── AgentAuditLog.tsx                   # NEW: Execution history
+│   │   ├── AgentMarketplacePage.tsx            # NEW: Browse available agents
+│   │   ├── AgentCatalog.tsx                    # NEW: Grid of available agents
+│   │   ├── AgentDetailModal.tsx                # NEW: Agent info and add button
+│   │   ├── CollectionAgentsTab.tsx             # NEW: Collection settings tab
+│   │   ├── AgentList.tsx                       # NEW: Agents for a collection
+│   │   ├── AgentConfigModal.tsx                # NEW: Configure agent settings
+│   │   └── AgentPriorityDragDrop.tsx           # NEW: Drag to reorder priority
+│   │
+│   ├── search/
+│   │   ├── LightweightSearchInterface.tsx      # ENHANCED: Add artifact support
+│   │   ├── AgentArtifactsPanel.tsx             # NEW: Container for artifacts
+│   │   ├── ArtifactCard.tsx                    # NEW: Single artifact display
+│   │   ├── ArtifactPreviewModal.tsx            # NEW: Preview images/PDFs
+│   │   ├── AgentExecutionIndicator.tsx         # NEW: Per-message agent badges
+│   │   └── AgentPipelineStatus.tsx             # NEW: Real-time pipeline stages
+│   │
+│   └── ui/
+│       ├── ProgressSteps.tsx                   # NEW: Pipeline stage indicator
+│       └── FileDownloadButton.tsx              # NEW: Base64 download handler
+│
+├── services/
+│   ├── apiClient.ts                            # ENHANCED: Add agent API methods
+│   └── agentApiClient.ts                       # NEW: Agent-specific API calls
+│
+├── types/
+│   └── agent.ts                                # NEW: Agent TypeScript interfaces
+│
+└── contexts/
+    └── AgentContext.tsx                        # NEW: Agent state management
+```
+
+## New Components Specification
+
+### 1. Search Interface Enhancements
+
+#### AgentArtifactsPanel
+
+Container for displaying agent-generated artifacts within search results.
+
+```typescript
+// frontend/src/components/search/AgentArtifactsPanel.tsx
+
+interface AgentArtifact {
+  agent_id: string;
+  type: 'pptx' | 'pdf' | 'png' | 'mp3' | 'html' | 'txt';
+  data: string;  // base64 encoded
+  filename: string;
+  metadata: Record<string, any>;
+}
+
+interface AgentArtifactsPanelProps {
+  artifacts: AgentArtifact[];
+  isLoading?: boolean;
+}
+
+const AgentArtifactsPanel: React.FC<AgentArtifactsPanelProps> = ({
+  artifacts,
+  isLoading
+}) => {
+  if (!artifacts?.length && !isLoading) return null;
+
+  return (
+    <div className="mt-4 border-t border-gray-20 pt-4">
+      <div className="flex items-center space-x-2 mb-3">
+        <DocumentIcon className="w-4 h-4 text-purple-60" />
+        <h4 className="text-sm font-medium text-gray-100">
+          Generated Artifacts ({artifacts?.length ?? 0})
+        </h4>
+      </div>
+
+      {isLoading ? (
+        <div className="grid grid-cols-2 md:grid-cols-4 gap-3">
+          {[1, 2].map(i => (
+            <Skeleton key={i} className="h-24 rounded-lg" />
+          ))}
+        </div>
+      ) : (
+        <div className="grid grid-cols-2 md:grid-cols-4 gap-3">
+          {artifacts.map((artifact, index) => (
+            <ArtifactCard key={index} artifact={artifact} />
+          ))}
+        </div>
+      )}
+    </div>
+  );
+};
+```
+
+#### ArtifactCard
+
+Individual artifact display with preview and download actions.
+
+```typescript
+// frontend/src/components/search/ArtifactCard.tsx
+
+interface ArtifactCardProps {
+  artifact: AgentArtifact;
+}
+
+const ArtifactCard: React.FC<ArtifactCardProps> = ({ artifact }) => {
+  const [previewOpen, setPreviewOpen] = useState(false);
+
+  const getIcon = () => {
+    switch (artifact.type) {
+      case 'pptx': return <PresentationChartBarIcon />;
+      case 'pdf': return <DocumentTextIcon />;
+      case 'png': return <PhotoIcon />;
+      case 'mp3': return <MusicalNoteIcon />;
+      case 'html': return <CodeBracketIcon />;
+      default: return <DocumentIcon />;
+    }
+  };
+
+  const getLabel = () => {
+    switch (artifact.type) {
+      case 'pptx': return 'PowerPoint';
+      case 'pdf': return 'PDF Report';
+      case 'png': return 'Chart';
+      case 'mp3': return 'Audio';
+      case 'html': return 'HTML';
+      default: return 'File';
+    }
+  };
+
+  const canPreview = ['png', 'pdf'].includes(artifact.type);
+
+  const handleDownload = () => {
+    const mimeTypes: Record<string, string> = {
+      pptx: 'application/vnd.openxmlformats-officedocument.presentationml.presentation',
+      pdf: 'application/pdf',
+      png: 'image/png',
+      mp3: 'audio/mpeg',
+      html: 'text/html',
+      txt: 'text/plain'
+    };
+
+    const blob = base64ToBlob(artifact.data, mimeTypes[artifact.type]);
+    const url = URL.createObjectURL(blob);
+    const a = document.createElement('a');
+    a.href = url;
+    a.download = artifact.filename;
+    a.click();
+    URL.revokeObjectURL(url);
+  };
+
+  return (
+    <>
+      <div className="bg-gray-10 rounded-lg p-3 hover:bg-gray-20 transition-colors">
+        <div className="flex items-center space-x-2 mb-2">
+          <div className="p-1.5 bg-purple-10 rounded text-purple-60">
+            {getIcon()}
+          </div>
+          <div className="flex-1 min-w-0">
+            <p className="text-xs font-medium text-gray-100 truncate">
+              {getLabel()}
+            </p>
+            <p className="text-xs text-gray-60 truncate">
+              {artifact.filename}
+            </p>
+          </div>
+        </div>
+
+        <div className="flex space-x-2">
+          {canPreview && (
+            <button
+              onClick={() => setPreviewOpen(true)}
+              className="flex-1 text-xs px-2 py-1 bg-gray-20 hover:bg-gray-30 rounded text-gray-100"
+            >
+              Preview
+            </button>
+          )}
+          <button
+            onClick={handleDownload}
+            className="flex-1 text-xs px-2 py-1 bg-blue-60 hover:bg-blue-70 rounded text-white"
+          >
+            Download
+          </button>
+        </div>
+
+        {artifact.metadata && (
+          <p className="text-xs text-gray-60 mt-2">
+            {artifact.metadata.slides && `${artifact.metadata.slides} slides`}
+            {artifact.metadata.width && `${artifact.metadata.width}x${artifact.metadata.height}`}
+          </p>
+        )}
+      </div>
+
+      {previewOpen && (
+        <ArtifactPreviewModal
+          artifact={artifact}
+          onClose={() => setPreviewOpen(false)}
+        />
+      )}
+    </>
+  );
+};
+```
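+
+`ArtifactCard` calls a `base64ToBlob` helper that this document does not define. A minimal
+sketch follows; the name and signature are inferred from the call site, and the placement in a
+shared utility module is an assumption rather than something the codebase confirms.
+
+```typescript
+// Hypothetical helper: decode a base64 string into a Blob with the given MIME type.
+function base64ToBlob(base64: string, mimeType: string): Blob {
+  // atob returns a "binary string" in which each character represents one byte
+  const binary = atob(base64);
+  const bytes = new Uint8Array(binary.length);
+  for (let i = 0; i < binary.length; i++) {
+    bytes[i] = binary.charCodeAt(i);
+  }
+  return new Blob([bytes], { type: mimeType });
+}
+```
+
+The byte-by-byte copy avoids passing the decoded string to `Blob` directly, which would
+re-encode it as UTF-8 and corrupt binary formats such as PNG or PPTX.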
+
+#### AgentPipelineStatus
+
+Real-time pipeline stage indicator shown during search.
+
+```typescript
+// frontend/src/components/search/AgentPipelineStatus.tsx
+
+type PipelineStage = 'pre_search' | 'search' | 'post_search' | 'generation' | 'response_agents' | 'complete';
+
+interface AgentPipelineStatusProps {
+  currentStage: PipelineStage;
+  stages: {
+    id: PipelineStage;
+    label: string;
+    agentCount: number;
+    status: 'pending' | 'running' | 'completed' | 'error';
+    duration?: number;
+  }[];
+  isVisible: boolean;
+}
+
+const AgentPipelineStatus: React.FC<AgentPipelineStatusProps> = ({
+  currentStage,
+  stages,
+  isVisible
+}) => {
+  if (!isVisible) return null;
+
+  return (
+    <div className="bg-blue-10 border border-blue-20 rounded-lg p-4 mb-4">
+      <div className="flex items-center space-x-2 mb-3">
+        <BoltIcon className="w-4 h-4 text-blue-60 animate-pulse" />
+        <span className="text-sm font-medium text-blue-60">
+          Agent Pipeline Processing
+        </span>
+      </div>
+
+      <div className="flex items-center justify-between">
+        {stages.map((stage, index) => (
+          <React.Fragment key={stage.id}>
+            <div className="flex flex-col items-center">
+              <div className={`
+                w-8 h-8 rounded-full flex items-center justify-center text-xs font-medium
+                ${stage.status === 'completed' ? 'bg-green-50 text-white' : ''}
+                ${stage.status === 'running' ? 'bg-blue-60 text-white animate-pulse' : ''}
+                ${stage.status === 'pending' ? 'bg-gray-20 text-gray-60' : ''}
+                ${stage.status === 'error' ? 'bg-red-50 text-white' : ''}
+              `}>
+                {stage.status === 'completed' ? (
+                  <CheckIcon className="w-4 h-4" />
+                ) : stage.status === 'running' ? (
+                  <ArrowPathIcon className="w-4 h-4 animate-spin" />
+                ) : (
+                  stage.agentCount
+                )}
+              </div>
+              <span className="text-xs text-gray-70 mt-1 text-center max-w-[80px]">
+                {stage.label}
+              </span>
+              {stage.duration !== undefined && (
+                <span className="text-xs text-gray-60">
+                  {stage.duration}ms
+                </span>
+              )}
+            </div>
+
+            {index < stages.length - 1 && (
+              <div className={`
+                flex-1 h-0.5 mx-2
+                ${stages[index + 1].status !== 'pending' ? 'bg-blue-60' : 'bg-gray-20'}
+              `} />
+            )}
+          </React.Fragment>
+        ))}
+      </div>
+    </div>
+  );
+};
+```
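+
+The `stages` prop could be derived from the current stage with a small pure function. The
+sketch below is illustrative only: `buildStages`, the labels, and the choice to treat
+`complete` as a terminal marker rather than a rendered step are assumptions, not part of the
+existing code.
+
+```typescript
+type PipelineStage =
+  | 'pre_search' | 'search' | 'post_search'
+  | 'generation' | 'response_agents' | 'complete';
+type RenderedStage = Exclude<PipelineStage, 'complete'>;
+type StageStatus = 'pending' | 'running' | 'completed' | 'error';
+
+interface StageView {
+  id: PipelineStage;
+  label: string;
+  agentCount: number;
+  status: StageStatus;
+}
+
+// Display order of the rendered steps (assumed to match pipeline execution order)
+const STAGE_ORDER: RenderedStage[] = [
+  'pre_search', 'search', 'post_search', 'generation', 'response_agents'
+];
+
+// Hypothetical labels; the real UI copy may differ
+const STAGE_LABELS: Record<RenderedStage, string> = {
+  pre_search: 'Pre-Search',
+  search: 'Search',
+  post_search: 'Post-Search',
+  generation: 'Generation',
+  response_agents: 'Response Agents'
+};
+
+function buildStages(
+  current: PipelineStage,
+  agentCounts: Partial<Record<RenderedStage, number>> = {}
+): StageView[] {
+  // 'complete' places the cursor past the last step, so every step reads as completed
+  const currentIndex =
+    current === 'complete' ? STAGE_ORDER.length : STAGE_ORDER.indexOf(current);
+
+  return STAGE_ORDER.map((id, index) => {
+    let status: StageStatus = 'pending';
+    if (index < currentIndex) status = 'completed';
+    else if (index === currentIndex) status = 'running';
+    return { id, label: STAGE_LABELS[id], agentCount: agentCounts[id] ?? 0, status };
+  });
+}
+```
+
+This sketch never produces the `error` status; a real implementation would need failure
+information from the backend to set it.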
+
+#### AgentExecutionIndicator
+
+Badge showing which agents processed a response.
+
+```typescript
+// frontend/src/components/search/AgentExecutionIndicator.tsx
+
+interface AgentExecution {
+  agent_id: string;
+  agent_name: string;
+  stage: 'pre_search' | 'post_search' | 'response';
+  duration_ms: number;
+  success: boolean;
+}
+
+interface AgentExecutionIndicatorProps {
+  executions: AgentExecution[];
+}
+
+const AgentExecutionIndicator: React.FC<AgentExecutionIndicatorProps> = ({
+  executions
+}) => {
+  const [expanded, setExpanded] = useState(false);
+
+  if (!executions?.length) return null;
+
+  const successCount = executions.filter(e => e.success).length;
+  const totalDuration = executions.reduce((sum, e) => sum + e.duration_ms, 0);
+
+  return (
+    <div className="mt-2">
+      <button
+        onClick={() => setExpanded(!expanded)}
+        className="flex items-center space-x-2 text-xs text-gray-60 hover:text-gray-100"
+      >
+        <CpuChipIcon className="w-3 h-3" />
+        <span>
+          {successCount}/{executions.length} agents • {totalDuration}ms
+        </span>
+        <ChevronDownIcon className={`w-3 h-3 transition-transform ${expanded ? 'rotate-180' : ''}`} />
+      </button>
+
+      {expanded && (
+        <div className="mt-2 space-y-1 pl-5">
+          {executions.map((exec, index) => (
+            <div
+              key={index}
+              className="flex items-center space-x-2 text-xs"
+            >
+              <span className={`w-1.5 h-1.5 rounded-full ${exec.success ? 'bg-green-50' : 'bg-red-50'}`} />
+              <span className="text-gray-70">{exec.agent_name}</span>
+              <span className="text-gray-50">({exec.stage})</span>
+              <span className="text-gray-60">{exec.duration_ms}ms</span>
+            </div>
+          ))}
+        </div>
+      )}
+    </div>
+  );
+};
+```
+
+### 2. Collection Agent Configuration
+
+#### CollectionAgentsTab
+
+Tab on the collection settings page for configuring agents.
+
+```typescript
+// frontend/src/components/agents/CollectionAgentsTab.tsx
+
+interface CollectionAgentsTabProps {
+  collectionId: string;
+}
+
+const CollectionAgentsTab: React.FC<CollectionAgentsTabProps> = ({
+  collectionId
+}) => {
+  const [agents, setAgents] = useState<CollectionAgent[]>([]);
+  const [availableAgents, setAvailableAgents] = useState<AgentManifest[]>([]);
+  const [isLoading, setIsLoading] = useState(true);
+  const [showAddModal, setShowAddModal] = useState(false);
+  const [editingAgent, setEditingAgent] = useState<CollectionAgent | null>(null);
+  const { addNotification } = useNotification();
+
+  useEffect(() => {
+    loadAgents();
+  }, [collectionId]);
+
+  const loadAgents = async () => {
+    setIsLoading(true);
+    try {
+      const [collectionAgents, allAgents] = await Promise.all([
+        agentApiClient.getCollectionAgents(collectionId),
+        agentApiClient.getAvailableAgents()
+      ]);
+      setAgents(collectionAgents);
+      setAvailableAgents(allAgents);
+    } catch (error) {
+      addNotification('error', 'Error', 'Failed to load agents');
+    } finally {
+      setIsLoading(false);
+    }
+  };
+
+  const handleToggleAgent = async (agentConfigId: string, enabled: boolean) => {
+    try {
+      await agentApiClient.updateAgentConfig(agentConfigId, { enabled });
+      setAgents(prev => prev.map(a =>
+        a.id === agentConfigId ? { ...a, enabled } : a
+      ));
+    } catch (error) {
+      addNotification('error', 'Error', 'Failed to update agent');
+    }
+  };
+
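+  const handleReorderAgents = async (reorderedAgents: CollectionAgent[]) => {
+    try {
+      // Reassign priorities from the new order; this list covers a single stage only
+      const updated = reorderedAgents.map((agent, index) => ({ ...agent, priority: index }));
+      await agentApiClient.batchUpdatePriorities(updated.map(({ id, priority }) => ({ id, priority })));
+
+      // Replace only this stage's agents so agents from other stages stay in state
+      const ids = new Set(updated.map(a => a.id));
+      setAgents(prev => [...prev.filter(a => !ids.has(a.id)), ...updated]);
+    } catch (error) {
+      addNotification('error', 'Error', 'Failed to reorder agents');
+    }
+  };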
+
+  return (
+    <div className="space-y-6">
+      {/* Header */}
+      <div className="flex items-center justify-between">
+        <div>
+          <h3 className="text-lg font-semibold text-gray-100">Collection Agents</h3>
+          <p className="text-sm text-gray-70">
+            Configure AI agents that enhance search and generate artifacts
+          </p>
+        </div>
+        <button
+          onClick={() => setShowAddModal(true)}
+          className="btn-primary flex items-center space-x-2"
+        >
+          <PlusIcon className="w-4 h-4" />
+          <span>Add Agent</span>
+        </button>
+      </div>
+
+      {/* Agent List by Stage */}
+      {isLoading ? (
+        <div className="space-y-4">
+          {[1, 2, 3].map(i => <Skeleton key={i} className="h-20" />)}
+        </div>
+      ) : (
+        <>
+          {/* Pre-Search Agents */}
+          <AgentStageSection
+            title="Pre-Search Agents"
+            description="Transform queries before vector search"
+            stage="pre_search"
+            agents={agents.filter(a => a.trigger_stage === 'pre_search')}
+            onToggle={handleToggleAgent}
+            onEdit={setEditingAgent}
+            onReorder={handleReorderAgents}
+          />
+
+          {/* Post-Search Agents */}
+          <AgentStageSection
+            title="Post-Search Agents"
+            description="Process retrieved documents before answer generation"
+            stage="post_search"
+            agents={agents.filter(a => a.trigger_stage === 'post_search')}
+            onToggle={handleToggleAgent}
+            onEdit={setEditingAgent}
+            onReorder={handleReorderAgents}
+          />
+
+          {/* Response Agents */}
+          <AgentStageSection
+            title="Response Agents"
+            description="Generate artifacts from search results (runs in parallel)"
+            stage="response"
+            agents={agents.filter(a => a.trigger_stage === 'response')}
+            onToggle={handleToggleAgent}
+            onEdit={setEditingAgent}
+            onReorder={handleReorderAgents}
+          />
+        </>
+      )}
+
+      {/* Add Agent Modal */}
+      {showAddModal && (
+        <AgentMarketplaceModal
+          availableAgents={availableAgents}
+          collectionId={collectionId}
+          onAdd={() => {
+            loadAgents();
+            setShowAddModal(false);
+          }}
+          onClose={() => setShowAddModal(false)}
+        />
+      )}
+
+      {/* Edit Agent Modal */}
+      {editingAgent && (
+        <AgentConfigModal
+          agent={editingAgent}
+          onSave={() => {
+            loadAgents();
+            setEditingAgent(null);
+          }}
+          onClose={() => setEditingAgent(null)}
+        />
+      )}
+    </div>
+  );
+};
+```
+
+#### AgentStageSection
+
+Section component for agents at a specific pipeline stage.
+
+```typescript
+// frontend/src/components/agents/AgentStageSection.tsx
+
+interface AgentStageSectionProps {
+  title: string;
+  description: string;
+  stage: 'pre_search' | 'post_search' | 'response';
+  agents: CollectionAgent[];
+  onToggle: (id: string, enabled: boolean) => void;
+  onEdit: (agent: CollectionAgent) => void;
+  onReorder: (agents: CollectionAgent[]) => void;
+}
+
+const AgentStageSection: React.FC<AgentStageSectionProps> = ({
+  title,
+  description,
+  stage,
+  agents,
+  onToggle,
+  onEdit,
+  onReorder
+}) => {
+  const stageIcons = {
+    pre_search: <FunnelIcon className="w-5 h-5" />,
+    post_search: <AdjustmentsHorizontalIcon className="w-5 h-5" />,
+    response: <DocumentDuplicateIcon className="w-5 h-5" />
+  };
+
+  const stageColors = {
+    pre_search: 'bg-yellow-10 text-yellow-60',
+    post_search: 'bg-blue-10 text-blue-60',
+    response: 'bg-purple-10 text-purple-60'
+  };
+
+  return (
+    <div className="card p-4">
+      <div className="flex items-center space-x-3 mb-4">
+        <div className={`p-2 rounded-lg ${stageColors[stage]}`}>
+          {stageIcons[stage]}
+        </div>
+        <div>
+          <h4 className="font-medium text-gray-100">{title}</h4>
+          <p className="text-sm text-gray-60">{description}</p>
+        </div>
+      </div>
+
+      {agents.length === 0 ? (
+        <div className="text-center py-6 text-gray-60">
+          <CubeTransparentIcon className="w-8 h-8 mx-auto mb-2 opacity-50" />
+          <p className="text-sm">No agents configured for this stage</p>
+        </div>
+      ) : (
+        <DragDropContext onDragEnd={(result) => {
+          if (!result.destination) return;
+          const items = Array.from(agents);
+          const [reordered] = items.splice(result.source.index, 1);
+          items.splice(result.destination.index, 0, reordered);
+          onReorder(items);
+        }}>
+          <Droppable droppableId={stage}>
+            {(provided) => (
+              <div
+                {...provided.droppableProps}
+                ref={provided.innerRef}
+                className="space-y-2"
+              >
+                {agents.map((agent, index) => (
+                  <Draggable key={agent.id} draggableId={agent.id} index={index}>
+                    {(provided, snapshot) => (
+                      <div
+                        ref={provided.innerRef}
+                        {...provided.draggableProps}
+                        className={`
+                          flex items-center p-3 bg-gray-10 rounded-lg
+                          ${snapshot.isDragging ? 'shadow-lg ring-2 ring-blue-60' : ''}
+                        `}
+                      >
+                        <div
+                          {...provided.dragHandleProps}
+                          className="mr-3 cursor-grab text-gray-40 hover:text-gray-60"
+                        >
+                          <Bars3Icon className="w-4 h-4" />
+                        </div>
+
+                        <div className="flex-1">
+                          <p className="font-medium text-gray-100">{agent.name}</p>
+                          <p className="text-xs text-gray-60">{agent.description}</p>
+                        </div>
+
+                        <div className="flex items-center space-x-3">
+                          <span className="text-xs text-gray-50">
+                            Priority: {agent.priority}
+                          </span>
+
+                          <Switch
+                            checked={agent.enabled}
+                            onChange={(enabled) => onToggle(agent.id, enabled)}
+                            className={`
+                              ${agent.enabled ? 'bg-green-50' : 'bg-gray-30'}
+                              relative inline-flex h-5 w-9 items-center rounded-full
+                            `}
+                          >
+                            <span
+                              className={`
+                                ${agent.enabled ? 'translate-x-5' : 'translate-x-1'}
+                                inline-block h-3 w-3 transform rounded-full bg-white transition
+                              `}
+                            />
+                          </Switch>
+
+                          <button
+                            onClick={() => onEdit(agent)}
+                            className="p-1 text-gray-60 hover:text-gray-100"
+                          >
+                            <CogIcon className="w-4 h-4" />
+                          </button>
+                        </div>
+                      </div>
+                    )}
+                  </Draggable>
+                ))}
+                {provided.placeholder}
+              </div>
+            )}
+          </Droppable>
+        </DragDropContext>
+      )}
+    </div>
+  );
+};
+```
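+
+The splice logic in the `onDragEnd` handler above can be isolated as a pure function, which
+makes the reordering easy to unit test without rendering the drag-and-drop context.
+`moveItem` is a hypothetical helper name, not an existing utility in the codebase.
+
+```typescript
+// Return a copy of `items` with the element at `from` moved to position `to`.
+function moveItem<T>(items: readonly T[], from: number, to: number): T[] {
+  const next = Array.from(items);        // copy so the caller's array is not mutated
+  const [moved] = next.splice(from, 1);  // remove the dragged element
+  next.splice(to, 0, moved);             // re-insert it at the drop position
+  return next;
+}
+```
+
+With this helper, the handler body would reduce to
+`onReorder(moveItem(agents, result.source.index, result.destination.index))` after the
+`result.destination` check.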
+
+#### AgentConfigModal
+
+Modal for configuring agent-specific settings.
+
+```typescript
+// frontend/src/components/agents/AgentConfigModal.tsx
+
+interface AgentConfigModalProps {
+  agent: CollectionAgent;
+  onSave: () => void;
+  onClose: () => void;
+}
+
+const AgentConfigModal: React.FC<AgentConfigModalProps> = ({
+  agent,
+  onSave,
+  onClose
+}) => {
+  const [config, setConfig] = useState(agent.config);
+  const [isSaving, setIsSaving] = useState(false);
+  const { addNotification } = useNotification();
+
+  // Generate form fields from agent's config schema
+  const renderConfigField = (key: string, schema: any) => {
+    const value = config.settings?.[key] ?? schema.default;
+
+    switch (schema.type) {
+      case 'integer':
+        return (
+          <div key={key}>
+            <label className="block text-sm font-medium text-gray-100 mb-1">
+              {schema.title || key}
+            </label>
+            <input
+              type="number"
+              min={schema.minimum}
+              max={schema.maximum}
+              value={value}
+              onChange={(e) => setConfig({
+                ...config,
+                settings: { ...config.settings, [key]: parseInt(e.target.value, 10) }
+              })}
+              className="input-field w-full"
+            />
+            {schema.description && (
+              <p className="text-xs text-gray-60 mt-1">{schema.description}</p>
+            )}
+          </div>
+        );
+
+      case 'boolean':
+        return (
+          <div key={key} className="flex items-center justify-between">
+            <div>
+              <label className="block text-sm font-medium text-gray-100">
+                {schema.title || key}
+              </label>
+              {schema.description && (
+                <p className="text-xs text-gray-60">{schema.description}</p>
+              )}
+            </div>
+            <Switch
+              checked={value}
+              onChange={(checked) => setConfig({
+                ...config,
+                settings: { ...config.settings, [key]: checked }
+              })}
+            />
+          </div>
+        );
+
+      case 'string':
+        if (schema.enum) {
+          return (
+            <div key={key}>
+              <label className="block text-sm font-medium text-gray-100 mb-1">
+                {schema.title || key}
+              </label>
+              <select
+                value={value}
+                onChange={(e) => setConfig({
+                  ...config,
+                  settings: { ...config.settings, [key]: e.target.value }
+                })}
+                className="input-field w-full"
+              >
+                {schema.enum.map((opt: string) => (
+                  <option key={opt} value={opt}>{opt}</option>
+                ))}
+              </select>
+            </div>
+          );
+        }
+        return (
+          <div key={key}>
+            <label className="block text-sm font-medium text-gray-100 mb-1">
+              {schema.title || key}
+            </label>
+            <input
+              type="text"
+              value={value}
+              onChange={(e) => setConfig({
+                ...config,
+                settings: { ...config.settings, [key]: e.target.value }
+              })}
+              className="input-field w-full"
+            />
+          </div>
+        );
+
+      default:
+        return null;
+    }
+  };
+
+  const handleSave = async () => {
+    setIsSaving(true);
+    try {
+      await agentApiClient.updateAgentConfig(agent.id, { config });
+      addNotification('success', 'Saved', 'Agent configuration updated');
+      onSave();
+    } catch (error) {
+      addNotification('error', 'Error', 'Failed to save configuration');
+    } finally {
+      setIsSaving(false);
+    }
+  };
+
+  return (
+    <Modal open onClose={onClose} size="md">
+      <div className="p-6">
+        <h3 className="text-lg font-semibold text-gray-100 mb-4">
+          Configure {agent.name}
+        </h3>
+
+        <div className="space-y-4">
+          {/* Agent info */}
+          <div className="bg-gray-10 p-3 rounded-lg">
+            <p className="text-sm text-gray-70">{agent.description}</p>
+            <div className="flex items-center space-x-4 mt-2 text-xs text-gray-60">
+              <span>Stage: {agent.trigger_stage}</span>
+              <span>Type: {agent.config.type}</span>
+            </div>
+          </div>
+
+          {/* Dynamic config fields */}
+          {agent.config_schema?.properties && (
+            <div className="space-y-4">
+              {Object.entries(agent.config_schema.properties).map(([key, schema]) =>
+                renderConfigField(key, schema)
+              )}
+            </div>
+          )}
+        </div>
+
+        <div className="flex justify-end space-x-3 mt-6">
+          <button onClick={onClose} className="btn-secondary">
+            Cancel
+          </button>
+          <button
+            onClick={handleSave}
+            disabled={isSaving}
+            className="btn-primary"
+          >
+            {isSaving ? 'Saving...' : 'Save Configuration'}
+          </button>
+        </div>
+      </div>
+    </Modal>
+  );
+};
+```
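+
+The number field above writes whatever `parseInt` returns into `settings`, so an empty input
+(`NaN`) or an out-of-range value reaches the saved config unchanged. A minimal sketch of a
+coercion helper that could sit between the input and `setConfig`. `NumberFieldSchema` and
+`coerceNumberSetting` are hypothetical names, and the only schema keys assumed are the
+`minimum` and `maximum` the field already reads:
+
+```typescript
+// Hypothetical subset of a JSON-Schema property, limited to what the number field uses.
+interface NumberFieldSchema {
+  minimum?: number;
+  maximum?: number;
+}
+
+// Parse raw input text and clamp it to the schema bounds.
+// Returns `fallback` when the text is not a finite number (for example an empty field).
+function coerceNumberSetting(
+  raw: string,
+  schema: NumberFieldSchema,
+  fallback: number
+): number {
+  const parsed = Number.parseFloat(raw);
+  if (!Number.isFinite(parsed)) return fallback;
+  const lower = schema.minimum ?? Number.NEGATIVE_INFINITY;
+  const upper = schema.maximum ?? Number.POSITIVE_INFINITY;
+  return Math.min(upper, Math.max(lower, parsed));
+}
+```
+
+In the modal this would replace the inline parse, for example
+`[key]: coerceNumberSetting(e.target.value, schema, value)`, so the previous value is kept
+while the field is empty.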
+
+### 3. Agent Management Dashboard
+
+#### AgentDashboard
+
+Main page for managing a user's agents across all collections.
+
+```typescript
+// frontend/src/components/agents/AgentDashboard.tsx
+
+const AgentDashboard: React.FC = () => {
+  const [activeTab, setActiveTab] = useState<'my-agents' | 'analytics' | 'audit'>('my-agents');
+
+  return (
+    <div className="min-h-screen bg-gray-10 p-6">
+      <div className="max-w-6xl mx-auto">
+        {/* Header */}
+        <div className="mb-6">
+          <h1 className="text-2xl font-semibold text-gray-100">Agent Management</h1>
+          <p className="text-gray-70">
+            Configure and monitor AI agents for your document collections
+          </p>
+        </div>
+
+        {/* Tabs */}
+        <div className="mb-6">
+          <nav className="flex space-x-4 border-b border-gray-20">
+            {[
+              { id: 'my-agents', label: 'My Agents', icon: CubeIcon },
+              { id: 'analytics', label: 'Analytics', icon: ChartBarIcon },
+              { id: 'audit', label: 'Audit Log', icon: ClipboardDocumentListIcon },
+            ].map((tab) => (
+              <button
+                key={tab.id}
+                onClick={() => setActiveTab(tab.id as typeof activeTab)}
+                className={`
+                  flex items-center space-x-2 px-4 py-3 text-sm font-medium border-b-2 -mb-px
+                  ${activeTab === tab.id
+                    ? 'border-blue-60 text-blue-60'
+                    : 'border-transparent text-gray-70 hover:text-gray-100'
+                  }
+                `}
+              >
+                <tab.icon className="w-4 h-4" />
+                <span>{tab.label}</span>
+              </button>
+            ))}
+          </nav>
+        </div>
+
+        {/* Tab Content */}
+        {activeTab === 'my-agents' && <MyAgentsPanel />}
+        {activeTab === 'analytics' && <AgentAnalytics />}
+        {activeTab === 'audit' && <AgentAuditLog />}
+      </div>
+    </div>
+  );
+};
+```
+
+### 4. Agent Marketplace
+
+#### AgentMarketplacePage
+
+Browse and discover available agents.
+
+```typescript
+// frontend/src/components/agents/AgentMarketplacePage.tsx
+
+interface AgentManifest {
+  agent_id: string;
+  name: string;
+  version: string;
+  description: string;
+  capabilities: string[];
+  config_schema: Record<string, any>;
+  input_schema: Record<string, any>;
+  output_schema: Record<string, any>;
+  category: 'pre_search' | 'post_search' | 'response';
+  icon?: string;
+  author?: string;
+  downloads?: number;
+}
+
+const AgentMarketplacePage: React.FC = () => {
+  const [agents, setAgents] = useState<AgentManifest[]>([]);
+  const [filter, setFilter] = useState<string>('all');
+  const [search, setSearch] = useState('');
+  const [selectedAgent, setSelectedAgent] = useState<AgentManifest | null>(null);
+
+  useEffect(() => {
+    loadAgents();
+  }, []);
+
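+  const loadAgents = async () => {
+    try {
+      const data = await agentApiClient.getAvailableAgents();
+      setAgents(data);
+    } catch (error) {
+      // Without this, a failed request surfaces as an unhandled promise rejection
+      console.error('Failed to load agents', error);
+    }
+  };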
+
+  const filteredAgents = agents.filter(agent => {
+    const matchesFilter = filter === 'all' || agent.category === filter;
+    const matchesSearch = !search ||
+      agent.name.toLowerCase().includes(search.toLowerCase()) ||
+      agent.description.toLowerCase().includes(search.toLowerCase());
+    return matchesFilter && matchesSearch;
+  });
+
+  const categories = [
+    { id: 'all', label: 'All Agents' },
+    { id: 'pre_search', label: 'Pre-Search' },
+    { id: 'post_search', label: 'Post-Search' },
+    { id: 'response', label: 'Response' },
+  ];
+
+  return (
+    <div className="min-h-screen bg-gray-10 p-6">
+      <div className="max-w-6xl mx-auto">
+        {/* Header */}
+        <div className="mb-6">
+          <h1 className="text-2xl font-semibold text-gray-100">Agent Marketplace</h1>
+          <p className="text-gray-70">
+            Discover and add AI agents to enhance your RAG workflows
+          </p>
+        </div>
+
+        {/* Filters */}
+        <div className="flex items-center space-x-4 mb-6">
+          <div className="relative flex-1 max-w-md">
+            <MagnifyingGlassIcon className="absolute left-3 top-1/2 -translate-y-1/2 w-4 h-4 text-gray-60" />
+            <input
+              type="text"
+              placeholder="Search agents..."
+              value={search}
+              onChange={(e) => setSearch(e.target.value)}
+              className="input-field w-full pl-10"
+            />
+          </div>
+
+          <div className="flex space-x-2">
+            {categories.map(cat => (
+              <button
+                key={cat.id}
+                onClick={() => setFilter(cat.id)}
+                className={`
+                  px-3 py-1.5 text-sm rounded-full
+                  ${filter === cat.id
+                    ? 'bg-blue-60 text-white'
+                    : 'bg-gray-20 text-gray-70 hover:bg-gray-30'
+                  }
+                `}
+              >
+                {cat.label}
+              </button>
+            ))}
+          </div>
+        </div>
+
+        {/* Agent Grid */}
+        <div className="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-4">
+          {filteredAgents.map(agent => (
+            <div
+              key={agent.agent_id}
+              className="card p-4 hover:shadow-md transition-shadow cursor-pointer"
+              onClick={() => setSelectedAgent(agent)}
+            >
+              <div className="flex items-start space-x-3">
+                <div className="p-2 bg-purple-10 rounded-lg text-purple-60">
+                  <CubeIcon className="w-6 h-6" />
+                </div>
+                <div className="flex-1">
+                  <h3 className="font-medium text-gray-100">{agent.name}</h3>
+                  <p className="text-xs text-gray-60">v{agent.version}</p>
+                </div>
+              </div>
+
+              <p className="text-sm text-gray-70 mt-3 line-clamp-2">
+                {agent.description}
+              </p>
+
+              <div className="flex items-center justify-between mt-4">
+                <span className={`
+                  px-2 py-0.5 text-xs rounded
+                  ${agent.category === 'pre_search' ? 'bg-yellow-10 text-yellow-60' : ''}
+                  ${agent.category === 'post_search' ? 'bg-blue-10 text-blue-60' : ''}
+                  ${agent.category === 'response' ? 'bg-purple-10 text-purple-60' : ''}
+                `}>
+                  {agent.category.replace('_', '-')}
+                </span>
+
+                <button className="text-sm text-blue-60 hover:text-blue-70">
+                  View Details →
+                </button>
+              </div>
+            </div>
+          ))}
+        </div>
+
+        {/* Agent Detail Modal */}
+        {selectedAgent && (
+          <AgentDetailModal
+            agent={selectedAgent}
+            onClose={() => setSelectedAgent(null)}
+          />
+        )}
+      </div>
+    </div>
+  );
+};
+```
+
+## API Integration
+
+### Agent API Client
+
+```typescript
+// frontend/src/services/agentApiClient.ts
+
+import apiClient from './apiClient';
+
+export interface AgentManifest {
+  agent_id: string;
+  name: string;
+  version: string;
+  description: string;
+  capabilities: string[];
+  category: 'pre_search' | 'post_search' | 'response';
+  config_schema: Record<string, any>;
+}
+
+export interface CollectionAgent {
+  id: string;
+  agent_id: string;
+  name: string;
+  description: string;
+  config: {
+    type: 'mcp' | 'builtin';
+    context_forge_tool_id?: string;
+    settings: Record<string, any>;
+  };
+  config_schema?: Record<string, any>;
+  enabled: boolean;
+  trigger_stage: 'pre_search' | 'post_search' | 'response';
+  priority: number;
+}
+
+export interface AgentExecution {
+  id: string;
+  agent_id: string;
+  agent_name: string;
+  collection_id: string;
+  trigger_stage: string;
+  success: boolean;
+  duration_ms: number;
+  error?: string;
+  created_at: string;
+}
+
+const agentApiClient = {
+  // Available agents
+  getAvailableAgents: async (): Promise<AgentManifest[]> => {
+    const response = await apiClient.get('/api/v1/agents/');
+    return response.data;
+  },
+
+  getAgentsByCapability: async (capability: string): Promise<AgentManifest[]> => {
+    const response = await apiClient.get(
+      `/api/v1/agents/capabilities/${encodeURIComponent(capability)}`
+    );
+    return response.data;
+  },
+
+  // User's agent configurations
+  getUserAgentConfigs: async (): Promise<CollectionAgent[]> => {
+    const response = await apiClient.get('/api/v1/agents/configs');
+    return response.data;
+  },
+
+  createAgentConfig: async (config: Partial<CollectionAgent>): Promise<CollectionAgent> => {
+    const response = await apiClient.post('/api/v1/agents/configs', config);
+    return response.data;
+  },
+
+  updateAgentConfig: async (
+    configId: string,
+    updates: Partial<CollectionAgent>
+  ): Promise<CollectionAgent> => {
+    const response = await apiClient.patch(`/api/v1/agents/configs/${configId}`, updates);
+    return response.data;
+  },
+
+  deleteAgentConfig: async (configId: string): Promise<void> => {
+    await apiClient.delete(`/api/v1/agents/configs/${configId}`);
+  },
+
+  // Collection agents
+  getCollectionAgents: async (collectionId: string): Promise<CollectionAgent[]> => {
+    const response = await apiClient.get(`/api/v1/agents/collections/${collectionId}/agents`);
+    return response.data;
+  },
+
+  addAgentToCollection: async (
+    collectionId: string,
+    agentConfigId: string
+  ): Promise<void> => {
+    await apiClient.post(`/api/v1/agents/collections/${collectionId}/agents`, {
+      agent_config_id: agentConfigId
+    });
+  },
+
+  removeAgentFromCollection: async (
+    collectionId: string,
+    agentConfigId: string
+  ): Promise<void> => {
+    await apiClient.delete(
+      `/api/v1/agents/collections/${collectionId}/agents/${agentConfigId}`
+    );
+  },
+
+  batchUpdatePriorities: async (
+    updates: { id: string; priority: number }[]
+  ): Promise<void> => {
+    await apiClient.patch('/api/v1/agents/configs/priorities', { updates });
+  },
+
+  // Analytics
+  getAgentAnalytics: async (
+    agentConfigId?: string,
+    dateRange?: { start: string; end: string }
+  ): Promise<any> => {
+    const params = new URLSearchParams();
+    if (agentConfigId) params.append('agent_config_id', agentConfigId);
+    if (dateRange) {
+      params.append('start', dateRange.start);
+      params.append('end', dateRange.end);
+    }
+    const response = await apiClient.get(`/api/v1/agents/analytics?${params}`);
+    return response.data;
+  },
+
+  // Audit log
+  getAgentExecutions: async (
+    options?: {
+      agentConfigId?: string;
+      collectionId?: string;
+      limit?: number;
+      offset?: number;
+    }
+  ): Promise<AgentExecution[]> => {
+    const params = new URLSearchParams();
+    if (options?.agentConfigId) params.append('agent_config_id', options.agentConfigId);
+    if (options?.collectionId) params.append('collection_id', options.collectionId);
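+    // Compare against undefined so that a legitimate 0 (e.g. offset: 0) is still sent
+    if (options?.limit !== undefined) params.append('limit', options.limit.toString());
+    if (options?.offset !== undefined) params.append('offset', options.offset.toString());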
+    const response = await apiClient.get(`/api/v1/agents/executions?${params}`);
+    return response.data;
+  },
+};
+
+export default agentApiClient;
+```
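+
+`getAgentAnalytics` and `getAgentExecutions` each assemble a `URLSearchParams` by hand. A small
+sketch of a shared helper they could both call. `toQueryString` is a hypothetical name, not part
+of the client above. It skips `undefined` and `null`, keeps `0` and `false`, and omits the `?`
+when no parameter is set:
+
+```typescript
+type QueryValue = string | number | boolean | null | undefined;
+
+// Build a query string (including the leading "?") from optional parameters.
+function toQueryString(params: Record<string, QueryValue>): string {
+  const search = new URLSearchParams();
+  for (const [key, value] of Object.entries(params)) {
+    // Only absent values are dropped; 0 and false are valid and must be sent.
+    if (value === undefined || value === null) continue;
+    search.append(key, String(value));
+  }
+  const encoded = search.toString();
+  return encoded ? `?${encoded}` : '';
+}
+```
+
+A call site would then read
+`` apiClient.get(`/api/v1/agents/executions${toQueryString({ limit, offset })}`) ``.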
+
+### Enhanced Search Response Schema
+
+```typescript
+// frontend/src/types/search.ts
+
+export interface SearchResponse {
+  answer: string;
+  sources: Source[];
+  cot_steps?: CotStep[];
+
+  // NEW: Agent-related fields
+  agent_artifacts?: AgentArtifact[];
+  agent_executions?: AgentExecution[];
+  pipeline_metadata?: {
+    pre_search_agents: number;
+    post_search_agents: number;
+    response_agents: number;
+    total_agent_time_ms: number;
+  };
+}
+
+export interface AgentArtifact {
+  agent_id: string;
+  type: 'pptx' | 'pdf' | 'png' | 'mp3' | 'html' | 'txt';
+  data: string;
+  filename: string;
+  metadata: Record<string, any>;
+}
+
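+// Per-response execution summary. Note: agentApiClient.ts exports a different
+// audit-log shape under the same name, so import from the intended module.
+export interface AgentExecution {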
+  agent_id: string;
+  agent_name: string;
+  stage: 'pre_search' | 'post_search' | 'response';
+  duration_ms: number;
+  success: boolean;
+  error?: string;
+}
+```
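+
+`pipeline_metadata` repeats, in aggregated form, what `agent_executions` already lists. The
+sketch below shows how a client could derive the same counts itself, for instance when
+`pipeline_metadata` is absent. `summarizeExecutions` is hypothetical, the local types repeat only
+the fields it needs, and treating `total_agent_time_ms` as a plain sum of durations is an
+assumption (agents within a stage might run concurrently on the backend):
+
+```typescript
+type AgentStage = 'pre_search' | 'post_search' | 'response';
+
+interface ExecutionSummaryInput {
+  stage: AgentStage;
+  duration_ms: number;
+}
+
+interface PipelineMetadata {
+  pre_search_agents: number;
+  post_search_agents: number;
+  response_agents: number;
+  total_agent_time_ms: number;
+}
+
+// Count executions per stage and add up their durations.
+function summarizeExecutions(executions: ExecutionSummaryInput[]): PipelineMetadata {
+  const summary: PipelineMetadata = {
+    pre_search_agents: 0,
+    post_search_agents: 0,
+    response_agents: 0,
+    total_agent_time_ms: 0,
+  };
+  for (const execution of executions) {
+    if (execution.stage === 'pre_search') summary.pre_search_agents += 1;
+    else if (execution.stage === 'post_search') summary.post_search_agents += 1;
+    else summary.response_agents += 1;
+    // Assumption: total time is the sum of per-agent durations.
+    summary.total_agent_time_ms += execution.duration_ms;
+  }
+  return summary;
+}
+```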
+
+## State Management
+
+### AgentContext
+
+Context for managing agent-related state across the application.
+
+```typescript
+// frontend/src/contexts/AgentContext.tsx
+
+interface AgentState {
+  availableAgents: AgentManifest[];
+  userConfigs: CollectionAgent[];
+  isLoading: boolean;
+  error: string | null;
+}
+
+interface AgentContextType extends AgentState {
+  loadAvailableAgents: () => Promise<void>;
+  loadUserConfigs: () => Promise<void>;
+  createConfig: (config: Partial<CollectionAgent>) => Promise<CollectionAgent>;
+  updateConfig: (id: string, updates: Partial<CollectionAgent>) => Promise<void>;
+  deleteConfig: (id: string) => Promise<void>;
+}
+
+const AgentContext = createContext<AgentContextType | null>(null);
+
+export const AgentProvider: React.FC<{ children: React.ReactNode }> = ({ children }) => {
+  const [state, setState] = useState<AgentState>({
+    availableAgents: [],
+    userConfigs: [],
+    isLoading: false,
+    error: null
+  });
+
+  const loadAvailableAgents = async () => {
+    // Clear any stale error from a previous attempt before retrying
+    setState(s => ({ ...s, isLoading: true, error: null }));
+    try {
+      const agents = await agentApiClient.getAvailableAgents();
+      setState(s => ({ ...s, availableAgents: agents, isLoading: false }));
+    } catch (error) {
+      setState(s => ({ ...s, error: 'Failed to load agents', isLoading: false }));
+    }
+  };
+
+  const loadUserConfigs = async () => {
+    // Clear any stale error from a previous attempt before retrying
+    setState(s => ({ ...s, isLoading: true, error: null }));
+    try {
+      const configs = await agentApiClient.getUserAgentConfigs();
+      setState(s => ({ ...s, userConfigs: configs, isLoading: false }));
+    } catch (error) {
+      setState(s => ({ ...s, error: 'Failed to load configs', isLoading: false }));
+    }
+  };
+
+  // ... other methods
+
+  return (
+    <AgentContext.Provider value={{
+      ...state,
+      loadAvailableAgents,
+      loadUserConfigs,
+      createConfig,
+      updateConfig,
+      deleteConfig
+    }}>
+      {children}
+    </AgentContext.Provider>
+  );
+};
+
+export const useAgents = () => {
+  const context = useContext(AgentContext);
+  if (!context) {
+    throw new Error('useAgents must be used within AgentProvider');
+  }
+  return context;
+};
+```
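+
+The provider above elides `createConfig`, `updateConfig`, and `deleteConfig`. One way to keep
+them small is to express each state change as a pure function and call it from `setState` once
+the API request resolves. The sketch below is an assumption about how that could look, not the
+project's implementation. `ConfigLike`, `withCreated`, `withUpdated`, and `withDeleted` are
+hypothetical names:
+
+```typescript
+// Minimal shape the helpers rely on; CollectionAgent satisfies it through its `id` field.
+interface ConfigLike {
+  id: string;
+}
+
+// Append a newly created config without mutating the existing array.
+function withCreated<T extends ConfigLike>(configs: T[], created: T): T[] {
+  return [...configs, created];
+}
+
+// Swap in the server's updated copy, matched by id.
+function withUpdated<T extends ConfigLike>(configs: T[], updated: T): T[] {
+  return configs.map((config) => (config.id === updated.id ? updated : config));
+}
+
+// Drop the config with the given id.
+function withDeleted<T extends ConfigLike>(configs: T[], id: string): T[] {
+  return configs.filter((config) => config.id !== id);
+}
+```
+
+Inside the provider, an update would then be
+`setState(s => ({ ...s, userConfigs: withUpdated(s.userConfigs, saved) }))`, where `saved` is
+the value returned by `agentApiClient.updateAgentConfig`.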
+
+## Accessibility
+
+### Keyboard Navigation
+
+- All agent cards and buttons are focusable
+- Drag-and-drop has keyboard alternatives (up/down arrow keys)
+- Modal focus trapping implemented
+- Screen reader announcements for status changes
+
+### ARIA Labels
+
+```tsx
+// Example: Artifact card
+<div
+  role="article"
+  aria-label={`${artifact.type} artifact: ${artifact.filename}`}
+>
+  <button
+    aria-label={`Download ${artifact.filename}`}
+    onClick={handleDownload}
+  >
+    Download
+  </button>
+</div>
+
+// Example: Pipeline status
+<div
+  role="progressbar"
+  aria-valuenow={completedStages}
+  aria-valuemin={0}
+  aria-valuemax={totalStages}
+  aria-label="Agent pipeline progress"
+>
+  ...
+</div>
+```
+
+## Responsive Design
+
+### Breakpoints
+
+| Breakpoint | Width | Layout Changes |
+|------------|-------|----------------|
+| Mobile | < 640px | Single column, stacked artifacts |
+| Tablet | 640-1023px | 2-column grid, collapsible panels |
+| Desktop | ≥ 1024px | 3-column grid, full sidebar |
+
+### Mobile Considerations
+
+- Artifact preview uses full-screen modal on mobile
+- Drag-and-drop replaced with move up/down buttons on touch
+- Pipeline status collapses to minimal indicator
+- Agent config modal is full-screen on mobile
+
+## Performance
+
+### Lazy Loading
+
+- Agent marketplace loads agents in pages of 20
+- Artifact preview images loaded on-demand
+- Audit log uses virtual scrolling for large lists
+
+### Caching
+
+- Available agents cached for 5 minutes
+- User configs cached with SWR: cached data renders immediately and is revalidated in the background
+- Artifact data not cached (too large)
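+
+The five-minute cache for available agents can be sketched as a small time-to-live wrapper
+around the fetch. This is an illustration only: `createTtlCache` is a hypothetical helper, the
+clock is injected so expiry can be tested, and this document does not specify which caching
+mechanism the frontend uses for this list:
+
+```typescript
+// Wrap an async loader so its result is reused until `ttlMs` has elapsed.
+function createTtlCache<T>(
+  load: () => Promise<T>,
+  ttlMs: number,
+  now: () => number = Date.now
+): () => Promise<T> {
+  let cached: { value: T; expiresAt: number } | null = null;
+  return async () => {
+    // Serve the cached value while it is still fresh.
+    if (cached && now() < cached.expiresAt) return cached.value;
+    const value = await load();
+    cached = { value, expiresAt: now() + ttlMs };
+    return value;
+  };
+}
+```
+
+Usage would look like
+`const getAgentsCached = createTtlCache(() => agentApiClient.getAvailableAgents(), 5 * 60 * 1000)`.
+Concurrent callers that arrive while the cache is empty each trigger their own load; the sketch
+does not deduplicate in-flight requests.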
+
+### Bundle Optimization
+
+- Agent components code-split by route
+- react-beautiful-dnd loaded only when drag-drop needed
+- Icons imported individually so unused ones are tree-shaken
+
+## Related Documents
+
+- [MCP Integration Architecture](./mcp-integration-architecture.md)
+- [SearchService Agent Hooks Architecture](./search-agent-hooks-architecture.md)
+- [RAG Modulo MCP Server Architecture](./rag-modulo-mcp-server-architecture.md)
diff --git a/docs/architecture/backend-architecture-diagram.md b/docs/architecture/backend-architecture-diagram.md
new file mode 100644
index 00000000..0cd94bb3
--- /dev/null
+++ b/docs/architecture/backend-architecture-diagram.md
@@ -0,0 +1,517 @@
+# RAG Modulo Backend Architecture
+
+This document provides a comprehensive architecture diagram and description of the RAG Modulo
+backend system.
+
+## Architecture Overview
+
+The RAG Modulo backend is a FastAPI-based application that implements a Retrieval-Augmented
+Generation (RAG) system with a modular, stage-based pipeline architecture. The system supports
+multiple LLM providers, vector databases, and document processing strategies.
+
+## Component Architecture Diagram
+
+```mermaid
+graph TB
+    subgraph "Client Layer"
+        WEB[Web Frontend]
+        CLI[CLI Client]
+        API_CLIENT[API Clients]
+    end
+
+    subgraph "API Gateway Layer"
+        FASTAPI[FastAPI Application<br/>main.py]
+
+        subgraph "Middleware Stack"
+            CORS[LoggingCORSMiddleware]
+            SESSION[SessionMiddleware]
+            AUTH[AuthenticationMiddleware<br/>SPIFFE/OIDC Support]
+        end
+    end
+
+    subgraph "Router Layer"
+        AUTH_R[Auth Router]
+        SEARCH_R[Search Router]
+        COLLECTION_R[Collection Router]
+        CHAT_R[Chat Router]
+        CONV_R[Conversation Router]
+        PODCAST_R[Podcast Router]
+        VOICE_R[Voice Router]
+        AGENT_R[Agent Router]
+        USER_R[User Router]
+        TEAM_R[Team Router]
+        DASH_R[Dashboard Router]
+        HEALTH_R[Health Router]
+        WS_R[WebSocket Router]
+    end
+
+    subgraph "Service Layer"
+        SEARCH_SVC[SearchService]
+        CONV_SVC[ConversationService]
+        MSG_ORCH[MessageProcessingOrchestrator]
+        COLLECTION_SVC[CollectionService]
+        FILE_SVC[FileManagementService]
+        PODCAST_SVC[PodcastService]
+        VOICE_SVC[VoiceService]
+        AGENT_SVC[AgentService]
+        USER_SVC[UserService]
+        TEAM_SVC[TeamService]
+        DASH_SVC[DashboardService]
+        PIPELINE_SVC[PipelineService]
+        COT_SVC[ChainOfThoughtService]
+        ANSWER_SYNTH[AnswerSynthesizer]
+        CITATION_SVC[CitationAttributionService]
+    end
+
+    subgraph "Pipeline Architecture"
+        PIPELINE_EXEC[PipelineExecutor]
+
+        subgraph "Pipeline Stages"
+            STAGE1[PipelineResolutionStage]
+            STAGE2[QueryEnhancementStage]
+            STAGE3[RetrievalStage]
+            STAGE4[RerankingStage]
+            STAGE5[ReasoningStage]
+            STAGE6[GenerationStage]
+        end
+
+        SEARCH_CTX[SearchContext]
+    end
+
+    subgraph "Data Ingestion Pipeline"
+        DOC_STORE[DocumentStore]
+        DOC_PROC[DocumentProcessor]
+
+        subgraph "Document Processors"
+            PDF_PROC[PdfProcessor]
+            DOCLING_PROC[DoclingProcessor]
+            WORD_PROC[WordProcessor]
+            EXCEL_PROC[ExcelProcessor]
+            TXT_PROC[TxtProcessor]
+        end
+
+        CHUNKING[Chunking Strategies<br/>Sentence/Semantic/Hierarchical]
+    end
+
+    subgraph "Retrieval Layer"
+        RETRIEVER[Retriever]
+        RERANKER[Reranker]
+        QUERY_REWRITER[QueryRewriter]
+    end
+
+    subgraph "Generation Layer"
+        LLM_FACTORY[LLMProviderFactory]
+
+        subgraph "LLM Providers"
+            WATSONX[WatsonX Provider]
+            OPENAI[OpenAI Provider]
+            ANTHROPIC[Anthropic Provider]
+        end
+
+        AUDIO_FACTORY[AudioFactory]
+
+        subgraph "Audio Providers"
+            ELEVENLABS[ElevenLabs Audio]
+            OPENAI_AUDIO[OpenAI Audio]
+            OLLAMA_AUDIO[Ollama Audio]
+        end
+    end
+
+    subgraph "Repository Layer"
+        USER_REPO[UserRepository]
+        COLLECTION_REPO[CollectionRepository]
+        FILE_REPO[FileRepository]
+        CONV_REPO[ConversationRepository]
+        AGENT_REPO[AgentRepository]
+        PODCAST_REPO[PodcastRepository]
+        VOICE_REPO[VoiceRepository]
+        TEAM_REPO[TeamRepository]
+        PIPELINE_REPO[PipelineRepository]
+        LLM_REPO[LLMProviderRepository]
+    end
+
+    subgraph "Data Persistence"
+        POSTGRES[(PostgreSQL<br/>Metadata & Config)]
+        VECTOR_DB[(Vector Database)]
+
+        subgraph "Vector DB Implementations"
+            MILVUS[Milvus]
+            PINECONE[Pinecone]
+            WEAVIATE[Weaviate]
+            ELASTICSEARCH[Elasticsearch]
+            CHROMA[Chroma]
+        end
+    end
+
+    subgraph "External Services"
+        SPIRE[SPIRE Server<br/>SPIFFE Identity]
+        OIDC[OIDC Provider<br/>IBM AppID]
+        MINIO[MinIO<br/>Object Storage]
+    end
+
+    subgraph "Core Infrastructure"
+        CONFIG[Settings/Config]
+        LOGGING[Logging Utils]
+        IDENTITY[Identity Service]
+        EXCEPTIONS[Custom Exceptions]
+    end
+
+    %% Client to API Gateway
+    WEB --> FASTAPI
+    CLI --> FASTAPI
+    API_CLIENT --> FASTAPI
+
+    %% Middleware Flow
+    FASTAPI --> CORS
+    CORS --> SESSION
+    SESSION --> AUTH
+
+    %% Router Registration
+    AUTH --> AUTH_R
+    AUTH --> SEARCH_R
+    AUTH --> COLLECTION_R
+    AUTH --> CHAT_R
+    AUTH --> CONV_R
+    AUTH --> PODCAST_R
+    AUTH --> VOICE_R
+    AUTH --> AGENT_R
+    AUTH --> USER_R
+    AUTH --> TEAM_R
+    AUTH --> DASH_R
+    AUTH --> HEALTH_R
+    AUTH --> WS_R
+
+    %% Router to Service
+    SEARCH_R --> SEARCH_SVC
+    CHAT_R --> CONV_SVC
+    CONV_R --> CONV_SVC
+    CONV_SVC --> MSG_ORCH
+    MSG_ORCH --> SEARCH_SVC
+    COLLECTION_R --> COLLECTION_SVC
+    COLLECTION_SVC --> FILE_SVC
+    PODCAST_R --> PODCAST_SVC
+    VOICE_R --> VOICE_SVC
+    AGENT_R --> AGENT_SVC
+    USER_R --> USER_SVC
+    TEAM_R --> TEAM_SVC
+    DASH_R --> DASH_SVC
+
+    %% Search Service to Pipeline
+    SEARCH_SVC --> PIPELINE_EXEC
+    PIPELINE_EXEC --> STAGE1
+    STAGE1 --> STAGE2
+    STAGE2 --> STAGE3
+    STAGE3 --> STAGE4
+    STAGE4 --> STAGE5
+    STAGE5 --> STAGE6
+    PIPELINE_EXEC --> SEARCH_CTX
+
+    %% Pipeline Stages to Services
+    STAGE1 --> PIPELINE_SVC
+    STAGE2 --> PIPELINE_SVC
+    STAGE3 --> PIPELINE_SVC
+    STAGE4 --> PIPELINE_SVC
+    STAGE5 --> COT_SVC
+    STAGE6 --> ANSWER_SYNTH
+
+    %% Pipeline Service to Retrieval
+    PIPELINE_SVC --> RETRIEVER
+    PIPELINE_SVC --> RERANKER
+    PIPELINE_SVC --> QUERY_REWRITER
+
+    %% Retrieval to Vector DB
+    RETRIEVER --> VECTOR_DB
+    VECTOR_DB --> MILVUS
+    VECTOR_DB --> PINECONE
+    VECTOR_DB --> WEAVIATE
+    VECTOR_DB --> ELASTICSEARCH
+    VECTOR_DB --> CHROMA
+
+    %% Generation Layer
+    ANSWER_SYNTH --> LLM_FACTORY
+    LLM_FACTORY --> WATSONX
+    LLM_FACTORY --> OPENAI
+    LLM_FACTORY --> ANTHROPIC
+    PODCAST_SVC --> LLM_FACTORY
+    VOICE_SVC --> AUDIO_FACTORY
+    AUDIO_FACTORY --> ELEVENLABS
+    AUDIO_FACTORY --> OPENAI_AUDIO
+    AUDIO_FACTORY --> OLLAMA_AUDIO
+
+    %% Data Ingestion
+    FILE_SVC --> DOC_STORE
+    DOC_STORE --> DOC_PROC
+    DOC_PROC --> PDF_PROC
+    DOC_PROC --> DOCLING_PROC
+    DOC_PROC --> WORD_PROC
+    DOC_PROC --> EXCEL_PROC
+    DOC_PROC --> TXT_PROC
+    DOC_PROC --> CHUNKING
+    DOC_STORE --> VECTOR_DB
+
+    %% Service to Repository
+    USER_SVC --> USER_REPO
+    COLLECTION_SVC --> COLLECTION_REPO
+    FILE_SVC --> FILE_REPO
+    CONV_SVC --> CONV_REPO
+    AGENT_SVC --> AGENT_REPO
+    PODCAST_SVC --> PODCAST_REPO
+    VOICE_SVC --> VOICE_REPO
+    TEAM_SVC --> TEAM_REPO
+    PIPELINE_SVC --> PIPELINE_REPO
+    PIPELINE_SVC --> LLM_REPO
+
+    %% Repository to Database
+    USER_REPO --> POSTGRES
+    COLLECTION_REPO --> POSTGRES
+    FILE_REPO --> POSTGRES
+    CONV_REPO --> POSTGRES
+    AGENT_REPO --> POSTGRES
+    PODCAST_REPO --> POSTGRES
+    VOICE_REPO --> POSTGRES
+    TEAM_REPO --> POSTGRES
+    PIPELINE_REPO --> POSTGRES
+    LLM_REPO --> POSTGRES
+
+    %% Authentication
+    AUTH --> SPIRE
+    AUTH --> OIDC
+    AGENT_SVC --> SPIRE
+
+    %% Storage
+    FILE_SVC --> MINIO
+    PODCAST_SVC --> MINIO
+    VOICE_SVC --> MINIO
+
+    %% Core Infrastructure
+    FASTAPI --> CONFIG
+    FASTAPI --> LOGGING
+    AUTH --> IDENTITY
+    SEARCH_SVC --> EXCEPTIONS
+    CONV_SVC --> EXCEPTIONS
+
+    style FASTAPI fill:#4A90E2
+    style PIPELINE_EXEC fill:#50C878
+    style VECTOR_DB fill:#FF6B6B
+    style POSTGRES fill:#4ECDC4
+    style LLM_FACTORY fill:#FFD93D
+    style DOC_STORE fill:#9B59B6
+```
+
+## Architecture Layers
+
+### 1. API Gateway Layer
+
+**FastAPI Application (`main.py`)**
+
+- Entry point for all HTTP requests
+- Manages application lifespan (startup/shutdown)
+- Configures middleware stack
+- Registers all routers
+- Initializes database and LLM providers
+
+**Middleware Stack:**
+
+- **LoggingCORSMiddleware**: Handles CORS and request/response logging
+- **SessionMiddleware**: Manages user sessions
+- **AuthenticationMiddleware**: Validates user authentication via SPIFFE/OIDC
+
+### 2. Router Layer
+
+The router layer provides RESTful API endpoints organized by domain:
+
+- **Auth Router**: User authentication and authorization
+- **Search Router**: RAG search operations
+- **Collection Router**: Document collection management
+- **Chat Router**: Conversational interface
+- **Conversation Router**: Conversation history and context
+- **Podcast Router**: AI-powered podcast generation
+- **Voice Router**: Voice synthesis operations
+- **Agent Router**: SPIFFE-based agent management
+- **User Router**: User profile management
+- **Team Router**: Team collaboration features
+- **Dashboard Router**: Analytics and metrics
+- **Health Router**: System health checks
+- **WebSocket Router**: Real-time updates
+
+### 3. Service Layer
+
+Business logic services that orchestrate operations:
+
+- **SearchService**: Coordinates RAG search operations
+- **ConversationService**: Manages conversation sessions and messages
+- **MessageProcessingOrchestrator**: Orchestrates message processing with context
+- **CollectionService**: Manages document collections
+- **FileManagementService**: Handles file uploads and processing
+- **PodcastService**: Generates podcasts from documents
+- **VoiceService**: Manages voice synthesis
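+- **AgentService**: Manages AI agents with SPIFFE identity
+- **UserService**: Manages user profiles
+- **TeamService**: Manages teams and collaboration
+- **DashboardService**: Provides analytics and metrics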
+- **PipelineService**: Executes RAG pipeline stages
+- **ChainOfThoughtService**: Implements reasoning capabilities
+- **AnswerSynthesizer**: Generates final answers from retrieved context
+- **CitationAttributionService**: Attributes sources to answers
+
+### 4. Pipeline Architecture
+
+**Stage-Based RAG Pipeline:**
+
+The system uses a modular, stage-based pipeline architecture:
+
+1. **PipelineResolutionStage**: Resolves user's default pipeline configuration
+2. **QueryEnhancementStage**: Rewrites/enhances queries for better retrieval
+3. **RetrievalStage**: Retrieves documents from vector database
+4. **RerankingStage**: Reranks results for relevance
+5. **ReasoningStage**: Applies Chain of Thought reasoning if needed
+6. **GenerationStage**: Generates final answer using LLM
+
+**PipelineExecutor**: Orchestrates stage execution with context passing
+
+**SearchContext**: Maintains state across pipeline stages
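+
+The relationship between the executor, the stages, and the shared context can be illustrated
+with a short sketch. The backend itself is a Python FastAPI application; TypeScript is used here
+only to match the other examples in these documents, and every name below (`PipelineStage`,
+`SketchContext`, `runPipeline`) is hypothetical rather than taken from the codebase:
+
+```typescript
+// State handed from one stage to the next (a stand-in for SearchContext).
+interface SketchContext {
+  query: string;
+  documents: string[];
+  answer?: string;
+  trace: string[]; // names of the stages that have run, in order
+}
+
+interface PipelineStage {
+  name: string;
+  execute(context: SketchContext): Promise<SketchContext>;
+}
+
+// Run the stages in order, feeding each stage the context returned by the previous one.
+async function runPipeline(
+  stages: PipelineStage[],
+  initial: SketchContext
+): Promise<SketchContext> {
+  let context = initial;
+  for (const stage of stages) {
+    context = await stage.execute(context);
+    context = { ...context, trace: [...context.trace, stage.name] };
+  }
+  return context;
+}
+
+// Two toy stages standing in for query enhancement and retrieval.
+const sketchStages: PipelineStage[] = [
+  { name: 'QueryEnhancementStage', execute: async (c) => ({ ...c, query: c.query.trim() }) },
+  { name: 'RetrievalStage', execute: async (c) => ({ ...c, documents: ['doc-1'] }) },
+];
+```
+
+Because each stage receives and returns the whole context, a stage can be added, removed, or
+reordered without changing the signatures of the others.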
+
+### 5. Data Ingestion Pipeline
+
+**DocumentStore**: Manages document ingestion workflow
+
+**DocumentProcessor**: Routes documents to appropriate processors:
+
+- **PdfProcessor**: PDF extraction with OCR support
+- **DoclingProcessor**: Advanced document processing (tables, images)
+- **WordProcessor**: Microsoft Word documents
+- **ExcelProcessor**: Spreadsheet processing
+- **TxtProcessor**: Plain text files
+
+**Chunking Strategies**:
+
+- Sentence-based (recommended)
+- Semantic chunking
+- Hierarchical chunking
+- Token-based chunking
+- Fixed-size chunking
+
+### 6. Retrieval Layer
+
+- **Retriever**: Performs vector similarity search
+- **Reranker**: Reranks results for better relevance
+- **QueryRewriter**: Enhances queries for better retrieval
+
+### 7. Generation Layer
+
+**LLMProviderFactory**: Factory for creating LLM provider instances
+
+- **WatsonX Provider**: IBM WatsonX integration
+- **OpenAI Provider**: OpenAI API integration
+- **Anthropic Provider**: Claude API integration
+
+**AudioFactory**: Factory for audio generation
+
+- **ElevenLabs Audio**: Voice synthesis
+- **OpenAI Audio**: TTS integration
+- **Ollama Audio**: Local TTS
+
+### 8. Repository Layer
+
+Data access layer using Repository pattern:
+
+- **UserRepository**: User data operations
+- **CollectionRepository**: Collection management
+- **FileRepository**: File metadata operations
+- **ConversationRepository**: Conversation data (unified, optimized)
+- **AgentRepository**: Agent management
+- **PodcastRepository**: Podcast metadata
+- **VoiceRepository**: Voice configuration
+- **TeamRepository**: Team operations
+- **PipelineRepository**: Pipeline configuration
+- **LLMProviderRepository**: LLM provider settings
+
+### 9. Data Persistence
+
+**PostgreSQL**:
+
+- Stores metadata (users, collections, files, conversations)
+- Manages configuration (pipelines, LLM settings)
+- Handles relationships and transactions
+
+**Vector Database** (Abstracted via VectorStore interface):
+
+- **Milvus**: Primary vector database
+- **Pinecone**: Cloud vector database
+- **Weaviate**: GraphQL vector database
+- **Elasticsearch**: Search engine with vector support
+- **Chroma**: Lightweight vector database
+
+### 10. External Services
+
+- **SPIRE Server**: SPIFFE workload identity for agent authentication
+- **OIDC Provider**: IBM AppID for user authentication
+- **MinIO**: Object storage for files and audio
+
+### 11. Core Infrastructure
+
+- **Settings/Config**: Centralized configuration management
+- **Logging Utils**: Structured logging with context
+- **Identity Service**: User/agent identity management
+- **Custom Exceptions**: Domain-specific error handling
+
+## Data Flow
+
+### Search Request Flow
+
+1. **Client** → FastAPI → **Search Router**
+2. **Search Router** → **SearchService**
+3. **SearchService** → **PipelineExecutor**
+4. **PipelineExecutor** executes stages:
+   - Pipeline Resolution → Query Enhancement → Retrieval → Reranking → Reasoning → Generation
+5. **RetrievalStage** → **Retriever** → **Vector Database**
+6. **GenerationStage** → **AnswerSynthesizer** → **LLM Provider**
+7. Response flows back through layers to client
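+
+The stage-by-stage flow above can be sketched as a list of functions that each receive and
+return a shared context. The `SearchContext` fields, the `run_pipeline` helper, the in-memory
+`CORPUS`, and the three stub stages are all hypothetical; the real `PipelineExecutor` is expected
+to be richer than this.
+
+```python
+from dataclasses import dataclass, field
+from typing import Callable
+
+
+@dataclass
+class SearchContext:
+    """State shared by all stages (fields are illustrative assumptions)."""
+
+    query: str
+    documents: list[str] = field(default_factory=list)
+    answer: str = ""
+    trace: list[str] = field(default_factory=list)
+
+
+Stage = Callable[[SearchContext], SearchContext]
+
+# Stand-in for the vector database.
+CORPUS = ["mcp connects tools to models", "milvus stores vectors"]
+
+
+def enhance_query(ctx: SearchContext) -> SearchContext:
+    ctx.query = ctx.query.strip().lower()
+    return ctx
+
+
+def retrieve(ctx: SearchContext) -> SearchContext:
+    # Substring match stands in for vector similarity search.
+    ctx.documents = [doc for doc in CORPUS if ctx.query in doc]
+    return ctx
+
+
+def generate(ctx: SearchContext) -> SearchContext:
+    ctx.answer = f"{len(ctx.documents)} document(s) matched '{ctx.query}'"
+    return ctx
+
+
+def run_pipeline(ctx: SearchContext, stages: list[tuple[str, Stage]]) -> SearchContext:
+    """Run the stages in order, recording each stage name in the trace."""
+    for name, stage in stages:
+        ctx = stage(ctx)
+        ctx.trace.append(name)
+    return ctx
+
+
+STAGES = [("query_enhancement", enhance_query), ("retrieval", retrieve), ("generation", generate)]
+result = run_pipeline(SearchContext(query="  MCP "), STAGES)
+```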
+
+### Document Ingestion Flow
+
+1. **Client** → **Collection Router** → **CollectionService** → **FileManagementService**
+2. **FileManagementService** → **DocumentStore**
+3. **DocumentStore** → **DocumentProcessor** → **Specific Processor** (PDF/Word/etc.)
+4. **Processor** → **Chunking Strategy** → **Document Chunks**
+5. **DocumentStore** → **Vector Database** (embeddings + metadata)
+6. **FileManagementService** → **FileRepository** → **PostgreSQL** (metadata)
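+
+Step 3 routes each file to a specific processor. A minimal sketch of that routing, assuming it is
+keyed on the file extension, is shown below. The extension table and the `select_processor`
+helper are assumptions; `DoclingProcessor` is left out because the document does not say which
+file types it handles.
+
+```python
+from pathlib import Path
+
+# Hypothetical extension-to-processor table; class names come from the list above.
+PROCESSORS = {
+    ".pdf": "PdfProcessor",
+    ".docx": "WordProcessor",
+    ".xlsx": "ExcelProcessor",
+    ".txt": "TxtProcessor",
+}
+
+
+def select_processor(filename: str) -> str:
+    """Return the processor name for a file, or raise ValueError if unsupported."""
+    suffix = Path(filename).suffix.lower()
+    try:
+        return PROCESSORS[suffix]
+    except KeyError:
+        raise ValueError(f"unsupported file type: {suffix or '(none)'}") from None
+```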
+
+### Conversation Flow
+
+1. **Client** → **Conversation Router** → **ConversationService**
+2. **ConversationService** → **MessageProcessingOrchestrator**
+3. **MessageProcessingOrchestrator** → **SearchService** (with context)
+4. **SearchService** executes pipeline with conversation context
+5. Response saved via **ConversationRepository** → **PostgreSQL**
+
+## Key Design Patterns
+
+1. **Repository Pattern**: Data access abstraction
+2. **Factory Pattern**: LLM and Vector DB instantiation
+3. **Strategy Pattern**: Chunking strategies, LLM providers
+4. **Pipeline Pattern**: Stage-based RAG processing
+5. **Dependency Injection**: Services and repositories
+6. **Middleware Pattern**: Cross-cutting concerns (auth, logging, CORS)
+
+## Scalability Considerations
+
+- **Stateless Services**: Services keep no per-request state in memory, so instances can be scaled horizontally
+- **Database Connection Pooling**: SQLAlchemy connection management
+- **Async/Await**: Asynchronous operations for I/O-bound tasks
+- **Vector DB Abstraction**: Easy switching between vector databases
+- **LLM Provider Abstraction**: Support for multiple LLM providers
+- **Modular Pipeline**: Stages can be optimized independently
+
+## Security Features
+
+- **SPIFFE/SPIRE**: Machine-to-machine authentication for agents
+- **OIDC**: User authentication via IBM AppID
+- **Session Management**: Secure session handling
+- **CORS**: Controlled cross-origin access
+- **Input Validation**: Pydantic schemas for request validation
+- **Error Handling**: Secure error messages without information leakage
+
+## Configuration Management
+
+- **Environment Variables**: `.env` file support
+- **Pydantic Settings**: Type-safe configuration
+- **Runtime Configuration**: Dynamic configuration updates
+- **User-Specific Settings**: Per-user LLM and pipeline configuration
diff --git a/docs/architecture/mcp-integration-architecture.md b/docs/architecture/mcp-integration-architecture.md
new file mode 100644
index 00000000..e4be1eb8
--- /dev/null
+++ b/docs/architecture/mcp-integration-architecture.md
@@ -0,0 +1,200 @@
+# MCP Integration Architecture
+
+**Date**: November 2025
+**Status**: Architecture Design
+**Version**: 1.0
+**Related PRs**: #671, #684, #695
+
+## Overview
+
+This document describes the architecture for integrating the Model Context Protocol (MCP) into
+RAG Modulo. The integration enables MCP communication in both directions:
+
+1. **RAG Modulo as MCP Client**: Consuming external MCP tools (PowerPoint generation, charts, translation)
+2. **RAG Modulo as MCP Server**: Exposing RAG capabilities to external AI tools (Claude Desktop, workflow systems)
+
+## PR Comparison and Decision
+
+### PR #671 vs #684 Analysis
+
+| Aspect | PR #671 | PR #684 | Decision |
+|--------|---------|---------|----------|
+| **File Organization** | `mcp/` dedicated directory | `services/` directory | #684 naming preferred |
+| **Lines Changed** | 2,502 | 2,846 | Similar |
+| **Test Functions** | 63 | 50 | #671 has more tests |
+| **Mergeable** | Yes | Unknown | #671 confirmed |
+
+### Decision: Adopt #684 File Naming with #671 Test Coverage
+
+We will use #684's file naming convention (`mcp_gateway_client.py`, `search_result_enricher.py`)
+placed in the `services/` directory, as this follows the existing service-based architecture
+pattern. However, we should incorporate the additional test coverage from #671.
+
+## High-Level Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                           MCP Context Forge                                  │
+│                        (Central Gateway/Registry)                            │
+│                                                                              │
+│  Registered Servers:                                                         │
+│  ┌──────────────────────────────────────────────────────────────────────┐   │
+│  │ Internal (RAG Modulo consumes):                                      │   │
+│  │   • ppt-generator-mcp (PowerPoint)                                   │   │
+│  │   • chart-generator-mcp (Visualizations)                             │   │
+│  │   • translator-mcp (Language translation)                            │   │
+│  │   • web-enricher-mcp (Real-time data)                                │   │
+│  └──────────────────────────────────────────────────────────────────────┘   │
+│  ┌──────────────────────────────────────────────────────────────────────┐   │
+│  │ External (RAG Modulo exposes):                                       │   │
+│  │   • rag-modulo-mcp (search, ingest, podcast, collections)            │   │
+│  └──────────────────────────────────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────────────────────────────────┘
+         ▲                                                    ▲
+         │                                                    │
+         │ RAG Modulo calls                    External tools call
+         │ external MCP tools                  RAG Modulo MCP server
+         │                                                    │
+         ▼                                                    ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                          RAG Modulo Backend                                  │
+│                                                                              │
+│  ┌───────────────────────────────────────────────────────────────────────┐  │
+│  │   MCP Client                        │    MCP Server                   │  │
+│  │   services/mcp_gateway_client.py    │    mcp_server/server.py         │  │
+│  │   services/search_result_enricher.py│    mcp_server/tools.py          │  │
+│  │                                     │                                 │  │
+│  │   Consumes: ppt-generator,          │    Exposes: rag_search,         │  │
+│  │   chart-generator, etc.             │    rag_ingest, rag_podcast      │  │
+│  └───────────────────────────────────────────────────────────────────────┘  │
+│                                                                              │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │                     Core Services                                       ││
+│  │   SearchService, DocumentService, PodcastService, CollectionService    ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+└─────────────────────────────────────────────────────────────────────────────┘
+         ▲
+         │
+         ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                          RAG Modulo Frontend                                 │
+│                                                                              │
+│  • Triggers searches → gets artifacts back                                  │
+│  • Configures which agents run per collection                               │
+│  • Downloads/previews generated artifacts                                   │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+## File Structure
+
+```
+backend/rag_solution/
+├── services/
+│   ├── mcp_gateway_client.py          # Client to call external MCP tools
+│   ├── search_result_enricher.py      # Post-search enrichment agent
+│   └── ... (existing services)
+│
+├── mcp_server/                         # RAG Modulo as MCP server
+│   ├── __init__.py
+│   ├── server.py                       # MCP server setup, transport handling
+│   ├── tools.py                        # Tool definitions (rag_search, rag_ingest, etc.)
+│   ├── resources.py                    # MCP resources (collection metadata, etc.)
+│   └── auth.py                         # SPIFFE/Bearer token validation
+│
+├── schemas/
+│   ├── mcp_schema.py                   # Schemas for MCP requests/responses
+│   └── ...
+│
+└── router/
+    ├── mcp_router.py                   # REST endpoints for MCP management
+    └── ...
+
+tests/unit/
+├── services/
+│   ├── test_mcp_gateway_client.py
+│   └── test_search_result_enricher.py
+├── router/
+│   └── test_mcp_router.py
+└── mcp_server/
+    ├── test_server.py
+    └── test_tools.py
+```
+
+## MCP Client Components
+
+### MCPGatewayClient
+
+A thin wrapper that calls external MCP tools via Context Forge, guarded by a circuit breaker.
+
+**Key Features**:
+
+- Circuit breaker: opens after 5 failures, 60s recovery timeout
+- Health checks: 5-second timeout
+- Default timeout: 30 seconds on all calls
+- Graceful degradation on failures
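+
+To make the circuit breaker settings concrete, here is a minimal sketch using the thresholds
+listed above (5 failures, 60s recovery). The `CircuitBreaker` and `CircuitOpenError` names, the
+injectable `clock`, and the single-trial half-open behaviour are assumptions for illustration;
+they are not the `MCPGatewayClient` implementation.
+
+```python
+import time
+from typing import Any, Callable, Optional
+
+
+class CircuitOpenError(RuntimeError):
+    """Raised when a call is rejected because the circuit is open."""
+
+
+class CircuitBreaker:
+    def __init__(
+        self,
+        failure_threshold: int = 5,
+        recovery_timeout: float = 60.0,
+        clock: Callable[[], float] = time.monotonic,
+    ) -> None:
+        self.failure_threshold = failure_threshold
+        self.recovery_timeout = recovery_timeout
+        self._clock = clock
+        self._failures = 0
+        self._opened_at: Optional[float] = None
+
+    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
+        if self._opened_at is not None:
+            if self._clock() - self._opened_at < self.recovery_timeout:
+                raise CircuitOpenError("circuit open; call skipped")
+            # Recovery window elapsed: let one trial call through (half-open).
+        try:
+            result = func(*args, **kwargs)
+        except Exception:
+            self._failures += 1
+            if self._failures >= self.failure_threshold:
+                # Open (or re-open) the circuit and restart the recovery timer.
+                self._opened_at = self._clock()
+            raise
+        # Success closes the circuit and clears the failure count.
+        self._failures = 0
+        self._opened_at = None
+        return result
+```
+
+A caller that catches `CircuitOpenError` and returns the unenriched search results would get the
+graceful degradation described in the last bullet.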
+
+### SearchResultEnricher
+
+Content Enricher pattern implementation for augmenting search results with external data.
+
+**Capabilities**:
+
+- Real-time data enrichment (stock prices, weather, etc.)
+- External knowledge base queries
+- Document metadata enhancement
+
+## MCP Server Components
+
+RAG Modulo exposes its capabilities as MCP tools for external consumption.
+
+### Exposed Tools
+
+| Tool | Description | Parameters |
+|------|-------------|------------|
+| `rag_search` | Search documents in a collection | `collection_id`, `query`, `top_k`, `use_cot` |
+| `rag_ingest` | Add documents to a collection | `collection_id`, `documents` |
+| `rag_list_collections` | List accessible collections | `include_stats` |
+| `rag_generate_podcast` | Generate podcast from collection | `collection_id`, `topic`, `duration_minutes` |
+| `rag_smart_questions` | Get suggested follow-up questions | `collection_id`, `context`, `count` |
+
+### Exposed Resources
+
+| Resource URI | Description |
+|--------------|-------------|
+| `rag://collection/{id}/documents` | Document metadata for a collection |
+| `rag://collection/{id}/stats` | Collection statistics |
+| `rag://search/{query}/results` | Cached search results |
+
+### Authentication
+
+- **SPIFFE JWT-SVID** (PR #695): For agent-to-agent calls
+- **Bearer token**: For user-delegated access from Claude Desktop, etc.
+
+## Integration with Context Forge
+
+IBM's MCP Context Forge serves as the central gateway providing:
+
+- Protocol translation (stdio, SSE, WebSocket, HTTP)
+- Tool registry and discovery
+- Bearer token auth with JWT + RBAC
+- Rate limiting with Redis backing
+- OpenTelemetry integration
+- Admin UI for management
+- Redis-backed federation for distributed deployment
+
+## Security Considerations
+
+1. **Network Isolation**: Context Forge runs in the same VPC as the RAG Modulo backend
+2. **JWT Authentication**: Secure token-based auth for all API calls
+3. **RBAC**: Team-based access control for sensitive tools
+4. **Secrets Management**: MCP server credentials managed by Context Forge
+5. **Audit Logging**: All tool invocations logged via OpenTelemetry
+6. **Capability Validation**: SPIFFE capabilities mapped to MCP tool permissions
+
+## Related Documents
+
+- [SearchService Agent Hooks Architecture](./search-agent-hooks-architecture.md)
+- [RAG Modulo MCP Server Architecture](./rag-modulo-mcp-server-architecture.md)
+- [SPIRE Integration Architecture](./spire-integration-architecture.md)
+- [Agent MCP Architecture Design](../design/agent-mcp-architecture.md)
+- [MCP Context Forge Integration Design](../design/mcp-context-forge-integration.md)
diff --git a/docs/architecture/rag-modulo-mcp-server-architecture.md b/docs/architecture/rag-modulo-mcp-server-architecture.md
new file mode 100644
index 00000000..4bbff346
--- /dev/null
+++ b/docs/architecture/rag-modulo-mcp-server-architecture.md
@@ -0,0 +1,689 @@
+# RAG Modulo MCP Server Architecture
+
+**Date**: November 2025
+**Status**: Architecture Design
+**Version**: 1.0
+**Related Documents**: [MCP Integration Architecture](./mcp-integration-architecture.md), [SPIRE Integration Architecture](./spire-integration-architecture.md)
+
+## Overview
+
+This document describes the architecture for exposing RAG Modulo's capabilities as an MCP
+(Model Context Protocol) server. This enables external AI tools like Claude Desktop, workflow
+automation systems, and other MCP clients to interact with RAG Modulo's search, ingestion,
+and content generation features.
+
+## Use Cases
+
+### External MCP Clients
+
+| Client | Use Case |
+|--------|----------|
+| **Claude Desktop** | User asks Claude to search their company documents |
+| **n8n/Zapier** | Workflow automation: ingest email attachments, search on triggers |
+| **Custom AI Bots** | Slack/Teams bots that query document collections |
+| **Agent Frameworks** | LangChain, AutoGPT agents using RAG Modulo as knowledge source |
+
+### Example Scenarios
+
+**Scenario 1: Claude Desktop**
+
+```
+User in Claude Desktop:
+"Search my company's financial documents for Q4 projections"
+
+Claude Desktop:
+1. Discovers rag_search tool via MCP
+2. Calls rag_search(collection_id="...", query="Q4 projections")
+3. Receives answer + sources from RAG Modulo
+4. Presents to user with citations
+```
+
+**Scenario 2: Workflow Automation**
+
+```
+Trigger: New email received with attachment
+Action 1: Extract attachment, upload to temp storage
+Action 2: Call rag_ingest to add document to collection
+Action 3: Call rag_search to check for related content
+Action 4: Send Slack notification with summary
+```
+
+**Scenario 3: Multi-Agent System**
+
+```
+Orchestrator Agent:
+1. Calls rag_list_collections to find relevant collection
+2. Calls rag_search to gather information
+3. Calls rag_generate_podcast to create audio summary
+4. Combines results for final user response
+```
+
+## Architecture
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                         EXTERNAL MCP CLIENTS                                 │
+│                                                                              │
+│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐              │
+│  │  Claude Desktop │  │  Custom AI Bot  │  │  Workflow Tool  │              │
+│  │                 │  │                 │  │  (n8n, Zapier)  │              │
+│  └────────┬────────┘  └────────┬────────┘  └────────┬────────┘              │
+│           │                    │                    │                        │
+│           └────────────────────┼────────────────────┘                        │
+│                                │                                             │
+│                                ▼ MCP Protocol (stdio/SSE/HTTP)               │
+└─────────────────────────────────────────────────────────────────────────────┘
+                                 │
+                                 ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                    RAG Modulo Native MCP Server                              │
+│                    backend/rag_solution/mcp_server/                          │
+│                                                                              │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │                           Tools                                         ││
+│  │                                                                         ││
+│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐           ││
+│  │  │   rag_search    │ │   rag_ingest    │ │ rag_list_colls  │           ││
+│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘           ││
+│  │  ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐           ││
+│  │  │ rag_gen_podcast │ │ rag_smart_q's   │ │ rag_get_doc     │           ││
+│  │  └─────────────────┘ └─────────────────┘ └─────────────────┘           ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+│                                                                              │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │                          Resources                                      ││
+│  │                                                                         ││
+│  │  rag://collection/{id}/documents    - Document metadata                 ││
+│  │  rag://collection/{id}/stats        - Collection statistics             ││
+│  │  rag://search/{query}/results       - Cached search results             ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+│                                                                              │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │                       Authentication                                    ││
+│  │                                                                         ││
+│  │  • SPIFFE JWT-SVID (agent-to-agent) ◀── PR #695                        ││
+│  │  • Bearer token (user-delegated access)                                ││
+│  │  • API key (service accounts)                                          ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+└─────────────────────────────────────────────────────────────────────────────┘
+                                 │
+                                 ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                  RAG Modulo Backend Services                                 │
+│         (SearchService, DocumentService, PodcastService, etc.)              │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+## Exposed Tools
+
+### rag_search
+
+Search documents in a RAG Modulo collection.
+
+```yaml
+name: rag_search
+description: Search documents in a RAG Modulo collection using semantic search with optional Chain-of-Thought reasoning
+
+parameters:
+  collection_id:
+    type: string
+    description: UUID of the collection to search
+    required: true
+  query:
+    type: string
+    description: Natural language search query
+    required: true
+  top_k:
+    type: integer
+    description: Number of results to return
+    required: false
+    default: 5
+  use_cot:
+    type: boolean
+    description: Enable Chain-of-Thought reasoning for complex queries
+    required: false
+    default: false
+
+returns:
+  answer:
+    type: string
+    description: Synthesized answer from retrieved documents
+  sources:
+    type: array
+    description: List of source documents with titles and relevance scores
+  cot_steps:
+    type: array
+    description: Reasoning steps (if use_cot=true)
+```
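+
+One way a tool handler might validate these parameters before calling `SearchService` is
+sketched below. The `RagSearchArgs` class and its `from_payload` helper are hypothetical; only
+the field names and defaults come from the definition above.
+
+```python
+from dataclasses import dataclass
+from typing import Any
+from uuid import UUID
+
+
+@dataclass(frozen=True)
+class RagSearchArgs:
+    """Validated arguments for rag_search (illustrative, not the project schema)."""
+
+    collection_id: str
+    query: str
+    top_k: int = 5
+    use_cot: bool = False
+
+    def __post_init__(self) -> None:
+        UUID(self.collection_id)  # raises ValueError if not a well-formed UUID
+        if not self.query.strip():
+            raise ValueError("query must not be empty")
+        if self.top_k < 1:
+            raise ValueError("top_k must be at least 1")
+
+    @classmethod
+    def from_payload(cls, payload: dict[str, Any]) -> "RagSearchArgs":
+        """Build from a decoded tool-call payload; unknown keys raise TypeError."""
+        return cls(**payload)
+```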
+
+### rag_ingest
+
+Add documents to a collection.
+
+```yaml
+name: rag_ingest
+description: Add one or more documents to a RAG Modulo collection
+
+parameters:
+  collection_id:
+    type: string
+    description: UUID of the target collection
+    required: true
+  documents:
+    type: array
+    description: List of documents to ingest
+    required: true
+    items:
+      type: object
+      properties:
+        title:
+          type: string
+          description: Document title
+        content:
+          type: string
+          description: Document content (text)
+        metadata:
+          type: object
+          description: Optional metadata (author, date, tags, etc.)
+
+returns:
+  ingested_count:
+    type: integer
+    description: Number of documents successfully ingested
+  document_ids:
+    type: array
+    description: UUIDs of ingested documents
+  errors:
+    type: array
+    description: Any errors encountered during ingestion
+```
+
+### rag_list_collections
+
+List collections accessible to the authenticated agent/user.
+
+```yaml
+name: rag_list_collections
+description: List document collections the authenticated agent can access
+
+parameters:
+  include_stats:
+    type: boolean
+    description: Include document counts and last updated timestamps
+    required: false
+    default: false
+
+returns:
+  collections:
+    type: array
+    items:
+      type: object
+      properties:
+        id:
+          type: string
+          description: Collection UUID
+        name:
+          type: string
+          description: Collection name
+        description:
+          type: string
+          description: Collection description
+        document_count:
+          type: integer
+          description: Number of documents (if include_stats=true)
+        last_updated:
+          type: string
+          description: ISO timestamp of last update (if include_stats=true)
+```
+
+### rag_generate_podcast
+
+Generate an audio podcast from collection content.
+
+```yaml
+name: rag_generate_podcast
+description: Generate an AI-powered audio podcast from collection documents
+
+parameters:
+  collection_id:
+    type: string
+    description: UUID of the source collection
+    required: true
+  topic:
+    type: string
+    description: Focus topic for the podcast (optional - uses all content if not specified)
+    required: false
+  duration_minutes:
+    type: integer
+    description: Target podcast duration in minutes
+    required: false
+    default: 5
+    minimum: 1
+    maximum: 30
+
+returns:
+  audio_url:
+    type: string
+    description: URL to download the generated audio file
+  transcript:
+    type: string
+    description: Full text transcript of the podcast
+  duration:
+    type: number
+    description: Actual duration in seconds
+```
+
+### rag_smart_questions
+
+Get AI-suggested follow-up questions based on context.
+
+```yaml
+name: rag_smart_questions
+description: Generate intelligent follow-up questions based on collection content and conversation context
+
+parameters:
+  collection_id:
+    type: string
+    description: UUID of the collection
+    required: true
+  context:
+    type: string
+    description: Current conversation context or recent query
+    required: false
+  count:
+    type: integer
+    description: Number of questions to generate
+    required: false
+    default: 3
+    minimum: 1
+    maximum: 10
+
+returns:
+  questions:
+    type: array
+    items:
+      type: string
+    description: List of suggested follow-up questions
+```
+
+### rag_get_document
+
+Retrieve a specific document's content and metadata.
+
+```yaml
+name: rag_get_document
+description: Retrieve full content and metadata for a specific document
+
+parameters:
+  document_id:
+    type: string
+    description: UUID of the document
+    required: true
+
+returns:
+  id:
+    type: string
+    description: Document UUID
+  title:
+    type: string
+    description: Document title
+  content:
+    type: string
+    description: Full document text content
+  metadata:
+    type: object
+    description: Document metadata
+  collection_id:
+    type: string
+    description: Parent collection UUID
+  created_at:
+    type: string
+    description: ISO timestamp of document creation
+```
+
+## Exposed Resources
+
+MCP resources provide read-only access to RAG Modulo data.
+
+### rag://collection/{id}/documents
+
+Document metadata for a collection.
+
+```json
+{
+  "uri": "rag://collection/abc123/documents",
+  "name": "Collection Documents",
+  "description": "List of documents in the collection",
+  "mimeType": "application/json"
+}
+```
+
+**Content**:
+
+```json
+{
+  "collection_id": "abc123",
+  "documents": [
+    {
+      "id": "doc1",
+      "title": "Q4 Financial Report",
+      "created_at": "2024-10-15T10:00:00Z",
+      "word_count": 5000,
+      "metadata": { "author": "Finance Team" }
+    }
+  ],
+  "total_count": 150
+}
+```
+
+### rag://collection/{id}/stats
+
+Collection statistics.
+
+```json
+{
+  "uri": "rag://collection/abc123/stats",
+  "name": "Collection Statistics",
+  "description": "Usage statistics for the collection",
+  "mimeType": "application/json"
+}
+```
+
+**Content**:
+
+```json
+{
+  "collection_id": "abc123",
+  "document_count": 150,
+  "total_words": 500000,
+  "total_chunks": 2500,
+  "last_ingestion": "2024-11-20T14:30:00Z",
+  "query_count_30d": 1250,
+  "avg_query_time_ms": 450
+}
+```
+
+### rag://search/{query}/results
+
+Cached search results (for efficiency when the same query is repeated).
+
+```json
+{
+  "uri": "rag://search/q4+projections/results",
+  "name": "Cached Search Results",
+  "description": "Cached results for recent search query",
+  "mimeType": "application/json"
+}
+```
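+
+The example URI suggests the query is lowercased and URL-encoded with `+` for spaces. Assuming
+that convention, a cache key could be built and parsed as follows. The helper names and the
+normalization step are assumptions inferred from the example, not a documented rule.
+
+```python
+from urllib.parse import quote_plus, unquote_plus
+
+_PREFIX = "rag://search/"
+_SUFFIX = "/results"
+
+
+def search_results_uri(query: str) -> str:
+    """Build the cached-results URI for a query (lowercased, whitespace collapsed)."""
+    normalized = " ".join(query.lower().split())
+    return f"{_PREFIX}{quote_plus(normalized)}{_SUFFIX}"
+
+
+def query_from_uri(uri: str) -> str:
+    """Recover the normalized query from a cached-results URI."""
+    if not (uri.startswith(_PREFIX) and uri.endswith(_SUFFIX)):
+        raise ValueError(f"not a search results URI: {uri}")
+    return unquote_plus(uri[len(_PREFIX):-len(_SUFFIX)])
+```
+
+Because `quote_plus` also escapes `/`, a query containing slashes cannot be mistaken for extra
+path segments.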
+
+## Authentication
+
+### SPIFFE JWT-SVID (Agent-to-Agent)
+
+For AI agents authenticated via SPIFFE/SPIRE (PR #695):
+
+```
+Authorization: Bearer <JWT-SVID>
+
+JWT Claims:
+{
+  "sub": "spiffe://rag-modulo.example.com/agent/search-enricher/abc123",
+  "aud": ["rag-modulo-mcp"],
+  "exp": 1732800000
+}
+```
+
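+The MCP server validates the JWT-SVID and derives the caller's identity from it:
+
+- Agent SPIFFE ID (from the `sub` claim)
+- Capabilities (looked up in the agents table by SPIFFE ID)
+- Owner user ID (from the agent record, used for collection access)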
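+
+The claim checks implied by the JWT example above can be sketched as below. This covers only the
+claims: verifying the token signature against the SPIRE trust bundle must happen first and is
+not shown. The `check_svid_claims` function and its parameters are hypothetical.
+
+```python
+import time
+from typing import Any, Optional
+from urllib.parse import urlparse
+
+
+def check_svid_claims(
+    claims: dict[str, Any],
+    audience: str,
+    trust_domain: str,
+    now: Optional[float] = None,
+) -> str:
+    """Return the SPIFFE ID from already signature-verified claims, or raise ValueError."""
+    now = time.time() if now is None else now
+
+    spiffe_id = claims.get("sub", "")
+    parsed = urlparse(spiffe_id)
+    if parsed.scheme != "spiffe" or parsed.netloc != trust_domain:
+        raise ValueError("subject is not a SPIFFE ID in the expected trust domain")
+
+    aud = claims.get("aud", [])
+    if isinstance(aud, str):  # the JWT spec allows a single string or a list
+        aud = [aud]
+    if audience not in aud:
+        raise ValueError("token audience does not include this server")
+
+    if claims.get("exp", 0) <= now:
+        raise ValueError("token has expired")
+    return spiffe_id
+```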
+
+### Bearer Token (User-Delegated)
+
+For external clients acting on behalf of users:
+
+```
+Authorization: Bearer <user-access-token>
+```
+
+User tokens are issued via the existing OAuth flow and include:
+
+- User ID
+- Scopes (read, write, admin)
+- Expiration
+
+### API Key (Service Accounts)
+
+For service-to-service integration:
+
+```
+X-API-Key: <service-api-key>
+```
+
+API keys are associated with:
+
+- Service account user
+- Allowed collections
+- Rate limits
+
+## Authorization
+
+### Capability-Based Access Control
+
+SPIFFE agents have capabilities that map to MCP tool permissions:
+
+| Capability | Allowed Tools |
+|------------|---------------|
+| `search:read` | `rag_search`, `rag_list_collections`, `rag_get_document` |
+| `search:write` | `rag_ingest` |
+| `llm:invoke` | `rag_generate_podcast`, `rag_smart_questions` |
+| `collection:read` | All read operations on owned collections |
+| `collection:write` | Create/modify collections |
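+
+A minimal sketch of enforcing the tool rows of this table is shown below. The
+`TOOL_CAPABILITIES` mapping mirrors the first three rows; the `is_tool_allowed` helper and the
+deny-by-default handling of unknown tools are assumptions.
+
+```python
+# Required capability per tool, taken from the table above.
+TOOL_CAPABILITIES = {
+    "rag_search": "search:read",
+    "rag_list_collections": "search:read",
+    "rag_get_document": "search:read",
+    "rag_ingest": "search:write",
+    "rag_generate_podcast": "llm:invoke",
+    "rag_smart_questions": "llm:invoke",
+}
+
+
+def is_tool_allowed(tool_name: str, capabilities: set[str]) -> bool:
+    """Return True if the agent holds the capability the tool requires.
+
+    Tools missing from the mapping are denied (deny-by-default is an assumption).
+    """
+    required = TOOL_CAPABILITIES.get(tool_name)
+    return required is not None and required in capabilities
+```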
+
+### Collection Access
+
+An agent can access a collection only if at least one of the following holds:
+
+1. The collection is owned by the agent's `owner_user_id`
+2. The collection is shared with the agent's `team_id`
+3. The collection is marked as public
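+
+Reading the three rules as alternatives (an assumption, since the list does not say whether they
+combine with "and" or "or"), the check could look like this. The `Agent` and `Collection`
+dataclasses and their field names other than `owner_user_id` and `team_id` are hypothetical.
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass(frozen=True)
+class Collection:
+    owner_user_id: str
+    shared_team_ids: frozenset[str] = frozenset()
+    is_public: bool = False
+
+
+@dataclass(frozen=True)
+class Agent:
+    owner_user_id: str
+    team_id: Optional[str] = None
+
+
+def can_access(agent: Agent, collection: Collection) -> bool:
+    """Apply the three collection access rules as alternatives."""
+    owned = collection.owner_user_id == agent.owner_user_id
+    shared = agent.team_id is not None and agent.team_id in collection.shared_team_ids
+    return owned or shared or collection.is_public
+```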
+
+## File Structure
+
+```
+backend/rag_solution/mcp_server/
+├── __init__.py
+├── server.py           # MCP server setup, transport handling
+├── tools.py            # Tool definitions and implementations
+├── resources.py        # Resource definitions
+├── auth.py             # SPIFFE/Bearer/API key validation
+└── schemas.py          # Request/response schemas
+
+tests/unit/mcp_server/
+├── __init__.py
+├── test_server.py
+├── test_tools.py
+├── test_resources.py
+└── test_auth.py
+```
+
+## Server Implementation
+
+### Transport Options
+
+| Transport | Use Case | Port |
+|-----------|----------|------|
+| **stdio** | Claude Desktop, local CLI | N/A |
+| **SSE** | Web clients, real-time updates | 8010 |
+| **HTTP** | REST-like integration | 8010 |
+
+### Example Server Setup
+
+```python
+# backend/rag_solution/mcp_server/server.py
+# Sketch against FastMCP from the official MCP Python SDK (v1.x, `pip install mcp`).
+# Run as a module (python -m rag_solution.mcp_server.server) so relative imports resolve.
+
+from mcp.server.fastmcp import FastMCP
+
+from .tools import (
+    rag_search,
+    rag_ingest,
+    rag_list_collections,
+    rag_generate_podcast,
+    rag_smart_questions,
+    rag_get_document,
+)
+from .resources import collection_documents, collection_stats, search_results
+
+# The port applies to the SSE/HTTP transports; stdio ignores it.
+server = FastMCP("rag-modulo", port=8010)
+
+# Register tools: FastMCP derives each tool's schema from the function signature
+for tool in (
+    rag_search,
+    rag_ingest,
+    rag_list_collections,
+    rag_generate_podcast,
+    rag_smart_questions,
+    rag_get_document,
+):
+    server.add_tool(tool)
+
+# Register resources: handler parameters must match the URI template placeholders
+server.resource("rag://collection/{id}/documents")(collection_documents)
+server.resource("rag://collection/{id}/stats")(collection_stats)
+server.resource("rag://search/{query}/results")(search_results)
+
+# Auth: validate_auth (auth.py) is assumed to be called inside the tool and resource
+# handlers; this sketch registers no server-wide middleware.
+
+# Run server
+if __name__ == "__main__":
+    server.run(transport="stdio")  # Or transport="sse" to serve on port 8010
+```
+
+## Integration with Context Forge
+
+Register RAG Modulo MCP server with Context Forge for federation:
+
+```bash
+curl -X POST http://localhost:8001/api/v1/servers \
+  -H "Authorization: Bearer $CONTEXT_FORGE_TOKEN" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "name": "RAG Modulo",
+    "type": "mcp",
+    "endpoint": "http://rag-modulo-backend:8010",
+    "config": {
+      "protocol": "sse",
+      "auth_required": true
+    }
+  }'
+```
+
+## SPIFFE + MCP Coexistence
+
+### Identity Flow
+
+```
+┌─────────────────────────────────────────────────────────────────────────────┐
+│                     Identity Architecture                                    │
+│                                                                              │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │                    Human Users                                          ││
+│  │  - Authenticate via OIDC/OAuth (existing auth)                         ││
+│  │  - JWT with user claims                                                ││
+│  │  - Access collections they own                                         ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+│                             ▲                                                │
+│                             │ Creates & owns                                 │
+│                             ▼                                                │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │              AI Agents (PR #695 SPIFFE)                                 ││
+│  │                                                                         ││
+│  │  SPIFFE ID: spiffe://rag-modulo.example.com/agent/{type}/{id}          ││
+│  │                                                                         ││
+│  │  Agent Record:                                                          ││
+│  │  - id: UUID                                                             ││
+│  │  - spiffe_id: Full SPIFFE ID                                            ││
+│  │  - agent_type: search-enricher, cot-reasoning, etc.                     ││
+│  │  - owner_user_id: UUID (who created/owns this agent)                    ││
+│  │  - capabilities: [search:read, llm:invoke, etc.]                        ││
+│  │  - status: active, suspended, revoked, pending                          ││
+│  │                                                                         ││
+│  │  Auth Flow:                                                             ││
+│  │  1. Agent presents JWT-SVID from SPIRE                                  ││
+│  │  2. MCP Server validates via SpiffeAuthenticator                        ││
+│  │  3. Creates AgentPrincipal with capabilities                            ││
+│  │  4. CBAC (Capability-Based Access Control)                              ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+│                             ▲                                                │
+│                             │ Invokes via MCP                                │
+│                             ▼                                                │
+│  ┌─────────────────────────────────────────────────────────────────────────┐│
+│  │            MCP Tools                                                    ││
+│  │                                                                         ││
+│  │  MCP Server handles:                                                    ││
+│  │  - Protocol translation (stdio, SSE, HTTP)                              ││
+│  │  - Tool discovery and invocation                                        ││
+│  │  - Rate limiting and circuit breakers                                   ││
+│  │                                                                         ││
+│  │  Identity Propagation:                                                  ││
+│  │  - Agent's SPIFFE ID passed in X-Spiffe-Id header                       ││
+│  │  - MCP tools validate agent capabilities                                ││
+│  │  - Audit log includes agent identity                                    ││
+│  └─────────────────────────────────────────────────────────────────────────┘│
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+### Example Flow
+
+```python
+# Agent executes MCP tool with SPIFFE identity
+
+# 1. Agent authenticates with SPIFFE JWT-SVID
+agent_principal = await spiffe_authenticator.validate_svid(jwt_token)
+# AgentPrincipal(
+#     spiffe_id="spiffe://rag-modulo.example.com/agent/search-enricher/abc123",
+#     capabilities=["search:read", "llm:invoke"])
+
+# 2. Agent calls MCP tool
+response = await mcp_server.invoke_tool(
+    tool_name="rag_search",
+    arguments={"collection_id": "...", "query": "Q4 projections"},
+    auth_context=agent_principal
+)
+
+# 3. Inside the tool handler, the MCP tool validates the capability before running
+if "search:read" not in agent_principal.capabilities:
+    raise PermissionDenied("Agent lacks search:read capability")
+
+# 4. Audit log captures full chain
+# `agent` is the Agent record for this SPIFFE ID (lookup not shown)
+logger.info(
+    "MCP tool invoked",
+    agent_spiffe_id=agent_principal.spiffe_id,
+    tool="rag_search",
+    owner_user_id=str(agent.owner_user_id)
+)
+```
+
+## Security Considerations
+
+1. **Authentication Required**: All MCP endpoints require authentication
+2. **Capability Validation**: Every tool invocation checks agent capabilities
+3. **Collection Scoping**: Agents can only access authorized collections
+4. **Rate Limiting**: Per-agent rate limits prevent abuse
+5. **Audit Logging**: All tool invocations logged with identity context
+6. **Token Expiration**: JWT-SVIDs have short lifetimes (15 minutes)
+7. **Revocation**: Agents can be suspended/revoked immediately
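+
+Considerations 2 and 7 can be enforced in one authorization step. The sketch below is
+illustrative only: `AgentPrincipal` is a minimal stand-in that is assumed to carry the agent's
+`status`, and `authorize` and `PermissionDenied` are hypothetical names, not the API from PR #695.
+
+```python
+from dataclasses import dataclass, field
+
+
+class PermissionDenied(Exception):
+    """Raised when an agent may not invoke a tool."""
+
+
+@dataclass(frozen=True)
+class AgentPrincipal:
+    # Minimal stand-in; the real principal may carry more fields.
+    spiffe_id: str
+    capabilities: frozenset[str] = field(default_factory=frozenset)
+    status: str = "active"  # active, suspended, revoked, pending
+
+
+def authorize(principal: AgentPrincipal, required_capability: str) -> None:
+    """Reject non-active agents first, then check the capability."""
+    if principal.status != "active":
+        raise PermissionDenied(f"Agent is {principal.status}")
+    if required_capability not in principal.capabilities:
+        raise PermissionDenied(f"Agent lacks {required_capability} capability")
+```
+
+Checking `status` before capabilities means a suspended or revoked agent is rejected even if
+its stored capability list is still intact.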
+
+## Observability
+
+- OpenTelemetry spans for all MCP operations
+- Metrics: tool invocation counts, latency, error rates
+- Structured logging with agent identity context
+- Integration with Context Forge admin UI
+
+## Related Documents
+
+- [MCP Integration Architecture](./mcp-integration-architecture.md)
+- [SearchService Agent Hooks Architecture](./search-agent-hooks-architecture.md)
+- [SPIRE Integration Architecture](./spire-integration-architecture.md)
diff --git a/docs/architecture/search-agent-hooks-architecture.md b/docs/architecture/search-agent-hooks-architecture.md
new file mode 100644
index 00000000..8ee862a9
--- /dev/null
+++ b/docs/architecture/search-agent-hooks-architecture.md
@@ -0,0 +1,416 @@
+# SearchService Agent Hooks Architecture
+
+**Date**: November 2025
+**Status**: Architecture Design
+**Version**: 1.0
+**Related Documents**: [MCP Integration Architecture](./mcp-integration-architecture.md)
+
+## Overview
+
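+This document describes the three-stage agent execution hook system integrated into
+SearchService. Agents can be injected at three points in the search pipeline (before vector
+search, after document retrieval, and after answer generation) to transform the query,
+filter or enrich retrieved documents, or generate artifacts from the final answer.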
+
+## Pipeline Flow
+
+```
+User Query: "What are the revenue projections for Q4?"
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  STAGE 1: PRE-SEARCH AGENTS                                                  │
+│                                                                              │
+│  Purpose: Enhance/transform the query BEFORE vector search                   │
+│                                                                              │
+│  Example agents:                                                             │
+│  ┌────────────────────────────────────────────────────────────────────────┐ │
+│  │ • Query Expander: "revenue projections Q4" →                           │ │
+│  │   "revenue projections Q4 2024 2025 forecast financial outlook"        │ │
+│  │                                                                        │ │
+│  │ • Language Detector/Translator: Detect non-English, translate to EN    │ │
+│  │                                                                        │ │
+│  │ • Acronym Resolver: "Q4" → "fourth quarter, Q4, Oct-Dec"               │ │
+│  │                                                                        │ │
+│  │ • Intent Classifier: Tag as "financial_analysis" for routing           │ │
+│  └────────────────────────────────────────────────────────────────────────┘ │
+│                                                                              │
+│  Input:  { query: "What are the revenue projections for Q4?" }              │
+│  Output: { query: "revenue projections Q4 2024 forecast...", metadata: {} } │
+└─────────────────────────────────────────────────────────────────────────────┘
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  CORE RAG SEARCH (existing logic - unchanged)                                │
+│                                                                              │
+│  • Vector embedding of (enhanced) query                                     │
+│  • Milvus similarity search                                                 │
+│  • Document retrieval                                                       │
+│  • Optional: Chain-of-Thought reasoning                                     │
+│                                                                              │
+│  Output: 10 ranked documents with scores                                    │
+└─────────────────────────────────────────────────────────────────────────────┘
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  STAGE 2: POST-SEARCH AGENTS                                                 │
+│                                                                              │
+│  Purpose: Process/filter/augment retrieved documents BEFORE answer gen      │
+│                                                                              │
+│  Example agents:                                                             │
+│  ┌────────────────────────────────────────────────────────────────────────┐ │
+│  │ • Re-ranker: Use cross-encoder to re-score documents for relevance     │ │
+│  │                                                                        │ │
+│  │ • Deduplicator: Remove near-duplicate content across documents         │ │
+│  │                                                                        │ │
+│  │ • Fact Checker: Validate claims against trusted sources                │ │
+│  │                                                                        │ │
+│  │ • PII Redactor: Remove sensitive info before showing to user           │ │
+│  │                                                                        │ │
+│  │ • External Enricher: Add real-time stock prices, weather, etc.         │ │
+│  │   (This is what SearchResultEnricher does)                             │ │
+│  └────────────────────────────────────────────────────────────────────────┘ │
+│                                                                              │
+│  Input:  { documents: [...10 docs...], query: "..." }                       │
+│  Output: { documents: [...8 docs, reordered, enriched...] }                 │
+└─────────────────────────────────────────────────────────────────────────────┘
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  ANSWER GENERATION (existing logic - unchanged)                              │
+│                                                                              │
+│  • LLM synthesizes answer from documents                                    │
+│  • Source attribution                                                       │
+│  • CoT reasoning steps (if enabled)                                         │
+│                                                                              │
+│  Output: { answer: "Based on the documents...", sources: [...] }            │
+└─────────────────────────────────────────────────────────────────────────────┘
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  STAGE 3: RESPONSE AGENTS                                                    │
+│                                                                              │
+│  Purpose: Generate artifacts/transformations from the final answer           │
+│                                                                              │
+│  Example agents:                                                             │
+│  ┌────────────────────────────────────────────────────────────────────────┐ │
+│  │ • PowerPoint Generator: Create slides from answer + sources            │ │
+│  │   Output: { type: "pptx", data: "base64...", filename: "Q4.pptx" }     │ │
+│  │                                                                        │ │
+│  │ • PDF Report Generator: Formatted document with citations              │ │
+│  │   Output: { type: "pdf", data: "base64...", filename: "report.pdf" }   │ │
+│  │                                                                        │ │
+│  │ • Chart Generator: Visualize numerical data from answer                │ │
+│  │   Output: { type: "png", data: "base64...", filename: "chart.png" }    │ │
+│  │                                                                        │ │
+│  │ • Audio Summary: Text-to-speech of key findings                        │ │
+│  │   Output: { type: "mp3", data: "base64...", filename: "summary.mp3" }  │ │
+│  │                                                                        │ │
+│  │ • Email Draft: Format answer for email sharing                         │ │
+│  │   Output: { type: "html", data: "<html>...", subject: "Q4 Summary" }   │ │
+│  └────────────────────────────────────────────────────────────────────────┘ │
+│                                                                              │
+│  These run in PARALLEL since they're independent transformations            │
+└─────────────────────────────────────────────────────────────────────────────┘
+                │
+                ▼
+┌─────────────────────────────────────────────────────────────────────────────┐
+│  FINAL RESPONSE                                                              │
+│                                                                              │
+│  {                                                                           │
+│    "answer": "Based on the financial documents, Q4 revenue is...",          │
+│    "sources": [                                                              │
+│      { "document_id": "...", "title": "Q4 Forecast", "score": 0.92 }        │
+│    ],                                                                        │
+│    "cot_steps": [...],  // If CoT enabled                                   │
+│    "agent_artifacts": [  // NEW - from response agents                      │
+│      {                                                                       │
+│        "agent_id": "ppt_generator",                                         │
+│        "type": "pptx",                                                       │
+│        "data": "UEsDBBQAAAAIAH...",  // base64                              │
+│        "filename": "Q4_Revenue_Projections.pptx",                           │
+│        "metadata": { "slides": 5 }                                          │
+│      },                                                                      │
+│      {                                                                       │
+│        "agent_id": "chart_generator",                                       │
+│        "type": "png",                                                        │
+│        "data": "iVBORw0KGgo...",  // base64                                 │
+│        "filename": "revenue_chart.png",                                     │
+│        "metadata": { "width": 800, "height": 600 }                          │
+│      }                                                                       │
+│    ]                                                                         │
+│  }                                                                           │
+└─────────────────────────────────────────────────────────────────────────────┘
+```
+
+## Agent Stages
+
+### Stage 1: Pre-Search Agents
+
+**Purpose**: Transform or enhance the query before vector search.
+
+**Execution**: Sequential by priority (results chain to next agent).
+
+| Agent Type | Description | Use Case |
+|------------|-------------|----------|
+| Query Expander | Adds synonyms and related terms | Improve recall |
+| Language Detector | Identifies query language | Multi-language support |
+| Translator | Translates non-English queries | Internationalization |
+| Acronym Resolver | Expands abbreviations | Domain-specific search |
+| Intent Classifier | Tags query intent | Routing and filtering |
+| Spell Checker | Corrects typos | User experience |
+
+**Input Schema**:
+
+```python
+class PreSearchInput:
+    query: str
+    collection_id: UUID
+    user_id: UUID
+    metadata: dict[str, Any]
+```
+
+**Output Schema**:
+
+```python
+class PreSearchOutput:
+    query: str  # Modified query
+    metadata: dict[str, Any]  # Additional context
+    skip_search: bool = False  # If True, skip core search
+```
+
+### Stage 2: Post-Search Agents
+
+**Purpose**: Process, filter, or augment retrieved documents before answer generation.
+
+**Execution**: Sequential by priority (documents flow through each agent).
+
+| Agent Type | Description | Use Case |
+|------------|-------------|----------|
+| Re-ranker | Cross-encoder re-scoring | Improve precision |
+| Deduplicator | Remove near-duplicates | Cleaner results |
+| Fact Checker | Validate against trusted sources | Accuracy |
+| PII Redactor | Remove sensitive information | Compliance |
+| External Enricher | Add real-time data | Currency |
+| Relevance Filter | Remove low-quality results | Quality |
+
+**Input Schema**:
+
+```python
+class PostSearchInput:
+    documents: list[Document]
+    query: str
+    collection_id: UUID
+    user_id: UUID
+    metadata: dict[str, Any]
+```
+
+**Output Schema**:
+
+```python
+class PostSearchOutput:
+    documents: list[Document]  # Modified/filtered documents
+    metadata: dict[str, Any]  # Enrichment data
+```
+
+### Stage 3: Response Agents
+
+**Purpose**: Generate artifacts or transformations from the final answer.
+
+**Execution**: Parallel (independent transformations).
+
+| Agent Type | Description | Output Format |
+|------------|-------------|---------------|
+| PowerPoint Generator | Create presentation slides | `.pptx` |
+| PDF Report Generator | Formatted document with citations | `.pdf` |
+| Chart Generator | Visualize numerical data | `.png`, `.svg` |
+| Audio Summary | Text-to-speech narration | `.mp3` |
+| Email Draft | Format for email sharing | `.html` |
+| Executive Summary | Condensed key findings | `.txt` |
+
+**Input Schema**:
+
+```python
+class ResponseAgentInput:
+    answer: str
+    sources: list[Source]
+    query: str
+    documents: list[Document]
+    collection_id: UUID
+    user_id: UUID
+    cot_steps: list[CotStep] | None
+```
+
+**Output Schema**:
+
+```python
+class AgentArtifact:
+    agent_id: str
+    type: str  # "pptx", "pdf", "png", "mp3", "html"
+    data: str  # base64 encoded
+    filename: str
+    metadata: dict[str, Any]
+```
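+
+Because response agents are independent, they can be run concurrently. The following is a
+minimal sketch using `asyncio.gather`; the `ResponseAgent` signature and
+`run_response_agents` are assumptions made for illustration, and artifacts are plain dicts
+rather than `AgentArtifact` instances.
+
+```python
+import asyncio
+from typing import Awaitable, Callable
+
+ResponseAgent = Callable[[str], Awaitable[dict]]  # hypothetical: answer -> artifact dict
+
+
+async def run_response_agents(agents: dict[str, ResponseAgent], answer: str) -> list[dict]:
+    """Run all response agents concurrently and keep only successful artifacts."""
+    outcomes = await asyncio.gather(
+        *(agent(answer) for agent in agents.values()),
+        return_exceptions=True,  # one failing agent must not cancel the others
+    )
+    artifacts = []
+    for agent_id, outcome in zip(agents, outcomes):
+        if isinstance(outcome, BaseException):
+            continue  # a real implementation would log the failure here
+        artifacts.append({"agent_id": agent_id, **outcome})
+    return artifacts
+```
+
+`asyncio.gather` returns results in the order the awaitables were passed, so zipping against
+the `agents` dict keeps each artifact paired with the agent that produced it.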
+
+## Agent Priority and Chaining
+
+Within the sequential stages (pre-search and post-search), agents execute in priority order
+(lower number = higher priority):
+
+```
+Pre-search stage (priority order):
+  1. Language Detector (priority: 0)  → detects "es" (Spanish)
+  2. Translator (priority: 10)        → uses detection, translates to EN
+  3. Query Expander (priority: 20)    → expands the translated query
+
+Each agent receives:
+  - AgentContext with query, collection_id, user_id
+  - previous_agent_results: List of results from earlier agents in this stage
+```
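+
+The chaining described above can be sketched as a simple loop. This is an illustration under
+assumptions: `RegisteredAgent` and `run_pre_search` are hypothetical names, and results are
+plain dicts rather than `AgentResult` objects.
+
+```python
+from dataclasses import dataclass
+from typing import Callable
+
+
+@dataclass
+class RegisteredAgent:
+    # Hypothetical pairing of an agent callable with its configured priority.
+    agent_id: str
+    priority: int
+    run: Callable[[str, list[dict]], dict]  # (query, previous_results) -> result
+
+
+def run_pre_search(agents: list[RegisteredAgent], query: str) -> tuple[str, list[dict]]:
+    """Run agents in ascending priority; each sees all earlier results."""
+    results: list[dict] = []
+    for agent in sorted(agents, key=lambda a: a.priority):
+        result = agent.run(query, results)
+        # An agent may rewrite the query; otherwise it passes through unchanged.
+        query = result.get("query", query)
+        results.append({"agent_id": agent.agent_id, **result})
+    return query, results
+```
+
+With this shape, a Translator at priority 10 can read the Language Detector's result from the
+list it receives, and the Query Expander at priority 20 operates on the translated query.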
+
+## AgentContext
+
+Context object passed to all agents:
+
+```python
+@dataclass
+class AgentContext:
+    # Collection context
+    collection_id: UUID
+    user_id: UUID
+
+    # Pipeline context (required, so it must precede the defaulted fields)
+    pipeline_stage: str  # 'pre_search', 'post_search', 'response'
+
+    # Conversation context
+    conversation_id: UUID | None = None
+    conversation_history: list[dict[str, str]] | None = None
+
+    # Search context (populated as pipeline progresses)
+    query: str | None = None
+    retrieved_documents: list[dict[str, Any]] | None = None
+    search_metadata: dict[str, Any] | None = None
+
+    # Agent chaining
+    previous_agent_results: list[AgentResult] | None = None
+```
+
+## AgentResult
+
+Result object returned by all agents:
+
+```python
+@dataclass
+class AgentResult:
+    agent_id: str
+    success: bool
+    data: dict[str, Any]
+    metadata: dict[str, Any]
+    errors: list[str] | None = None
+
+    # For chaining agents
+    next_agent_id: str | None = None
+```
+
+## Collection-Agent Association
+
+Agents are configured per collection:
+
+```
+Collection Settings → Agents & Tools
+┌─────────────────────────────────────────────────────────────────────────┐
+│ ☑ PowerPoint Generator              Stage: Response   Priority: 1      │
+│   Creates slides from search results                   [Configure]     │
+├─────────────────────────────────────────────────────────────────────────┤
+│ ☑ Query Expander                    Stage: Pre-search  Priority: 0     │
+│   Adds synonyms and related terms                     [Configure]      │
+├─────────────────────────────────────────────────────────────────────────┤
+│ ☐ External Knowledge Enricher       Stage: Post-search Priority: 5     │
+│   Augments with real-time market data                 [Configure]      │
+└─────────────────────────────────────────────────────────────────────────┘
+```
+
+## Database Schema
+
+### AgentConfig Table
+
+```sql
+CREATE TABLE agent_configs (
+    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
+    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
+    agent_id VARCHAR(100) NOT NULL,  -- From agent registry
+    name VARCHAR(255) NOT NULL,
+    description TEXT,
+    config JSONB NOT NULL DEFAULT '{}',  -- Agent-specific settings
+    enabled BOOLEAN NOT NULL DEFAULT true,
+    trigger_stage VARCHAR(50) NOT NULL,  -- 'pre_search', 'post_search', 'response'
+    priority INTEGER NOT NULL DEFAULT 0,
+    created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
+    updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP
+);
+
+-- Many-to-many: Collections ↔ AgentConfigs
+CREATE TABLE collection_agents (
+    collection_id UUID NOT NULL REFERENCES collections(id) ON DELETE CASCADE,
+    agent_config_id UUID NOT NULL REFERENCES agent_configs(id) ON DELETE CASCADE,
+    PRIMARY KEY (collection_id, agent_config_id)
+);
+
+-- Indexes
+CREATE INDEX idx_agent_configs_user_id ON agent_configs(user_id);
+CREATE INDEX idx_agent_configs_trigger_stage ON agent_configs(trigger_stage);
+CREATE INDEX idx_agent_configs_enabled ON agent_configs(enabled);
+```
+
+### Example AgentConfig
+
+```json
+{
+  "id": "abc123...",
+  "user_id": "user456...",
+  "agent_id": "ppt_generator",
+  "name": "PowerPoint Generator",
+  "config": {
+    "type": "mcp",
+    "context_forge_tool_id": "generate_powerpoint",
+    "argument_mapping": {
+      "title": "query",
+      "documents": "documents",
+      "max_slides": "config.max_slides"
+    },
+    "settings": {
+      "max_slides": 15,
+      "template": "corporate"
+    }
+  },
+  "enabled": true,
+  "trigger_stage": "response",
+  "priority": 10
+}
+```
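+
+One way `argument_mapping` could be applied is to treat each value as a dotted path into the
+pipeline context. The sketch below is not the actual implementation: `resolve_path` and
+`build_tool_arguments` are hypothetical helpers, and it assumes that the `config` prefix in a
+path refers to the agent's `settings` block, which this document does not specify.
+
+```python
+from typing import Any
+
+
+def resolve_path(context: dict[str, Any], path: str) -> Any:
+    """Follow a dotted path such as 'config.max_slides' through nested dicts."""
+    value: Any = context
+    for key in path.split("."):
+        value = value[key]  # raises KeyError if the path does not exist
+    return value
+
+
+def build_tool_arguments(mapping: dict[str, str], context: dict[str, Any]) -> dict[str, Any]:
+    """Map each tool argument name to the value found at its context path."""
+    return {arg: resolve_path(context, path) for arg, path in mapping.items()}
+```
+
+Letting a missing path raise `KeyError` surfaces a misconfigured mapping at invocation time
+instead of silently passing `None` to the tool.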
+
+## Error Handling
+
+- **Agent Timeout**: Each agent has a configurable timeout (default: 30s)
+- **Agent Failure**: The failure is logged, the agent is skipped, and the pipeline continues
+- **Circuit Breaker**: An agent that keeps failing is disabled once it crosses a failure threshold
+- **Fallback**: Critical stages can optionally define fallback agents
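+
+The timeout and skip-on-failure rules could look like the following for a sequential stage.
+This is a sketch only: `QueryAgent` and `run_with_timeout` are hypothetical, agents are
+reduced to "query in, query out", and logging, circuit breaking, and fallbacks are omitted.
+
+```python
+import asyncio
+from typing import Awaitable, Callable
+
+QueryAgent = Callable[[str], Awaitable[str]]  # hypothetical: query in, query out
+
+
+async def run_with_timeout(
+    agents: list[QueryAgent], query: str, timeout_s: float = 30.0
+) -> str:
+    """Chain agents sequentially; a slow or failing agent leaves the query unchanged."""
+    for agent in agents:
+        try:
+            query = await asyncio.wait_for(agent(query), timeout=timeout_s)
+        except Exception:
+            # Covers timeouts and agent errors alike; real code would log this.
+            continue
+    return query
+```
+
+A skipped agent simply passes its input through, so later agents in the stage still run on
+the last successfully produced query.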
+
+## Performance Considerations
+
+1. **Pre-search agents**: Run sequentially (query transformation order matters)
+2. **Post-search agents**: Run sequentially (document filtering order matters)
+3. **Response agents**: Run in parallel (independent artifact generation)
+4. **Caching**: Agent results cached by (query_hash, agent_id, config_hash)
+5. **Timeouts**: Per-agent and per-stage timeouts prevent runaway execution
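+
+The cache key from item 4 could be derived as shown below. The choice of SHA-256 and of
+canonical JSON for the config is an assumption, as is the helper name `agent_cache_key`.
+
+```python
+import hashlib
+import json
+from typing import Any
+
+
+def _digest(text: str) -> str:
+    return hashlib.sha256(text.encode("utf-8")).hexdigest()
+
+
+def agent_cache_key(query: str, agent_id: str, config: dict[str, Any]) -> tuple[str, str, str]:
+    """Build the (query_hash, agent_id, config_hash) key."""
+    # sort_keys makes the hash independent of dict insertion order.
+    config_json = json.dumps(config, sort_keys=True, separators=(",", ":"))
+    return _digest(query), agent_id, _digest(config_json)
+```
+
+Including the config hash means that changing an agent's settings invalidates its cached
+results without any explicit cache flush.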
+
+## Observability
+
+- All agent executions logged with structured context
+- OpenTelemetry spans for each agent invocation
+- Metrics: execution time, success rate, artifact sizes
+- Traces flow through Context Forge for end-to-end visibility
+
+## Related Documents
+
+- [MCP Integration Architecture](./mcp-integration-architecture.md)
+- [RAG Modulo MCP Server Architecture](./rag-modulo-mcp-server-architecture.md)
+- [Agent MCP Architecture Design](../design/agent-mcp-architecture.md)
diff --git a/docs/architecture/system-architecture.md b/docs/architecture/system-architecture.md
new file mode 100644
index 00000000..ddff44b6
--- /dev/null
+++ b/docs/architecture/system-architecture.md
@@ -0,0 +1,425 @@
+# RAG Modulo System Architecture
+
+## Repository Overview
+
+**RAG Modulo** is a production-ready Retrieval-Augmented Generation (RAG) platform that enables
+intelligent document processing, semantic search, and AI-powered question answering. The system
+combines enterprise-grade document processing with advanced AI reasoning capabilities to provide
+accurate, context-aware answers from large document collections.
+
+### Key Capabilities
+
+1. **Document Processing**: Supports multiple formats (PDF, DOCX, XLSX, TXT) with advanced
+   processing via IBM Docling for tables, images, and complex layouts
+2. **Intelligent Search**: Vector similarity search with hybrid strategies, reranking, and source attribution
+3. **Chain of Thought Reasoning**: Automatic question decomposition with step-by-step reasoning for complex queries
+4. **Multi-LLM Support**: Seamless integration with WatsonX, OpenAI, and Anthropic
+5. **Multi-Vector Database**: Pluggable support for Milvus, Elasticsearch, Pinecone, Weaviate, and ChromaDB
+6. **Conversational Interface**: Multi-turn conversations with context preservation
+7. **Podcast Generation**: AI-powered podcast creation from document collections
+8. **Voice Synthesis**: Text-to-speech capabilities with multiple providers
+
+## System Architecture Diagram
+
+```mermaid
+graph TB
+    subgraph "Client Layer"
+        WEB[React Web Frontend<br/>TypeScript + Tailwind CSS<br/>Carbon Design System]
+        CLI[CLI Client<br/>rag-cli commands]
+        API_CLIENT[External API Clients<br/>REST/WebSocket]
+    end
+
+    subgraph "API Gateway Layer"
+        FASTAPI[FastAPI Application<br/>main.py<br/>Port 8000]
+
+        subgraph "Middleware Stack"
+            CORS[LoggingCORSMiddleware<br/>CORS + Request Logging]
+            SESSION[SessionMiddleware<br/>Session Management]
+            AUTH_MW[AuthenticationMiddleware<br/>SPIFFE/OIDC Validation]
+        end
+    end
+
+    subgraph "Router Layer - REST Endpoints"
+        AUTH_R["/auth<br/>Authentication"]
+        SEARCH_R["/api/search<br/>RAG Search"]
+        COLLECTION_R["/api/collections<br/>Document Management"]
+        CHAT_R["/api/chat<br/>Conversational Interface"]
+        CONV_R["/api/conversations<br/>Session Management"]
+        PODCAST_R["/api/podcast<br/>Podcast Generation"]
+        VOICE_R["/api/voice<br/>Voice Synthesis"]
+        AGENT_R["/api/agents<br/>SPIFFE Agent Management"]
+        USER_R["/api/users<br/>User Management"]
+        TEAM_R["/api/teams<br/>Team Collaboration"]
+        DASH_R["/api/dashboard<br/>Analytics"]
+        HEALTH_R["/api/health<br/>Health Checks"]
+        WS_R["/ws<br/>WebSocket"]
+    end
+
+    subgraph "Service Layer - Business Logic"
+        SEARCH_SVC[SearchService<br/>RAG Orchestration]
+        CONV_SVC[ConversationService<br/>Multi-turn Context]
+        MSG_ORCH[MessageProcessingOrchestrator<br/>Message Flow]
+        COLLECTION_SVC[CollectionService<br/>Collection Management]
+        FILE_SVC[FileManagementService<br/>File Operations]
+        PODCAST_SVC[PodcastService<br/>Content Generation]
+        VOICE_SVC[VoiceService<br/>Audio Synthesis]
+        AGENT_SVC[AgentService<br/>SPIFFE Identity]
+        USER_SVC[UserService<br/>User Operations]
+        TEAM_SVC[TeamService<br/>Team Operations]
+        DASH_SVC[DashboardService<br/>Analytics]
+        PIPELINE_SVC[PipelineService<br/>Pipeline Execution]
+        COT_SVC[ChainOfThoughtService<br/>Reasoning Engine]
+        ANSWER_SYNTH[AnswerSynthesizer<br/>Answer Generation]
+        CITATION_SVC[CitationAttributionService<br/>Source Attribution]
+    end
+
+    subgraph "RAG Pipeline Architecture - 6 Stages"
+        PIPELINE_EXEC[PipelineExecutor<br/>Orchestrates Stages]
+        SEARCH_CTX[SearchContext<br/>State Management]
+
+        STAGE1[Stage 1: Pipeline Resolution<br/>Resolve User Pipeline Config]
+        STAGE2[Stage 2: Query Enhancement<br/>Rewrite/Enhance Query]
+        STAGE3[Stage 3: Retrieval<br/>Vector Similarity Search]
+        STAGE4[Stage 4: Reranking<br/>Relevance Scoring]
+        STAGE5[Stage 5: Reasoning<br/>Chain of Thought]
+        STAGE6[Stage 6: Generation<br/>LLM Answer Synthesis]
+    end
+
+    subgraph "Document Ingestion Pipeline"
+        DOC_STORE[DocumentStore<br/>Ingestion Orchestration]
+        DOC_PROC[DocumentProcessor<br/>Format Router]
+
+        PDF_PROC[PdfProcessor<br/>PyMuPDF + OCR]
+        DOCLING_PROC[DoclingProcessor<br/>IBM Docling<br/>Tables/Images]
+        WORD_PROC[WordProcessor<br/>DOCX Support]
+        EXCEL_PROC[ExcelProcessor<br/>XLSX Support]
+        TXT_PROC[TxtProcessor<br/>Plain Text]
+
+        CHUNKING[Chunking Strategies<br/>Sentence/Semantic/Hierarchical]
+        EMBEDDING[Embedding Generation<br/>Vector Creation]
+    end
+
+    subgraph "Retrieval Layer"
+        RETRIEVER[Retriever<br/>Vector Search]
+        RERANKER[Reranker<br/>Relevance Scoring]
+        QUERY_REWRITER[QueryRewriter<br/>Query Optimization]
+    end
+
+    subgraph "Generation Layer"
+        LLM_FACTORY[LLMProviderFactory<br/>Provider Management]
+
+        WATSONX[WatsonX Provider<br/>IBM WatsonX AI]
+        OPENAI[OpenAI Provider<br/>GPT Models]
+        ANTHROPIC[Anthropic Provider<br/>Claude Models]
+
+        AUDIO_FACTORY[AudioFactory<br/>Audio Provider Management]
+        ELEVENLABS[ElevenLabs Audio<br/>Voice Synthesis]
+        OPENAI_AUDIO[OpenAI Audio<br/>TTS]
+        OLLAMA_AUDIO[Ollama Audio<br/>Local TTS]
+    end
+
+    subgraph "Repository Layer - Data Access"
+        USER_REPO[UserRepository]
+        COLLECTION_REPO[CollectionRepository]
+        FILE_REPO[FileRepository]
+        CONV_REPO[ConversationRepository]
+        AGENT_REPO[AgentRepository]
+        PODCAST_REPO[PodcastRepository]
+        VOICE_REPO[VoiceRepository]
+        TEAM_REPO[TeamRepository]
+        PIPELINE_REPO[PipelineRepository]
+        LLM_REPO[LLMProviderRepository]
+    end
+
+    subgraph "Data Persistence Layer"
+        POSTGRES[(PostgreSQL<br/>Port 5432<br/>Metadata & Config)]
+
+        VECTOR_DB[(Vector Database<br/>Abstracted Interface)]
+        MILVUS[Milvus<br/>Primary Vector DB<br/>Port 19530]
+        PINECONE[Pinecone<br/>Cloud Vector DB]
+        WEAVIATE[Weaviate<br/>GraphQL Vector DB]
+        ELASTICSEARCH[Elasticsearch<br/>Search Engine]
+        CHROMA[ChromaDB<br/>Lightweight Vector DB]
+    end
+
+    subgraph "Object Storage"
+        MINIO[(MinIO<br/>Port 9000<br/>Object Storage<br/>Files & Audio)]
+    end
+
+    subgraph "External Services"
+        SPIRE[SPIRE Server<br/>SPIFFE Workload Identity<br/>Agent Authentication]
+        OIDC[OIDC Provider<br/>IBM AppID<br/>User Authentication]
+        MLFLOW[MLFlow<br/>Port 5001<br/>Model Tracking]
+    end
+
+    subgraph "Core Infrastructure"
+        CONFIG[Settings/Config<br/>Pydantic Settings<br/>Environment Variables]
+        LOGGING[Logging Utils<br/>Structured Logging<br/>Context Tracking]
+        IDENTITY[Identity Service<br/>User/Agent Identity]
+        EXCEPTIONS[Custom Exceptions<br/>Domain Errors]
+    end
+
+    %% Client to API Gateway
+    WEB -->|HTTP/WebSocket| FASTAPI
+    CLI -->|HTTP| FASTAPI
+    API_CLIENT -->|REST API| FASTAPI
+
+    %% Middleware Flow
+    FASTAPI --> CORS
+    CORS --> SESSION
+    SESSION --> AUTH_MW
+
+    %% Router Registration
+    AUTH_MW --> AUTH_R
+    AUTH_MW --> SEARCH_R
+    AUTH_MW --> COLLECTION_R
+    AUTH_MW --> CHAT_R
+    AUTH_MW --> CONV_R
+    AUTH_MW --> PODCAST_R
+    AUTH_MW --> VOICE_R
+    AUTH_MW --> AGENT_R
+    AUTH_MW --> USER_R
+    AUTH_MW --> TEAM_R
+    AUTH_MW --> DASH_R
+    AUTH_MW --> HEALTH_R
+    AUTH_MW --> WS_R
+
+    %% Router to Service
+    SEARCH_R --> SEARCH_SVC
+    CHAT_R --> CONV_SVC
+    CONV_R --> CONV_SVC
+    CONV_SVC --> MSG_ORCH
+    MSG_ORCH --> SEARCH_SVC
+    COLLECTION_R --> COLLECTION_SVC
+    COLLECTION_SVC --> FILE_SVC
+    PODCAST_R --> PODCAST_SVC
+    VOICE_R --> VOICE_SVC
+    AGENT_R --> AGENT_SVC
+    USER_R --> USER_SVC
+    TEAM_R --> TEAM_SVC
+    DASH_R --> DASH_SVC
+
+    %% Search Service to Pipeline
+    SEARCH_SVC --> PIPELINE_EXEC
+    PIPELINE_EXEC --> STAGE1
+    STAGE1 --> STAGE2
+    STAGE2 --> STAGE3
+    STAGE3 --> STAGE4
+    STAGE4 --> STAGE5
+    STAGE5 --> STAGE6
+    PIPELINE_EXEC --> SEARCH_CTX
+
+    %% Pipeline Stages to Services
+    STAGE1 --> PIPELINE_SVC
+    STAGE2 --> QUERY_REWRITER
+    STAGE3 --> RETRIEVER
+    STAGE4 --> RERANKER
+    STAGE5 --> COT_SVC
+    STAGE6 --> ANSWER_SYNTH
+
+    %% Retrieval to Vector DB
+    RETRIEVER --> VECTOR_DB
+    VECTOR_DB --> MILVUS
+    VECTOR_DB --> PINECONE
+    VECTOR_DB --> WEAVIATE
+    VECTOR_DB --> ELASTICSEARCH
+    VECTOR_DB --> CHROMA
+
+    %% Generation Layer
+    ANSWER_SYNTH --> LLM_FACTORY
+    LLM_FACTORY --> WATSONX
+    LLM_FACTORY --> OPENAI
+    LLM_FACTORY --> ANTHROPIC
+    PODCAST_SVC --> LLM_FACTORY
+    VOICE_SVC --> AUDIO_FACTORY
+    AUDIO_FACTORY --> ELEVENLABS
+    AUDIO_FACTORY --> OPENAI_AUDIO
+    AUDIO_FACTORY --> OLLAMA_AUDIO
+
+    %% Data Ingestion Flow
+    FILE_SVC --> DOC_STORE
+    DOC_STORE --> DOC_PROC
+    DOC_PROC --> PDF_PROC
+    DOC_PROC --> DOCLING_PROC
+    DOC_PROC --> WORD_PROC
+    DOC_PROC --> EXCEL_PROC
+    DOC_PROC --> TXT_PROC
+    DOC_PROC --> CHUNKING
+    CHUNKING --> EMBEDDING
+    DOC_STORE --> VECTOR_DB
+    DOC_STORE --> MINIO
+
+    %% Service to Repository
+    USER_SVC --> USER_REPO
+    COLLECTION_SVC --> COLLECTION_REPO
+    FILE_SVC --> FILE_REPO
+    CONV_SVC --> CONV_REPO
+    AGENT_SVC --> AGENT_REPO
+    PODCAST_SVC --> PODCAST_REPO
+    VOICE_SVC --> VOICE_REPO
+    TEAM_SVC --> TEAM_REPO
+    PIPELINE_SVC --> PIPELINE_REPO
+    PIPELINE_SVC --> LLM_REPO
+
+    %% Repository to Database
+    USER_REPO --> POSTGRES
+    COLLECTION_REPO --> POSTGRES
+    FILE_REPO --> POSTGRES
+    CONV_REPO --> POSTGRES
+    AGENT_REPO --> POSTGRES
+    PODCAST_REPO --> POSTGRES
+    VOICE_REPO --> POSTGRES
+    TEAM_REPO --> POSTGRES
+    PIPELINE_REPO --> POSTGRES
+    LLM_REPO --> POSTGRES
+
+    %% Authentication
+    AUTH_MW --> SPIRE
+    AUTH_MW --> OIDC
+    AGENT_SVC --> SPIRE
+
+    %% Storage
+    FILE_SVC --> MINIO
+    PODCAST_SVC --> MINIO
+    VOICE_SVC --> MINIO
+
+    %% Core Infrastructure
+    FASTAPI --> CONFIG
+    FASTAPI --> LOGGING
+    AUTH_MW --> IDENTITY
+    SEARCH_SVC --> EXCEPTIONS
+    CONV_SVC --> EXCEPTIONS
+
+    %% Styling
+    style FASTAPI fill:#4A90E2,stroke:#2E5C8A,stroke-width:3px
+    style PIPELINE_EXEC fill:#50C878,stroke:#2D8659,stroke-width:2px
+    style VECTOR_DB fill:#FF6B6B,stroke:#C92A2A,stroke-width:2px
+    style POSTGRES fill:#4ECDC4,stroke:#2D7D7D,stroke-width:2px
+    style LLM_FACTORY fill:#FFD93D,stroke:#CC9900,stroke-width:2px
+    style DOC_STORE fill:#9B59B6,stroke:#6C3483,stroke-width:2px
+    style WEB fill:#61DAFB,stroke:#20232A,stroke-width:2px
+    style MINIO fill:#FFA500,stroke:#CC7700,stroke-width:2px
+```
+
+## Architecture Layers Explained
+
+### 1. Client Layer
+
+- **React Web Frontend**: Modern TypeScript/React application with Carbon Design System
+- **CLI Client**: Command-line interface for automation and scripting
+- **API Clients**: External integrations via REST/WebSocket
+
+### 2. API Gateway Layer
+
+- **FastAPI Application**: Main entry point handling HTTP requests
+- **Middleware Stack**: CORS, session management, and authentication
+
+### 3. Router Layer
+
+RESTful endpoints organized by domain (auth, search, collections, chat, and so on), plus health-check and WebSocket routers.
+
+### 4. Service Layer
+
+Business-logic services that orchestrate operations across repositories and external services.
+
+### 5. RAG Pipeline (6 Stages)
+
+1. **Pipeline Resolution**: Determines user's default pipeline configuration
+2. **Query Enhancement**: Rewrites/enhances queries for better retrieval
+3. **Retrieval**: Performs vector similarity search
+4. **Reranking**: Scores and reranks results for relevance
+5. **Reasoning**: Applies Chain of Thought for complex questions
+6. **Generation**: Synthesizes final answer using LLM
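+
+The stage ordering above can be illustrated with a minimal sketch. This is not the project's
+implementation: `NamedStage`, the `trace` field, and the stage name strings are hypothetical
+placeholders, and only the idea of an executor passing a shared search context through six
+ordered stages comes from the diagram.
+
+```python
+from dataclasses import dataclass, field
+from typing import Protocol
+
+
+@dataclass
+class SearchContext:
+    """Shared state passed from stage to stage (field names are hypothetical)."""
+    query: str
+    trace: list[str] = field(default_factory=list)
+
+
+class Stage(Protocol):
+    name: str
+
+    def run(self, ctx: SearchContext) -> SearchContext: ...
+
+
+@dataclass
+class NamedStage:
+    """Placeholder stage that only records that it ran."""
+    name: str
+
+    def run(self, ctx: SearchContext) -> SearchContext:
+        ctx.trace.append(self.name)
+        return ctx
+
+
+class PipelineExecutor:
+    def __init__(self, stages: list[Stage]) -> None:
+        self.stages = stages
+
+    def execute(self, ctx: SearchContext) -> SearchContext:
+        # Stages run strictly in order; each receives the previous stage's context.
+        for stage in self.stages:
+            ctx = stage.run(ctx)
+        return ctx
+
+
+STAGE_NAMES = [
+    "pipeline_resolution", "query_enhancement", "retrieval",
+    "reranking", "reasoning", "generation",
+]
+executor = PipelineExecutor([NamedStage(n) for n in STAGE_NAMES])
+result = executor.execute(SearchContext(query="What is RAG?"))
+```
+
+Keeping all per-request state on one context object means a stage can be added, removed, or
+reordered without changing the signatures of the others.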
+
+### 6. Document Ingestion Pipeline
+
+- Processes multiple document formats
+- Applies chunking strategies
+- Generates embeddings
+- Stores in vector database and object storage
+
+### 7. Data Persistence
+
+- **PostgreSQL**: Metadata, configuration, user data
+- **Vector Databases**: Pluggable support for multiple vector DBs
+- **MinIO**: Object storage for files and generated content
+
+### 8. External Services
+
+- **SPIRE**: SPIFFE workload identity for agent authentication
+- **OIDC**: User authentication via IBM App ID
+- **MLflow**: Model tracking and experimentation
+
+## Key Data Flows
+
+### Search Request Flow
+
+1. Client → FastAPI → Search Router
+2. Search Router → SearchService
+3. SearchService → PipelineExecutor
+4. PipelineExecutor runs the six stages sequentially, passing a shared SearchContext between them
+5. RetrievalStage queries Vector Database
+6. GenerationStage calls LLM Provider
+7. Response flows back through layers
+
+### Document Ingestion Flow
+
+1. Client → Collection Router → CollectionService → FileManagementService
+2. FileManagementService → DocumentStore
+3. DocumentStore → DocumentProcessor → Format-specific Processor
+4. Processor → Chunking Strategy → Embeddings
+5. Embeddings → Vector Database
+6. Original files → MinIO Object Storage
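+
+Steps 4 and 5 can be sketched as follows, assuming a simple fixed-size character chunker with
+overlap. The names `fixed_size_chunks`, `toy_embedding`, and `ingest` are hypothetical, and the
+embedding function is a stand-in rather than a real model; the actual chunking strategies and
+embedding calls in the codebase may differ.
+
+```python
+from typing import Callable
+
+ChunkingStrategy = Callable[[str], list[str]]
+
+
+def fixed_size_chunks(size: int, overlap: int) -> ChunkingStrategy:
+    """Return a strategy that splits text into overlapping character windows."""
+    if not 0 <= overlap < size:
+        raise ValueError("overlap must be in [0, size)")
+    step = size - overlap
+
+    def split(text: str) -> list[str]:
+        return [text[i:i + size] for i in range(0, len(text), step)]
+
+    return split
+
+
+def toy_embedding(chunk: str) -> list[float]:
+    # Stand-in for a real embedding model: two trivially computed features.
+    return [float(len(chunk)), float(sum(map(ord, chunk)) % 97)]
+
+
+def ingest(text: str, chunker: ChunkingStrategy) -> list[tuple[str, list[float]]]:
+    """Chunk the text, embed each chunk, and return the records to be stored."""
+    return [(chunk, toy_embedding(chunk)) for chunk in chunker(text)]
+
+
+records = ingest("abcdefghij", fixed_size_chunks(size=4, overlap=1))
+```
+
+Because the chunker is passed in as a plain callable, a different strategy can be swapped in
+without touching `ingest`.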
+
+### Conversation Flow
+
+1. Client → Chat or Conversation Router → ConversationService
+2. ConversationService → MessageProcessingOrchestrator
+3. Orchestrator → SearchService (with conversation context)
+4. SearchService executes pipeline with context
+5. Response saved via ConversationRepository → PostgreSQL
+
+## Design Patterns
+
+- **Repository Pattern**: Data access abstraction
+- **Factory Pattern**: LLM provider, audio provider, and vector DB instantiation
+- **Strategy Pattern**: Chunking strategies, LLM providers
+- **Pipeline Pattern**: Stage-based RAG processing
+- **Dependency Injection**: Services and repositories
+- **Middleware Pattern**: Cross-cutting concerns
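+
+As an illustration of the factory pattern listed above, here is a minimal registry-based sketch.
+`LLMProviderFactory`, `EchoProvider`, and the `register`/`create` methods are hypothetical names
+chosen for this example and are not taken from the codebase; a real provider would call a hosted
+model instead of echoing the prompt.
+
+```python
+from typing import Callable, Protocol
+
+
+class LLMProvider(Protocol):
+    def generate(self, prompt: str) -> str: ...
+
+
+class EchoProvider:
+    """Stand-in provider; a real one would call a hosted model."""
+
+    def __init__(self, label: str) -> None:
+        self.label = label
+
+    def generate(self, prompt: str) -> str:
+        return f"[{self.label}] {prompt}"
+
+
+class LLMProviderFactory:
+    def __init__(self) -> None:
+        self._builders: dict[str, Callable[[], LLMProvider]] = {}
+
+    def register(self, name: str, builder: Callable[[], LLMProvider]) -> None:
+        # Names are stored lowercased so lookups are case-insensitive.
+        self._builders[name.lower()] = builder
+
+    def create(self, name: str) -> LLMProvider:
+        try:
+            return self._builders[name.lower()]()
+        except KeyError:
+            raise ValueError(f"unknown LLM provider: {name}") from None
+
+
+factory = LLMProviderFactory()
+for provider_name in ("watsonx", "openai", "anthropic"):
+    # Bind the current name as a default argument to avoid late-binding in the lambda.
+    factory.register(provider_name, lambda n=provider_name: EchoProvider(n))
+
+answer = factory.create("OpenAI").generate("hello")
+```
+
+Callers depend only on the `LLMProvider` protocol, so adding a provider is a single `register`
+call rather than a change to every call site.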
+
+## Technology Stack
+
+### Backend
+
+- **Framework**: FastAPI (Python 3.12+)
+- **Database**: PostgreSQL (SQLAlchemy ORM)
+- **Vector DB**: Milvus (primary), Pinecone, Weaviate, Elasticsearch, ChromaDB
+- **Object Storage**: MinIO
+- **Document Processing**: IBM Docling, PyMuPDF, python-docx, openpyxl
+
+### Frontend
+
+- **Framework**: React 18 with TypeScript
+- **Styling**: Tailwind CSS + Carbon Design System
+- **HTTP Client**: Axios
+- **State Management**: React Context API
+
+### Infrastructure
+
+- **Containerization**: Docker + Docker Compose
+- **CI/CD**: GitHub Actions
+- **Container Registry**: GitHub Container Registry (GHCR)
+- **Authentication**: SPIFFE/SPIRE (agents), OIDC (users)
+
+### LLM Providers
+
+- IBM watsonx
+- OpenAI (GPT models)
+- Anthropic (Claude)
+
+### Audio Providers
+
+- ElevenLabs
+- OpenAI TTS
+- Ollama (local)