feat: Expose RAG Modulo as MCP Server #698

@manavgup

Summary

Implement RAG Modulo as an MCP (Model Context Protocol) server to expose RAG capabilities to external AI tools like Claude Desktop, workflow automation systems, and other MCP clients.

Background

This implements the architecture defined in docs/architecture/rag-modulo-mcp-server-architecture.md. While PR #671 implements RAG Modulo as an MCP client (consuming external tools), this issue implements RAG Modulo as an MCP server (exposing its capabilities).

Exposed MCP Tools

| Tool | Description | Parameters |
|------|-------------|------------|
| rag_search | Search documents in a collection | collection_id, query, top_k, use_cot |
| rag_ingest | Add documents to a collection | collection_id, documents |
| rag_list_collections | List accessible collections | include_stats |
| rag_generate_podcast | Generate podcast from collection | collection_id, topic, duration_minutes |
| rag_smart_questions | Get suggested follow-up questions | collection_id, context |
| rag_get_document | Retrieve document by ID | document_id |
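
To make the parameter list concrete, here is a minimal sketch of how the rag_search tool could be described to MCP clients. MCP tool descriptors carry a name, a description, and a JSON Schema under inputSchema. The parameter types, the defaults (top_k of 5, use_cot off), the choice of required fields, and the apply_defaults helper are assumptions for illustration, not decisions made in this issue.

```python
from typing import Any

# Hypothetical descriptor for rag_search. Types, defaults, and the
# required list are assumptions; only the parameter names come from the issue.
RAG_SEARCH_TOOL: dict[str, Any] = {
    "name": "rag_search",
    "description": "Search documents in a collection",
    "inputSchema": {
        "type": "object",
        "properties": {
            "collection_id": {"type": "string"},
            "query": {"type": "string"},
            "top_k": {"type": "integer", "minimum": 1, "default": 5},
            "use_cot": {"type": "boolean", "default": False},
        },
        "required": ["collection_id", "query"],
    },
}


def apply_defaults(tool: dict[str, Any], arguments: dict[str, Any]) -> dict[str, Any]:
    """Reject missing or unknown arguments, then fill in schema defaults."""
    schema = tool["inputSchema"]
    missing = [key for key in schema["required"] if key not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    unknown = [key for key in arguments if key not in schema["properties"]]
    if unknown:
        raise ValueError(f"unknown arguments: {unknown}")
    merged = {
        name: spec["default"]
        for name, spec in schema["properties"].items()
        if "default" in spec
    }
    merged.update(arguments)  # caller-supplied values win over defaults
    return merged
```

This helper only checks argument presence and names. Full type validation would need a JSON Schema validator, which is left out here.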

Exposed MCP Resources

| Resource URI | Description |
|--------------|-------------|
| rag://collection/{id}/documents | Document metadata for a collection |
| rag://collection/{id}/stats | Collection statistics |
| rag://search/{query}/results | Cached search results |
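
The resource handler has to map an incoming URI back to one of these templates. The sketch below shows one way to do that with regular expressions. The kind labels, the parse_resource_uri name, and the assumption that {query} arrives percent-encoded are all illustrative and not specified in this issue.

```python
import re
from urllib.parse import unquote

# One pattern per resource template listed above. The keys are
# hypothetical labels used only inside this sketch.
_RESOURCE_PATTERNS = {
    "collection_documents": re.compile(r"^rag://collection/([^/]+)/documents$"),
    "collection_stats": re.compile(r"^rag://collection/([^/]+)/stats$"),
    "search_results": re.compile(r"^rag://search/([^/]+)/results$"),
}


def parse_resource_uri(uri: str) -> tuple[str, str]:
    """Return (kind, parameter) for a known rag:// URI, else raise ValueError."""
    for kind, pattern in _RESOURCE_PATTERNS.items():
        match = pattern.match(uri)
        if match:
            # Assumes the variable segment is percent-encoded by the client.
            return kind, unquote(match.group(1))
    raise ValueError(f"unsupported resource URI: {uri}")
```

For example, parse_resource_uri("rag://collection/abc/stats") returns ("collection_stats", "abc").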

Implementation Tasks

MCP Server Core

  • Create mcp_server/ directory structure
  • Implement server.py - MCP server setup, transport handling
  • Implement tools.py - Tool definitions and handlers
  • Implement resources.py - Resource definitions and handlers
  • Implement auth.py - SPIFFE/Bearer token validation

Tool Implementations

  • rag_search tool calling SearchService
  • rag_ingest tool calling FileManagementService
  • rag_list_collections tool calling CollectionService
  • rag_generate_podcast tool calling PodcastService
  • rag_smart_questions tool calling SearchService
  • rag_get_document tool calling DocumentService
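
The tool-to-service mapping above could be wired through a small registry, as in the sketch below. ToolRegistry, StubSearchService, and its search method are hypothetical stand-ins; the real SearchService interface is not described in this issue.

```python
from typing import Any, Callable

ToolHandler = Callable[[dict[str, Any]], Any]


class ToolRegistry:
    """Maps MCP tool names to handler functions."""

    def __init__(self) -> None:
        self._handlers: dict[str, ToolHandler] = {}

    def register(self, name: str) -> Callable[[ToolHandler], ToolHandler]:
        def decorator(handler: ToolHandler) -> ToolHandler:
            self._handlers[name] = handler
            return handler
        return decorator

    def call(self, name: str, arguments: dict[str, Any]) -> Any:
        if name not in self._handlers:
            raise KeyError(f"unknown tool: {name}")
        return self._handlers[name](arguments)


class StubSearchService:
    """Stand-in for SearchService; the method name and signature are assumed."""

    def search(self, collection_id: str, query: str, top_k: int) -> list[str]:
        return [f"{collection_id}:{query}:{i}" for i in range(top_k)]


registry = ToolRegistry()
search_service = StubSearchService()


@registry.register("rag_search")
def rag_search(arguments: dict[str, Any]) -> list[str]:
    # The handler only translates MCP arguments into a service call.
    return search_service.search(
        arguments["collection_id"],
        arguments["query"],
        arguments.get("top_k", 5),
    )
```

Keeping handlers this thin means the existing services stay the single place where search and ingestion logic lives.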

Authentication

  • SPIFFE JWT-SVID validation (agent-to-agent)
  • Bearer token validation (user-delegated access)
  • API key validation (simple integration)
  • Capability-based access control
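
Whichever credential type authenticates the caller, capability-based access control reduces to checking the caller's granted capabilities against what a tool requires. The sketch below covers only that final check. The Principal type, the capability names, and the tool-to-capability mapping are assumptions, and verification of the JWT-SVID, bearer token, or API key itself is not shown.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Principal:
    """Authenticated caller; produced by whichever validator accepted the credential."""
    subject: str
    auth_method: str  # "spiffe", "bearer", or "api_key"
    capabilities: frozenset[str] = field(default_factory=frozenset)


# Hypothetical capability names; the issue does not define them.
REQUIRED_CAPABILITY = {
    "rag_search": "search:read",
    "rag_ingest": "documents:write",
    "rag_list_collections": "collections:read",
    "rag_generate_podcast": "podcast:generate",
    "rag_smart_questions": "search:read",
    "rag_get_document": "documents:read",
}


def authorize(principal: Principal, tool_name: str) -> None:
    """Raise PermissionError unless the principal may call the tool."""
    required = REQUIRED_CAPABILITY.get(tool_name)
    if required is None:
        # Deny by default for tools without a declared capability.
        raise PermissionError(f"no capability defined for tool: {tool_name}")
    if required not in principal.capabilities:
        raise PermissionError(f"{principal.subject} lacks capability {required}")
```

Denying tools that have no declared capability keeps a newly added tool closed until someone decides who may call it.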

Transport Support

  • stdio transport (for Claude Desktop)
  • SSE transport (for web clients)
  • HTTP transport (for API clients)

Configuration

  • MCP server configuration in settings
  • Tool enable/disable per deployment
  • Rate limiting configuration
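
As a rough illustration of the configuration items above, the sketch below pairs a settings object with a token-bucket rate limiter. McpServerSettings, its field names, the default values, and the choice of a token bucket are all assumptions; the issue does not prescribe a rate limiting algorithm.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class McpServerSettings:
    """Hypothetical settings; field names and defaults are illustrative."""
    enabled_tools: set[str] = field(
        default_factory=lambda: {"rag_search", "rag_list_collections"}
    )
    rate_limit_per_minute: int = 60

    def is_tool_enabled(self, name: str) -> bool:
        return name in self.enabled_tools


class TokenBucket:
    """Allows bursts up to the per-minute limit, then refills continuously."""

    def __init__(self, rate_per_minute: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.capacity = float(rate_per_minute)
        self.refill_per_second = rate_per_minute / 60.0
        self.tokens = self.capacity
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In a real deployment there would likely be one bucket per client identity rather than a single global one.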

File Structure

backend/rag_solution/
├── mcp_server/
│   ├── __init__.py
│   ├── server.py          # MCP server setup
│   ├── tools.py           # Tool definitions
│   ├── resources.py       # Resource definitions
│   └── auth.py            # Authentication
├── schemas/
│   └── mcp_schema.py      # MCP request/response schemas
└── router/
    └── mcp_router.py      # REST endpoints for MCP management

Dependencies

Use Cases

  1. Claude Desktop Integration: Users can search RAG Modulo collections directly from Claude Desktop
  2. Workflow Automation: n8n/Zapier can ingest documents and trigger searches
  3. Agent Orchestration: External agents can query RAG Modulo as a knowledge source
  4. Cross-System Integration: Other MCP-enabled systems can access RAG capabilities

Acceptance Criteria

  • Claude Desktop can connect and search collections
  • External workflow tools can ingest documents via MCP
  • SPIFFE authentication works for agent-to-agent calls
  • Bearer token works for user-delegated access
  • Tools properly call underlying services
  • Resources return correct data
  • Unit tests with 80%+ coverage
  • Integration test with Claude Desktop

References

Labels: agentic (Agentic AI features), backend (Backend/API related), enhancement (New feature or request), priority:medium (Medium priority - nice to have)
