Epic: RAG Modulo Evolution - Naive → Advanced → Modular RAG Architecture #256

@manavgup

Epic: RAG Modulo Evolution - Naive → Advanced → Modular RAG Architecture

Executive Summary

This epic tracks the evolution of RAG Modulo from its current state (70% Advanced RAG, 30% Modular RAG) to a fully featured Modular RAG architecture. The roadmap is divided into three phases over 18-24 weeks, prioritizing high-impact improvements to retrieval quality, answer accuracy, and system intelligence.

Current State Assessment

✅ Implemented Features

Indexing (Modular-level)

  • Document processors for PDF, DOCX, XLSX, TXT (backend/rag_solution/data_ingestion/)
  • Multiple chunking strategies: simple, semantic, token-based (backend/rag_solution/data_ingestion/chunking.py)
  • Batch embedding generation (backend/rag_solution/data_ingestion/ingestion.py)
  • Multiple vector DB support: Milvus, Elasticsearch, Pinecone, Weaviate, ChromaDB

Pre-Retrieval (Advanced-level - Partial)

  • Query rewriting with HyDE (backend/rag_solution/query_rewriting/query_rewriter.py)
  • SimpleQueryRewriter for query expansion
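
For readers unfamiliar with HyDE, the idea is to ask an LLM for a hypothetical answer passage and embed that passage instead of the raw query. The snippet below is a minimal sketch of that idea only; it is not the code in query_rewriter.py. The function name `hyde_query_vector`, the prompt wording, and the injected `generate` and `embed` callables are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def hyde_query_vector(
    query: str,
    generate: Callable[[str], str],
    embed: Callable[[str], Sequence[float]],
) -> List[float]:
    """Return an embedding of a hypothetical answer to `query`.

    `generate` stands in for any LLM call and `embed` for any embedding
    model; both are hypothetical hooks, not RAG Modulo APIs.
    """
    prompt = (
        "Write a short passage that answers the question.\n"
        f"Question: {query}\n"
        "Passage:"
    )
    hypothetical_passage = generate(prompt)
    # The vector search then uses this vector in place of embed(query).
    return list(embed(hypothetical_passage))


# Usage with stub callables (no network or model required).
def fake_generate(prompt: str) -> str:
    return "Paris is the capital of France."


def fake_embed(text: str) -> List[float]:
    return [float(len(text)), float(text.count(" "))]


vector = hyde_query_vector("What is the capital of France?", fake_generate, fake_embed)
```

Injecting the two callables keeps the rewriter independent of any particular provider, which fits the multi-provider setup listed in this section.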

Retrieval (Hybrid approach)

  • Vector, Keyword, Hybrid retriever types (backend/rag_solution/retrieval/retriever.py)
  • Pipeline configuration schema with retriever selection (backend/rag_solution/schemas/pipeline_schema.py)
Generation (Modular-level - Partial)

  • Multiple LLM providers: WatsonX, OpenAI, Anthropic (backend/rag_solution/generation/providers/)

❌ Missing Critical Components

Post-Retrieval

  • No reranking (commented-out code exists in backend/rag_solution/evaluation/metrics.py)
  • No chunk compression/selection
  • Basic context management without filtering
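
To make the post-retrieval gap concrete, here is a rough sketch of what a reranking step does: rescore each retrieved chunk against the query and keep only the best few. The `Chunk` type, the `rerank` signature, and the `overlap_score` scorer are hypothetical. The scorer is a deliberately simple lexical stand-in so the example runs without a model; a cross-encoder or a hosted rerank API would take its place in practice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Chunk:
    text: str
    retriever_score: float = 0.0  # score assigned by the first-stage retriever


def overlap_score(query: str, text: str) -> float:
    """Toy relevance score: fraction of query terms that appear in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    if not query_terms:
        return 0.0
    return len(query_terms & text_terms) / len(query_terms)


def rerank(
    query: str,
    chunks: List[Chunk],
    scorer: Callable[[str, str], float] = overlap_score,
    top_k: int = 3,
) -> List[Chunk]:
    """Rescore chunks against the query and return the top_k best ones."""
    scored = [(scorer(query, chunk.text), chunk) for chunk in chunks]
    # sort() is stable, so ties keep the first-stage retriever order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


candidates = [
    Chunk("milvus stores vectors"),
    Chunk("reranking improves retrieval precision"),
    Chunk("unrelated text about cats"),
]
best = rerank("retrieval precision", candidates, top_k=1)
```

Because `scorer` is a plain callable, swapping the toy scorer for a model-backed one does not change the calling code.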

Generation

  • No answer verification
  • No hallucination detection
  • No external knowledge integration

Orchestration

  • No semantic routing for pipeline selection
  • No hard/soft prompt distinction
  • No adaptive retrieval strategies
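
As an illustration of semantic routing, the sketch below picks a pipeline by comparing the query with a few example utterances per route. Everything in it is an assumption: the route names, the example utterances, the threshold, and the bag-of-words `embed` function, which stands in for a real embedding model so the example stays runnable.

```python
import math
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy embedding: term counts. A real router would use a dense model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical routes, each described by a few example utterances.
ROUTES: Dict[str, List[str]] = {
    "simple_lookup": ["what is the definition of", "who is the author of"],
    "multi_hop": [
        "compare the revenue of two companies over time",
        "how does x relate to y and why",
    ],
    "no_retrieval": ["hello", "thanks"],
}


def route(
    query: str,
    routes: Dict[str, List[str]] = ROUTES,
    threshold: float = 0.2,
    default: str = "simple_lookup",
) -> str:
    """Return the route whose closest example is most similar to the query."""
    query_vec = embed(query)
    best_name, best_sim = default, 0.0
    for name, examples in routes.items():
        for example in examples:
            sim = cosine(query_vec, embed(example))
            if sim > best_sim:
                best_name, best_sim = name, sim
    # Fall back to the default pipeline when nothing is similar enough.
    return best_name if best_sim >= threshold else default
```

A `no_retrieval` route is one way to cover the "avoid unnecessary retrieval" goal: greetings and similar inputs can skip the vector store entirely.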

Architecture Target

Based on the Modular RAG reference architecture, we aim to implement:

  1. Advanced Query Processing: Multi-query, decomposition, semantic understanding
  2. Intelligent Retrieval: Hybrid search with reranking and compression
  3. Smart Orchestration: Semantic routing, scheduling, adaptive strategies
  4. Verified Generation: Answer verification, hallucination detection, confidence scoring
  5. Knowledge Enhancement: Knowledge graphs, multi-hop reasoning, external knowledge integration

Roadmap Overview

Phase 1: Complete Advanced RAG (4-6 weeks)

Goal: Implement missing Advanced RAG components for immediate quality improvements

  • Post-retrieval reranking (cross-encoder, Cohere Rerank)
  • Chunk compression and selection
  • Enhanced query expansion and decomposition
  • Structural organization (hierarchical chunking)
  • Hybrid retrieval refinement

Expected Impact: 20-30% improvement in retrieval precision, 15-25% improvement in answer quality
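
One common way to refine hybrid retrieval is Reciprocal Rank Fusion (RRF), which merges the vector and keyword rankings using ranks alone and so avoids having to normalize scores. This epic does not specify a fusion method; RRF is shown here only as a candidate. The function name and the constant `k=60`, a commonly used default, are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into one ranking.

    Each document receives sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


vector_hits = ["a", "b", "c"]   # ids ranked by the vector retriever
keyword_hits = ["b", "d", "a"]  # ids ranked by the keyword retriever
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
# fused == ["b", "a", "d", "c"]
```

Documents found by both retrievers ("a" and "b") rise above those found by only one, which is the behavior hybrid search usually aims for.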

Phase 2: Early Modular RAG (6-8 weeks)

Goal: Build foundational Modular RAG modules

  • Semantic routing and orchestration
  • Query scheduling (hard vs soft prompts)
  • Answer verification and hallucination detection
  • External knowledge integration
  • Knowledge graph foundation

Expected Impact: Intelligent pipeline selection (>85% accuracy), reduced hallucinations (>90% accuracy)
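
As a rough picture of what answer verification could look like, the sketch below flags answer sentences that have little lexical support in the retrieved context. This is a crude heuristic chosen so the example runs without a model, not the planned design; an entailment model or LLM-based checker would likely replace the overlap test. The function names and the `min_support` threshold are hypothetical.

```python
import re
from typing import List

_TOKEN = re.compile(r"[a-z0-9]+")


def sentence_support(sentence: str, context: str) -> float:
    """Fraction of the sentence's tokens that also occur in the context."""
    sentence_tokens = set(_TOKEN.findall(sentence.lower()))
    context_tokens = set(_TOKEN.findall(context.lower()))
    if not sentence_tokens:
        return 1.0  # nothing to verify
    return len(sentence_tokens & context_tokens) / len(sentence_tokens)


def unsupported_sentences(answer: str, context: str, min_support: float = 0.6) -> List[str]:
    """Return answer sentences whose support falls below min_support."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if sentence_support(s, context) < min_support]


context = "Milvus is a vector database. It supports hybrid search."
answer = "Milvus is a vector database. It was founded on Mars in 1850."
flagged = unsupported_sentences(answer, context)
# flagged == ["It was founded on Mars in 1850."]
```

The share of flagged sentences could feed a confidence score, or trigger a retry with a different retrieval strategy.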

Phase 3: Full Modular RAG (8-10 weeks)

Goal: Complete Modular RAG implementation with advanced capabilities

  • Retriever fine-tuning (LM-supervised)
  • Advanced indexing strategies (multi-index, chunk optimization)
  • Full orchestration with dynamic pipeline assembly
  • Continuous learning and auto-optimization

Expected Impact: Production-ready Modular RAG with adaptive intelligence, continuous improvement

Success Metrics

Quantitative

  • Retrieval Precision@10: Increase from baseline to >0.85
  • Answer Accuracy: >90% factually correct responses
  • Hallucination Rate: <5% hallucinated facts
  • Routing Accuracy: >85% correct pipeline selection
  • Latency: <2s for simple queries, <5s for complex CoT queries
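
For reference, Precision@k is the share of the top k retrieved items that are relevant. A minimal sketch follows, assuming document ids are strings and relevance judgments come as a set; the function names are illustrative and not part of the existing evaluation code.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved ids that are relevant.

    Divides by k even when fewer than k items were retrieved, so short
    result lists are penalized.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved[:k] if doc_id in relevant)
    return hits / k


def mean_precision_at_k(runs: Iterable[Tuple[List[str], Set[str]]], k: int = 10) -> float:
    """Average Precision@k over (retrieved, relevant) pairs, one per query."""
    values = [precision_at_k(retrieved, relevant, k) for retrieved, relevant in runs]
    return sum(values) / len(values) if values else 0.0


score = precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4)
# score == 0.5
```

Tracking the mean over a fixed benchmark query set gives the baseline that the >0.85 target is measured against.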

Qualitative

  • Improved handling of complex multi-part questions
  • Better table and structured data retrieval
  • Reduced false positives in search results
  • Smarter resource allocation (avoid unnecessary retrieval)

Implementation Phases

This epic is broken down into the following child issues:

  • #TBD - Phase 1: Complete Advanced RAG (4-6 weeks)
  • #TBD - Phase 2: Early Modular RAG (6-8 weeks)
  • #TBD - Phase 3: Full Modular RAG (8-10 weeks)

Dependencies

Technical Approach

Design Principles

  1. Incremental: Each phase delivers standalone value
  2. Backward Compatible: Existing functionality remains unchanged
  3. Configurable: All new features behind feature flags
  4. Tested: Comprehensive unit, integration, and performance tests
  5. Documented: Clear documentation for each component

Architecture Pattern

  • Service-based: Follow existing service architecture (backend/rag_solution/services/)
  • Dependency Injection: Use settings and dependency injection patterns
  • Abstract Interfaces: Define base classes for extensibility
  • Factory Pattern: Use factories for component instantiation
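
The pattern above, combined with the feature-flag principle, can be sketched as follows: an abstract base class, two interchangeable implementations, and a factory that consults settings. All names here (`Settings`, `BaseReranker`, `RerankerFactory`, the flag names) are hypothetical and do not mirror existing classes under backend/rag_solution/services/.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Type


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings object; the field names are illustrative only."""
    enable_reranking: bool = False          # feature flag, off by default
    reranker_type: str = "shortest_first"


class BaseReranker(ABC):
    """Abstract interface that every reranker implements."""

    @abstractmethod
    def rerank(self, query: str, texts: List[str]) -> List[str]:
        ...


class NoopReranker(BaseReranker):
    """Keeps the retriever order; used when the feature flag is off."""

    def rerank(self, query: str, texts: List[str]) -> List[str]:
        return list(texts)


class ShortestFirstReranker(BaseReranker):
    """Toy implementation that orders texts by length."""

    def rerank(self, query: str, texts: List[str]) -> List[str]:
        return sorted(texts, key=len)


class RerankerFactory:
    _registry: Dict[str, Type[BaseReranker]] = {
        "noop": NoopReranker,
        "shortest_first": ShortestFirstReranker,
    }

    @classmethod
    def create(cls, settings: Settings) -> BaseReranker:
        # With the flag off, existing behavior is preserved unchanged.
        if not settings.enable_reranking:
            return NoopReranker()
        if settings.reranker_type not in cls._registry:
            raise ValueError(f"Unknown reranker type: {settings.reranker_type}")
        return cls._registry[settings.reranker_type]()


reranker = RerankerFactory.create(Settings(enable_reranking=True))
ordered = reranker.rerank("any query", ["aaa", "a", "bb"])
```

Defaulting the flag to off supports the backward-compatibility principle: a deployment that never sets the flag keeps its current retrieval order.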

Risk Mitigation

Risk                     Impact  Mitigation
Performance degradation  High    Benchmark each phase, feature flags for rollback
Complexity creep         Medium  Strict scope control, MVP approach per phase
Breaking changes         High    Comprehensive testing, staged rollout
LLM cost increase        Medium  Token budgets, caching, smart routing

Resources Required

  • Development: 18-24 weeks total (can be parallelized with multiple developers)
  • Testing: Benchmark datasets, evaluation framework
  • Infrastructure: No additional infrastructure for Phases 1-2, potential GPU for Phase 3 fine-tuning

Timeline

Weeks 1-6:   Phase 1 - Complete Advanced RAG
Weeks 7-14:  Phase 2 - Early Modular RAG  
Weeks 15-24: Phase 3 - Full Modular RAG

References

  • Modular RAG Architecture (reference diagram provided)
  • RAG Survey Paper
  • LangChain Retrieval Strategies
  • LlamaIndex Advanced Retrieval Patterns

Related Issues


Note: This is a living epic that will be updated as phases progress. Each phase issue will contain detailed implementation specifications, file changes, and testing criteria.

Child Issues

Total Timeline: 18-24 weeks to complete all phases
Current Priority: Phase 1 (#257)
