manavgup/rag_modulo

RAG Modulo

License: MIT Python 3.12+ Docker FastAPI React

A production-ready, modular Retrieval-Augmented Generation (RAG) platform with Chain of Thought reasoning, multi-LLM support, and enterprise-grade features

🚀 Quick Start • 📚 Documentation • 🛠️ Development • ✨ Features • 🤝 Contributing


🎯 What is RAG Modulo?

RAG Modulo is a production-ready Retrieval-Augmented Generation platform that provides document processing, search, and AI-powered question answering with Chain of Thought (CoT) reasoning. It supports multiple vector databases (Milvus, Elasticsearch, Pinecone, Weaviate, ChromaDB), multiple LLM providers (WatsonX, OpenAI, Anthropic), and a range of document formats, with enhanced processing of complex formats (PDF, DOCX, XLSX) through IBM Docling integration.

✨ Key Features

| 🧠 AI-Powered | 🔍 Advanced Search | 💬 Interactive UI | 🚀 Production Ready |
|---|---|---|---|
| Chain of Thought reasoning | Vector similarity search | Modern React interface | Docker + GHCR images |
| Automatic pipeline resolution | Hybrid search strategies | Real-time document upload | Multi-stage CI/CD |
| Multi-LLM provider support | Intelligent source attribution | Podcast generation | Security scanning |
| Token tracking & monitoring | Auto-generated suggestions | Voice preview features | 947 automated tests |

🎨 Frontend Features

  • Modern UI: React 18 with Tailwind CSS for responsive, accessible design
  • Enhanced Search: Interactive chat interface with Chain of Thought reasoning visualization
  • Document Management: Real-time file upload with drag-and-drop support
  • Smart Display: Document source attribution with chunk-level page references
  • Podcast Generation: AI-powered podcast creation with voice preview
  • Question Suggestions: Intelligent query recommendations based on collection content

🎉 Current Status: Production Ready

| Component | Status | Progress |
|---|---|---|
| 🏗️ Infrastructure | ✅ Production Ready | Docker + GHCR + Cloud Deployment |
| 🧪 Testing | ✅ Comprehensive | 947 tests (atomic, unit, integration, API) |
| 🚀 Core Services | ✅ Fully Operational | 26+ services with DI pattern |
| 📚 Documentation | ✅ Extensive | API docs, guides, deployment |
| 🔧 Development | ✅ Optimized | Containerless local dev workflow |
| 🔒 Security | ✅ Hardened | Multi-layer scanning (Trivy, Bandit, Gitleaks) |

🎉 Recent Major Improvements

| Feature | Description | Benefit |
|---|---|---|
| 🧠 Chain of Thought | Automatic question decomposition with step-by-step reasoning | 40%+ better answer quality on complex queries |
| ⚡ Auto Pipeline Resolution | Zero-config search - backend handles pipeline selection | Simplified API, reduced client complexity |
| 🔒 Security Hardening | Multi-layer scanning (Trivy, Bandit, Gitleaks, Semgrep) | Production-grade security posture |
| 🚀 Containerless Dev | Local development without containers | 10x faster iteration, instant hot-reload |
| 📄 IBM Docling | Enhanced document processing for complex formats | Better PDF/DOCX/XLSX handling |
| 🎙️ Podcast Generation | AI-powered podcast creation with voice preview | Interactive content from documents |
| 💡 Smart Suggestions | Auto-generated relevant questions | Improved user experience and discovery |
| 📦 GHCR Images | Pre-built production images | Faster deployments, consistent environments |

🚀 Quick Start

Prerequisites

| Requirement | Version | Purpose |
|---|---|---|
| Python | 3.12+ | Backend development |
| Poetry | Latest | Python dependency management |
| Node.js | 18+ | Frontend development |
| Docker | Latest | Infrastructure services |
| Docker Compose | V2 | Orchestration |

Option 1: Local Development (⚡ Fastest - Recommended)

Best for: Daily development, feature work, rapid iteration

# 1. Clone repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo

# 2. Set up environment
cp env.example .env
# Edit .env with your API keys (WatsonX, OpenAI, etc.)

# 3. Install dependencies
make local-dev-setup  # Installs both backend (Poetry) and frontend (npm)

# 4. Start infrastructure (Postgres, Milvus, MinIO, MLFlow)
make local-dev-infra

# 5. Start backend (Terminal 1)
make local-dev-backend

# 6. Start frontend (Terminal 2)
make local-dev-frontend

# OR start everything in background
make local-dev-all

Access Points:

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000

Benefits:

  • ⚡ Instant reload - Python/React changes reflected immediately (no container rebuilds)
  • 🐛 Native debugging - Use PyCharm, VS Code debugger with breakpoints
  • 📦 Local caching - Poetry/npm caches work natively for faster dependency installs
  • 🔥 Fastest iteration - Pre-commit hooks optimized (fast on commit, comprehensive on push)

When to use:

  • ✅ Daily development work
  • ✅ Feature development and bug fixes
  • ✅ Rapid iteration and testing
  • ✅ Debugging with breakpoints

Option 2: Production Mode (🐳 Docker)

Best for: Production-like testing, deployment validation

# Clone repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo

# Set up environment
cp env.example .env
# Edit .env with your API keys

# Start with pre-built images from GHCR
make run-ghcr

# OR build and run locally
make build-all-local
docker compose up -d

When to use:

  • ✅ Testing production configurations
  • ✅ Validating Docker builds
  • ✅ Deployment rehearsal
  • ✅ Performance benchmarking

Option 3: GitHub Codespaces (☁️ Cloud)

Best for: Quick experimentation, onboarding, cloud development

  1. Go to repository → "Code" → "Codespaces"
  2. Click "Create codespace" on your branch
  3. Start coding in browser-based VS Code
  4. Run: make venv && make run-infra

When to use:

  • ✅ No local setup required
  • ✅ Consistent development environment
  • ✅ Work from any device
  • ✅ Team onboarding

🏗️ Architecture Overview

RAG Modulo follows a modern, service-based architecture with clear separation of concerns:

graph TB
    subgraph "Frontend Layer"
        UI[React Web UI]
        CLI[Command Line Interface]
    end

    subgraph "API Layer"
        API[FastAPI Backend]
        AUTH[OIDC Authentication]
    end

    subgraph "Service Layer"
        SEARCH[Search Service]
        CONV[Conversation Service]
        TOKEN[Token Tracking]
        COT[Chain of Thought]
    end

    subgraph "Data Layer"
        VDB[(Vector Database)]
        PG[(PostgreSQL)]
        MINIO[(MinIO Storage)]
    end

    subgraph "External Services"
        LLM[LLM Providers]
        EMB[Embedding Models]
    end

    UI --> API
    CLI --> API
    API --> SEARCH
    API --> CONV
    API --> TOKEN
    API --> COT
    SEARCH --> VDB
    SEARCH --> PG
    CONV --> LLM
    TOKEN --> PG
    COT --> LLM
    API --> MINIO

🛠️ Development Workflow

🎯 Recommended Daily Workflow

Philosophy: Develop locally without containers for maximum speed, deploy with containers for production.

# Morning setup (once per day)
cd rag_modulo
source backend/.venv/bin/activate  # Activate Python environment
make run-infra                      # Start infrastructure (Postgres, Milvus, etc.)

# Terminal 1: Backend with auto-reload
cd backend
uvicorn main:app --reload --port 8000

# Terminal 2: Frontend with HMR
cd frontend
npm run dev

# Development cycle
# 1. Make code changes
# 2. See changes instantly (auto-reload)
# 3. Test manually via http://localhost:3000
# 4. Run quick checks before commit
make quick-check

# End of day cleanup
make local-dev-stop  # Stop infrastructure containers
deactivate           # Deactivate Python venv

🔧 Essential Development Commands

| Command | Description | When to Use |
|---|---|---|
| make local-dev-setup | Install all dependencies (backend + frontend) | First time setup |
| make local-dev-infra | Start infrastructure containers only | Daily (Postgres, Milvus, MinIO, MLFlow) |
| make local-dev-backend | Start backend with hot-reload | Development (Terminal 1) |
| make local-dev-frontend | Start frontend with HMR | Development (Terminal 2) |
| make local-dev-all | Start everything in background | Quick full stack startup |
| make quick-check | Fast lint + format check | Pre-commit validation |
| make test-unit-fast | Run unit tests locally | Rapid testing without containers |
| make local-dev-stop | Stop all services | Clean shutdown |

🧪 Testing & Quality

# Fast local testing (no containers)
source backend/.venv/bin/activate
cd backend
pytest tests/unit/ -v              # Unit tests only
pytest tests/integration/ -v       # Integration tests

# Or use Makefile targets
make test-unit-fast                # Fast unit tests
make test-integration              # Integration tests (needs infra)

# Quality checks
make quick-check                   # Fast: format + lint
make lint                          # All linters
make format                        # Auto-fix formatting
make security-check                # Security scans
make coverage                      # Test coverage report

🐳 Container Development (When Needed)

Only for production-like testing or deployment validation:

# Build production images
make build-backend
make build-frontend

# Start production environment
make prod-start

# Or use pre-built GHCR images
make run-ghcr

📊 Features & Capabilities

🧠 Advanced AI Features

  • Chain of Thought Reasoning: Automatic question decomposition with step-by-step reasoning, iterative context building, and transparent reasoning visualization
  • Automatic Pipeline Resolution: Zero-config search experience - backend automatically selects and creates pipelines based on user context
  • Token Tracking & Monitoring: Real-time usage tracking across all LLM interactions with detailed breakdowns
  • Multi-LLM Support: Seamless switching between WatsonX, OpenAI, and Anthropic with provider-specific optimizations
  • IBM Docling Integration: Enhanced document processing for complex formats (PDF, DOCX, XLSX)
  • Question Suggestions: AI-generated relevant questions based on document collection content
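
As a rough illustration of the decompose-then-answer loop described under Chain of Thought Reasoning, here is a minimal sketch. It is not RAG Modulo's implementation: `decompose`, `answer_with_cot`, the `retrieve` and `llm` callables, and the naive split on " and " are all assumptions made for the example. A real system would presumably ask the LLM itself to decompose the question.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ReasoningStep:
    sub_question: str
    context: List[str]
    answer: str


@dataclass
class CotResult:
    steps: List[ReasoningStep] = field(default_factory=list)
    final_answer: str = ""


def decompose(question: str) -> List[str]:
    """Hypothetical decomposition: split a compound question on ' and '."""
    parts = [part.strip(" ?") for part in question.split(" and ")]
    return [part + "?" for part in parts if part]


def answer_with_cot(
    question: str,
    retrieve: Callable[[str], List[str]],
    llm: Callable[[str], str],
) -> CotResult:
    result = CotResult()
    accumulated: List[str] = []  # context carried forward between steps
    for sub_question in decompose(question):
        context = retrieve(sub_question)
        accumulated.extend(context)
        prompt = "Context:\n" + "\n".join(accumulated) + "\nQuestion: " + sub_question
        result.steps.append(ReasoningStep(sub_question, context, llm(prompt)))
    # The final synthesis call sees every intermediate answer.
    summary = "\n".join(f"{s.sub_question} -> {s.answer}" for s in result.steps)
    result.final_answer = llm("Combine these findings:\n" + summary + "\nQuestion: " + question)
    return result
```

Keeping each `ReasoningStep` (sub-question, retrieved context, answer) is what would let a UI show the reasoning and its sources step by step.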

🔍 Search & Retrieval

  • Vector Databases: Pluggable support for Milvus (default), Elasticsearch, Pinecone, Weaviate, ChromaDB via common interface
  • Hybrid Search: Combines semantic vector similarity with keyword search strategies
  • Source Attribution: Granular document source tracking with chunk-level page references across reasoning steps
  • Advanced Chunking: Hierarchical chunking strategies with configurable size and overlap
  • Conversation History: Context-aware search with conversation memory for multi-turn interactions
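
To make "configurable size and overlap" concrete, here is a small sketch of a sliding-window chunker. It only illustrates those two parameters on plain characters. The hierarchical part of the strategy is not shown, and `chunk_text` with its defaults is a hypothetical helper, not a function from the codebase.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

For example, `chunk_text("abcdefghij", chunk_size=4, overlap=2)` yields `["abcd", "cdef", "efgh", "ghij"]`: each chunk repeats the last two characters of the previous one, so a sentence cut at a boundary still appears whole in one of the neighbours.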

🏗️ Architecture & Scalability

  • Service-Based Design: 26+ services with clean separation of concerns and dependency injection pattern
  • Repository Pattern: Data access abstraction layer for improved testability and maintainability
  • Asynchronous Operations: Async/await throughout for efficient concurrent request handling
  • Production Deployment: Docker + GHCR images, multi-stage builds, cloud-ready (AWS, Azure, GCP, IBM Cloud)
  • Modular Design: Pluggable components for vector DBs, LLM providers, embedding models
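
The sketch below shows one way pluggable vector stores and dependency injection can fit together. Every name here (`VectorStore`, `InMemoryVectorStore`, `SearchService.top_document`) is an assumption made for illustration; the project's actual interface may look quite different.

```python
import math
from typing import Dict, List, Optional, Protocol, Sequence, Tuple


class VectorStore(Protocol):
    """Hypothetical common interface that each backend would implement."""

    def add(self, doc_id: str, vector: Sequence[float]) -> None: ...

    def search(self, query: Sequence[float], top_k: int) -> List[Tuple[str, float]]: ...


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Toy backend standing in for Milvus, Elasticsearch, and so on."""

    def __init__(self) -> None:
        self._vectors: Dict[str, List[float]] = {}

    def add(self, doc_id: str, vector: Sequence[float]) -> None:
        self._vectors[doc_id] = list(vector)

    def search(self, query: Sequence[float], top_k: int) -> List[Tuple[str, float]]:
        scored = [(doc_id, _cosine(query, vec)) for doc_id, vec in self._vectors.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:top_k]


class SearchService:
    """Receives its store through the constructor (dependency injection)."""

    def __init__(self, store: VectorStore) -> None:
        self._store = store

    def top_document(self, query_vector: Sequence[float]) -> Optional[str]:
        hits = self._store.search(query_vector, top_k=1)
        return hits[0][0] if hits else None
```

Because `SearchService` depends only on the interface, a test can pass in the in-memory store while production wiring passes a real database client.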

🧪 Testing & Quality Assurance

  • Comprehensive Test Suite: 947 automated tests across all layers (atomic, unit, integration, API, E2E)
  • Multi-Layer Testing Strategy:
    • Atomic tests for schemas and data structures
    • Unit tests for business logic and services
    • Integration tests for service interactions
    • API tests for endpoint validation
  • Security Scanning: Multi-layer security with Trivy (containers), Bandit (Python), Gitleaks (secrets), Semgrep (SAST)
  • Code Quality: Ruff linting, MyPy type checking, Pylint analysis, pre-commit hooks
  • CI/CD Pipeline: Multi-stage GitHub Actions with test isolation, builds, and comprehensive integration testing

📚 Documentation

📖 Complete Documentation

🔧 Configuration & Tools

🛠️ Command-Line Interface (CLI)

RAG Modulo includes a powerful CLI for interacting with the system:

# After installation, use the CLI commands:
rag-cli --help          # Main CLI help
rag-search              # Search operations
rag-admin               # Administrative tasks

# Example: Search a collection
rag-cli search query <collection-id> "your question here"

# Create a collection
rag-cli collection create --name "My Docs"

# Upload documents
rag-cli collection upload <collection-id> path/to/documents/
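
If you want to drive the CLI from a script, a thin wrapper along these lines may help. It assumes only the `rag-cli search query <collection-id> "<question>"` form shown above; the helper names `build_search_command` and `run_search` are hypothetical, and the format of the CLI's output is not assumed.

```python
import subprocess
from typing import List


def build_search_command(collection_id: str, question: str) -> List[str]:
    """Argument list for: rag-cli search query <collection-id> "<question>"."""
    return ["rag-cli", "search", "query", collection_id, question]


def run_search(collection_id: str, question: str) -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    completed = subprocess.run(
        build_search_command(collection_id, question),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout
```

Passing the arguments as a list (rather than one shell string) means a question containing quotes or spaces needs no extra escaping.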

🚀 Deployment & Packaging

Production Deployment

RAG Modulo supports multiple deployment strategies:

1. Docker Compose (Recommended)

# Start production environment (all containers)
make prod-start

# Check status
make prod-status

# View logs
make prod-logs

# Stop production environment
make prod-stop

2. Pre-built Images from GHCR

# Pull and run latest images from GitHub Container Registry
make run-ghcr

Available Images:

  • ghcr.io/manavgup/rag_modulo/backend:latest
  • ghcr.io/manavgup/rag_modulo/frontend:latest

3. Custom Docker Deployment

# Build local images
make build-all

# Start services
make run-app

Cloud Deployment Options

AWS Deployment

  • ECS (Elastic Container Service): Use docker-compose.production.yml
  • EKS (Kubernetes): Deploy with Kubernetes manifests
  • EC2: Docker Compose or standalone containers
  • Lambda: Serverless functions for specific services

Azure Deployment

  • Azure Container Instances: Quick container deployment
  • AKS (Azure Kubernetes Service): Production-grade orchestration
  • Azure Container Apps: Serverless container hosting

Google Cloud Deployment

  • Cloud Run: Fully managed serverless platform
  • GKE (Google Kubernetes Engine): Kubernetes orchestration
  • Compute Engine: VM-based deployment with Docker

IBM Cloud Deployment

  • Code Engine: Serverless container platform
  • IKS (IBM Kubernetes Service): Enterprise Kubernetes
  • Red Hat OpenShift: Advanced container platform

Kubernetes Deployment

# Apply Kubernetes manifests
kubectl apply -f deployment/k8s/

# Or deploy with Helm (if charts exist)
helm install rag-modulo ./charts/rag-modulo

🔄 CI/CD Pipeline

GitHub Actions Workflows

RAG Modulo uses a comprehensive CI/CD pipeline with multiple stages:

1. Code Quality & Testing (.github/workflows/ci.yml)

Triggers: Push to main, Pull Requests

Stages:

  1. Lint and Unit Tests (No infrastructure)

    • Ruff linting (120 char line length)
    • MyPy type checking
    • Unit tests with pytest
    • Fast feedback (~5-10 minutes)
  2. Build Docker Images

    • Backend image build
    • Frontend image build
    • Push to GitHub Container Registry (GHCR)
    • Tagged with: latest, sha-<commit>, branch name
  3. Integration Tests

    • Full stack deployment
    • PostgreSQL, Milvus, MLFlow, MinIO
    • API tests, integration tests
    • End-to-end validation

Status Badges:

[![CI Pipeline](https://github.com/manavgup/rag_modulo/workflows/CI/badge.svg)](https://github.com/manavgup/rag_modulo/actions)

2. Security Scanning (.github/workflows/security.yml)

Triggers: Push to main, Pull Requests, Weekly schedule

Scans:

  • Trivy: Container vulnerability scanning
  • Bandit: Python security linting
  • Gitleaks: Secret detection
  • Safety: Python dependency vulnerabilities
  • Semgrep: SAST code analysis

3. Documentation (.github/workflows/docs.yml)

Triggers: Push to main, Pull Requests to docs/

Actions:

  • Build MkDocs site
  • Deploy to GitHub Pages
  • API documentation generation

Local CI Validation

Test CI pipeline locally before pushing:

# Run same checks as CI
make ci-local

# Validate CI workflows
make validate-ci

# Security checks
make security-check
make scan-secrets

Pre-commit Hooks

Optimized for developer velocity:

On Commit (fast, 5-10 sec):

  • Ruff formatting
  • Trailing whitespace
  • YAML syntax
  • File size limits

On Push (slow, 30-60 sec):

  • MyPy type checking
  • Pylint analysis
  • Security scans
  • Strangler pattern checks

In CI (comprehensive):

  • All checks run regardless
  • Ensures quality gates

Container Registry

GitHub Container Registry (GHCR):

  • Automatic image builds on push
  • Multi-architecture support (amd64, arm64)
  • Image signing and verification
  • Retention policies

Image Tags:

  • latest: Latest main branch build
  • sha-<commit>: Specific commit
  • <branch>: Branch-specific builds
  • v<version>: Release tags
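
The tag scheme above can be expressed as a small function. This is a sketch under stated assumptions, not the project's CI logic: `image_tags` is a hypothetical helper, and replacing characters that are not valid in a Docker tag (such as the `/` in `feature/x`) with `-` is a guess at how branch names would be handled.

```python
import re
from typing import List, Optional


def image_tags(commit_sha: str, branch: str, version: Optional[str] = None) -> List[str]:
    """Compute the tags for one build: sha-<commit>, <branch>, latest, v<version>."""
    tags = [f"sha-{commit_sha}"]
    # Assumption: sanitize branch names, since Docker tags cannot contain '/'.
    tags.append(re.sub(r"[^A-Za-z0-9_.-]", "-", branch))
    if branch == "main":
        tags.append("latest")  # latest tracks the main branch build
    if version:
        tags.append(f"v{version}")  # release tag
    return tags
```

For instance, `image_tags("abc1234", "main", "1.2.0")` returns `["sha-abc1234", "main", "latest", "v1.2.0"]`.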

🧪 Testing

Test Categories

| Category | Count | Description | Command |
|---|---|---|---|
| ⚡ Atomic Tests | 100+ | Schema validation, data structures | pytest -m atomic |
| 🏃 Unit Tests | 83+ | Service logic, business rules | make test-unit-fast |
| 🔗 Integration Tests | 43+ | Service interactions, DB integration | make test-integration |
| 🔌 API Tests | 21+ | Endpoint validation, request/response | pytest -m api |
| 🌐 E2E Tests | 22+ | Full workflow scenarios | pytest -m e2e |
| 📊 Total | 947 | Complete test coverage | make test-all |

Running Tests

# Fast local testing (no containers, recommended for development)
make test-unit-fast

# Specific test categories
make test-atomic       # Schema and data structure tests
make test-integration  # Service integration tests (requires infrastructure)
make test-api          # API endpoint tests

# Full test suite with coverage
make coverage

# Run specific test file
make test testfile=tests/unit/test_search_service.py

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

Development Guidelines

  1. Service Layer Architecture - Follow service-based patterns
  2. Code Quality - Use type hints, comprehensive docstrings, PEP 8
  3. Testing - Write tests for all new features
  4. Documentation - Update docs for any changes

Contribution Process

  1. Fork and Clone the repository
  2. Create Feature Branch from main
  3. Make Changes following our guidelines
  4. Run Tests and ensure they pass
  5. Submit Pull Request with clear description

📈 Roadmap

✅ Phase 1: Foundation (Completed)

  • Service-based architecture with 26+ services
  • Comprehensive test infrastructure (947 tests)
  • Multi-LLM provider support (WatsonX, OpenAI, Anthropic)
  • Vector database abstraction layer
  • CI/CD pipeline with security scanning

✅ Phase 2: Advanced Features (Completed)

  • Chain of Thought (CoT) reasoning system
  • Automatic pipeline resolution
  • Token tracking and monitoring
  • IBM Docling integration
  • Podcast generation with voice preview
  • Question suggestion system
  • Containerless local development workflow

🔄 Phase 3: Production Enhancement (Current)

  • Production deployment with GHCR images
  • Multi-stage Docker builds
  • Security hardening (Trivy, Bandit, Gitleaks, Semgrep)
  • Enhanced monitoring and observability
  • Performance optimization and caching
  • Authentication system improvements (OIDC)

🚀 Phase 4: Enterprise Features (Next)

  • Multi-tenant support
  • Advanced analytics and dashboards
  • Batch processing capabilities
  • API rate limiting and quotas
  • Advanced caching strategies

🔮 Phase 5: Innovation (Future)

  • Multi-modal support (image, audio)
  • Agentic AI workflows
  • Real-time collaborative features
  • Advanced reasoning strategies
  • Federated learning support

🆘 Troubleshooting

Common Issues

🐍 Virtual Environment Issues

Problem: Dependencies not installing

# Use the Makefile (recommended)
make local-dev-setup

# OR manually:
cd backend
poetry config virtualenvs.in-project true
poetry install --with dev,test
source .venv/bin/activate

# Frontend
cd ../frontend
npm install

Problem: Wrong tool versions (e.g., Ruff 0.5.7 instead of 0.14.0)

# Ensure you're in the Poetry virtual environment
cd backend
source .venv/bin/activate
which python  # Should show backend/.venv/bin/python
ruff --version  # Should show 0.14.0

Problem: poetry install fails

# Update Poetry and retry
poetry self update
poetry cache clear . --all
poetry install --with dev,test --sync

🐳 Docker Issues

Problem: Infrastructure services fail to start

# Use Makefile commands (recommended)
make local-dev-stop    # Stop everything
make local-dev-infra   # Restart infrastructure

# OR manually:
docker compose -f docker-compose-infra.yml down
docker compose -f docker-compose-infra.yml up -d

# Check logs
make logs

Problem: Port already in use

# Find what's using the port
lsof -i :8000  # Backend
lsof -i :3000  # Frontend
lsof -i :5432  # Postgres

# Stop all services
make local-dev-stop

# OR kill specific service
kill $(lsof -t -i:8000)
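
If `lsof` is not available, a quick check from Python can at least tell you which of the default ports are taken. This is a minimal sketch: the port numbers come from the commands above, the helper names are hypothetical, and unlike `lsof` it cannot tell you which process holds the port.

```python
import socket
from typing import Dict

# Ports taken from the lsof examples above.
DEFAULT_PORTS = {"backend": 8000, "frontend": 3000, "postgres": 5432}


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        return sock.connect_ex((host, port)) == 0


def busy_services(ports: Dict[str, int] = DEFAULT_PORTS) -> Dict[str, int]:
    """Return the subset of services whose port is already occupied."""
    return {name: port for name, port in ports.items() if port_in_use(port)}
```

Calling `busy_services()` before `make local-dev-all` returns a dict such as `{"backend": 8000}` when a stale backend is still running, and an empty dict when all three ports are free.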

🔐 Authentication Issues

Problem: Login attempts fail

  • Ensure OIDC configuration is correct in .env
  • Check IBM Cloud credentials
  • Verify redirect URLs match your setup

Development Mode: Use mock authentication

# In .env or .env.dev
SKIP_AUTH=true
DEVELOPMENT_MODE=true
ENABLE_MOCK_AUTH=true

🧪 Test Failures

Problem: Tests failing locally

# Ensure you're in venv
source backend/.venv/bin/activate

# Run specific test
cd backend
pytest tests/unit/test_example.py -v

# Run with more details
pytest tests/unit/test_example.py -vv -s

# Check test dependencies
poetry install --with test --sync

📦 Dependency Issues

Problem: Import errors or missing modules

# Reinstall all dependencies
cd backend
poetry install --with dev,test --sync

# Check what's installed
poetry show

# Verify Python path
python -c "import sys; print(sys.path)"
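
To narrow down an import error, it can help to check two things at once: whether the interpreter is actually running inside a virtual environment, and which modules it cannot locate. The helpers below are a hypothetical sketch using only the standard library; `missing_modules` is meant for top-level module names.

```python
import importlib.util
import sys
from typing import Iterable, List


def in_virtualenv() -> bool:
    """True when the interpreter runs inside a venv (such as backend/.venv)."""
    return sys.prefix != sys.base_prefix


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

For example, `missing_modules(["fastapi", "uvicorn"])` returning a non-empty list while `in_virtualenv()` is `False` suggests the Poetry environment was never activated, rather than a broken install.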

Getting Help

  1. 📚 Check Documentation: Full docs
  2. 🐛 Report Issues: GitHub Issues
  3. 💬 Discussions: GitHub Discussions
  4. 📖 See IMMEDIATE_FIX.md for common development issues

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🙏 Acknowledgments

  • IBM Docling - Advanced document processing and understanding
  • IBM WatsonX - Enterprise AI foundation models
  • FastAPI - Modern, high-performance web framework
  • React - Powerful UI library for building interactive interfaces
  • Milvus - High-performance vector database
  • Docker - Containerization and deployment platform
  • All Contributors - Thank you for your contributions!

⬆ Back to Top

Made with ❤️ by the RAG Modulo Team
