45 changes: 38 additions & 7 deletions README.md
@@ -37,6 +37,7 @@ RAG Modulo is a production-ready Retrieval-Augmented Generation platform that pr
</div>

#### 🎨 Frontend Features

- **Modern UI**: React 18 with Tailwind CSS and Carbon Design System principles
- **Reusable Component Library**: 8 accessible, type-safe components with consistent design patterns
- **Enhanced Search**: Interactive chat interface with Chain of Thought reasoning visualization
@@ -167,19 +168,22 @@ make local-dev-all
<summary><strong>🔍 What Gets Installed</strong></summary>

**Backend (Python/Poetry)**:

- FastAPI and dependencies
- LLM providers (WatsonX, OpenAI, Anthropic)
- Vector DB clients (Milvus, Elasticsearch, etc.)
- Testing frameworks (pytest, coverage)
- Code quality tools (ruff, mypy, bandit)

**Frontend (npm)**:

- React 18 + Vite
- Tailwind CSS + Carbon Design
- TypeScript dependencies
- Testing libraries

**Infrastructure (Docker)**:

- PostgreSQL (metadata)
- Milvus (vector storage)
- MinIO (object storage)
@@ -188,18 +192,21 @@ make local-dev-all
</details>

**Access Points:**
- 🌐 **Frontend**: http://localhost:3000
- 🔧 **Backend API**: http://localhost:8000/docs (Swagger UI)
- 📊 **MLFlow**: http://localhost:5001
- 💾 **MinIO Console**: http://localhost:9001

- 🌐 **Frontend**: <http://localhost:3000>
- 🔧 **Backend API**: <http://localhost:8000/docs> (Swagger UI)
- 📊 **MLFlow**: <http://localhost:5001>
- 💾 **MinIO Console**: <http://localhost:9001>

**Benefits:**

- ⚡ **Instant reload** - Python/React changes reflected immediately (no container rebuilds)
- 🐛 **Native debugging** - Use PyCharm, VS Code debugger with breakpoints
- 📦 **Local caching** - Poetry/npm caches work natively for faster dependency installs
- 🔥 **Fastest iteration** - Pre-commit hooks optimized (fast on commit, comprehensive on push)

**When to use:**

- ✅ Daily development work
- ✅ Feature development and bug fixes
- ✅ Rapid iteration and testing
@@ -227,6 +234,7 @@ docker compose up -d
```

**When to use:**

- ✅ Testing production configurations
- ✅ Validating Docker builds
- ✅ Deployment rehearsal
@@ -242,6 +250,7 @@ docker compose up -d
4. **Run**: `make venv && make run-infra`

**When to use:**

- ✅ No local setup required
- ✅ Consistent development environment
- ✅ Work from any device
@@ -364,9 +373,10 @@ make local-dev-all
```

Done! Services running at:
- Frontend: http://localhost:3000
- Backend: http://localhost:8000
- MLFlow: http://localhost:5001

- Frontend: <http://localhost:3000>
- Backend: <http://localhost:8000>
- MLFlow: <http://localhost:5001>

</details>

@@ -500,6 +510,7 @@ make prod-start
```

Available images:

- `ghcr.io/manavgup/rag_modulo/backend:latest`
- `ghcr.io/manavgup/rag_modulo/frontend:latest`

@@ -517,6 +528,8 @@ Available images:
- **Multi-LLM Support**: Seamless switching between WatsonX, OpenAI, and Anthropic with provider-specific optimizations
- **IBM Docling Integration**: Enhanced document processing for complex formats (PDF, DOCX, XLSX)
- **Question Suggestions**: AI-generated relevant questions based on document collection content
- **MCP Context Forge**: Tool enrichment via
[IBM MCP](https://github.com/IBM/mcp-context-forge) with resilience patterns

### 🔍 Search & Retrieval

@@ -564,6 +577,7 @@ Available images:
- **[🔌 API Reference](docs/api/README.md)** - Complete API documentation
- **[🖥️ CLI Documentation](docs/cli/index.md)** - Command-line interface guide
- **[🔐 Secret Management](docs/development/secret-management.md)** - Comprehensive guide for safe secret handling
- **[🔗 MCP Integration](docs/features/mcp-integration.md)** - MCP Context Forge gateway setup and usage

### 🛠️ Command-Line Interface (CLI)

@@ -617,6 +631,7 @@ make run-ghcr
```

**Available Images:**

- `ghcr.io/manavgup/rag_modulo/backend:latest`
- `ghcr.io/manavgup/rag_modulo/frontend:latest`

@@ -692,6 +707,7 @@ RAG Modulo uses a comprehensive CI/CD pipeline with multiple stages:
**Triggers:** Push to `main`, Pull Requests

**Stages:**

1. **Lint and Unit Tests** (No infrastructure)
   - Ruff linting (120 char line length)
   - MyPy type checking
@@ -711,6 +727,7 @@ RAG Modulo uses a comprehensive CI/CD pipeline with multiple stages:
- End-to-end validation

**Status Badges:**

```markdown
[![CI Pipeline](https://github.com/manavgup/rag_modulo/workflows/CI/badge.svg)](https://github.com/manavgup/rag_modulo/actions)
```
@@ -720,11 +737,13 @@ RAG Modulo uses a comprehensive CI/CD pipeline with multiple stages:
**Triggers:** Push to `main`, Pull Requests

**Secret Detection (3-Layer Defense):**

1. **Pre-commit hooks**: detect-secrets with baseline (< 1 sec)
2. **Local testing**: Gitleaks via `make pre-commit-run` (~1-2 sec)
3. **CI/CD**: Gitleaks + TruffleHog (~45 sec)

**Scans:**

- **Gitleaks**: Pattern-based secret scanning with custom rules (`.gitleaks.toml`)
- **TruffleHog**: Entropy-based + verified secret detection
- **Trivy**: Container vulnerability scanning
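The "entropy-based" detection mentioned in the scan list can be illustrated with a minimal sketch. This is not TruffleHog's or Gitleaks' actual algorithm; the function names, the 20-character minimum, and the 4.0 bits-per-character threshold are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_secret(token: str, min_length: int = 20, threshold: float = 4.0) -> bool:
    """Flag long tokens whose characters look close to random.

    Both `min_length` and `threshold` are hypothetical tuning values,
    not settings taken from any real scanner.
    """
    return len(token) >= min_length and shannon_entropy(token) > threshold
```

Randomly generated keys tend to score high because their characters are spread evenly, while ordinary words and repeated characters score low. Real scanners usually combine such a score with pattern rules and verification to reduce false positives.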
@@ -735,6 +754,7 @@ RAG Modulo uses a comprehensive CI/CD pipeline with multiple stages:
**⚠️ IMPORTANT:** CI now **fails on ANY secret detection** (no `continue-on-error`). This ensures no secrets make their way to the repository.

**Supported Secret Types:**

- Cloud: AWS, Azure, GCP keys
- LLM: OpenAI, Anthropic, WatsonX, Gemini API keys
- Infrastructure: PostgreSQL, MinIO, MLFlow, JWT secrets
@@ -748,6 +768,7 @@ RAG Modulo uses a comprehensive CI/CD pipeline with multiple stages:
**Triggers:** Push to `main`, Pull Requests to `docs/`

**Actions:**

- Build MkDocs site
- Deploy to GitHub Pages
- API documentation generation
@@ -773,30 +794,35 @@ make scan-secrets
Optimized for developer velocity:

**On Commit** (fast, 5-10 sec):

- Ruff formatting
- Trailing whitespace
- YAML syntax
- File size limits

**On Push** (slow, 30-60 sec):

- MyPy type checking
- Pylint analysis
- Security scans
- Strangler pattern checks

**In CI** (comprehensive):

- All checks run regardless
- Ensures quality gates

### Container Registry

**GitHub Container Registry (GHCR)**:

- Automatic image builds on push
- Multi-architecture support (amd64, arm64)
- Image signing and verification
- Retention policies

**Image Tags:**

- `latest`: Latest main branch build
- `sha-<commit>`: Specific commit
- `<branch>`: Branch-specific builds
@@ -861,13 +887,15 @@ We welcome contributions! Please see our [Contributing Guide](docs/development/c
## 📈 Roadmap

### ✅ Phase 1: Foundation (Completed)

- [x] Service-based architecture with 26+ services
- [x] Comprehensive test infrastructure (947 tests)
- [x] Multi-LLM provider support (WatsonX, OpenAI, Anthropic)
- [x] Vector database abstraction layer
- [x] CI/CD pipeline with security scanning

### ✅ Phase 2: Advanced Features (Completed)

- [x] Chain of Thought (CoT) reasoning system
- [x] Automatic pipeline resolution
- [x] Token tracking and monitoring
@@ -877,6 +905,7 @@ We welcome contributions! Please see our [Contributing Guide](docs/development/c
- [x] Containerless local development workflow

### 🔄 Phase 3: Production Enhancement (Current)

- [x] Production deployment with GHCR images
- [x] Multi-stage Docker builds
- [x] Security hardening (Trivy, Bandit, Gitleaks, Semgrep)
@@ -885,13 +914,15 @@ We welcome contributions! Please see our [Contributing Guide](docs/development/c
- [ ] Authentication system improvements (OIDC)

### 🚀 Phase 4: Enterprise Features (Next)

- [ ] Multi-tenant support
- [ ] Advanced analytics and dashboards
- [ ] Batch processing capabilities
- [ ] API rate limiting and quotas
- [ ] Advanced caching strategies

### 🔮 Phase 5: Innovation (Future)

- [ ] Multi-modal support (image, audio)
- [ ] Agentic AI workflows
- [ ] Real-time collaborative features
24 changes: 24 additions & 0 deletions backend/core/config.py
@@ -284,6 +284,30 @@ class Settings(BaseSettings):
log_storage_enabled: Annotated[bool, Field(default=True, alias="LOG_STORAGE_ENABLED")]
log_buffer_size_mb: Annotated[int, Field(default=5, alias="LOG_BUFFER_SIZE_MB")]

# MCP Gateway settings
# Enable/disable MCP integration globally
mcp_enabled: Annotated[bool, Field(default=True, alias="MCP_ENABLED")]
# MCP Context Forge gateway URL (port 3001 to avoid frontend conflict on 3000)
mcp_gateway_url: Annotated[str, Field(default="http://localhost:3001", alias="MCP_GATEWAY_URL")]
# Request timeout in seconds (30s default per requirements)
mcp_timeout: Annotated[float, Field(default=30.0, ge=1.0, le=300.0, alias="MCP_TIMEOUT")]
# Health check timeout (5s per requirements)
mcp_health_timeout: Annotated[float, Field(default=5.0, ge=1.0, le=30.0, alias="MCP_HEALTH_TIMEOUT")]
# Maximum retries for MCP calls
mcp_max_retries: Annotated[int, Field(default=3, ge=0, le=10, alias="MCP_MAX_RETRIES")]
# Circuit breaker failure threshold (5 failures per requirements)
mcp_circuit_breaker_threshold: Annotated[int, Field(default=5, ge=1, le=20, alias="MCP_CIRCUIT_BREAKER_THRESHOLD")]
# Circuit breaker recovery timeout in seconds (60s per requirements)
mcp_circuit_breaker_timeout: Annotated[
float, Field(default=60.0, ge=10.0, le=600.0, alias="MCP_CIRCUIT_BREAKER_TIMEOUT")
]
# JWT token for MCP gateway authentication
mcp_jwt_token: Annotated[str | None, Field(default=None, alias="MCP_JWT_TOKEN")]
# Enable enrichment of search results with MCP tools
mcp_enrichment_enabled: Annotated[bool, Field(default=True, alias="MCP_ENRICHMENT_ENABLED")]
# Maximum concurrent MCP tool invocations
mcp_max_concurrent: Annotated[int, Field(default=5, ge=1, le=20, alias="MCP_MAX_CONCURRENT")]

# Testing settings
testing: Annotated[bool, Field(default=False, alias="TESTING")]
skip_auth: Annotated[bool, Field(default=False, alias="SKIP_AUTH")]
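To show how the `mcp_circuit_breaker_threshold` (5 failures) and `mcp_circuit_breaker_timeout` (60 s) settings above might interact, here is a minimal sketch of a circuit breaker. It is not the implementation from this PR: the `CircuitBreaker` class, its method names, and the injectable `clock` parameter are hypothetical and exist only to illustrate the pattern.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Hypothetical breaker illustrating the threshold/timeout settings."""

    def __init__(
        self,
        threshold: int = 5,              # mirrors the MCP_CIRCUIT_BREAKER_THRESHOLD default
        recovery_timeout: float = 60.0,  # mirrors the MCP_CIRCUIT_BREAKER_TIMEOUT default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold = threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return True if a call to the gateway may be attempted."""
        if self._opened_at is None:
            return True  # closed: traffic flows normally
        # Open: block until the recovery timeout elapses, then allow a trial call.
        return self._clock() - self._opened_at >= self.recovery_timeout

    def record_success(self) -> None:
        """Reset the breaker after a successful call."""
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        """Count a failure and (re)open the breaker once the threshold is reached."""
        self._failures += 1
        if self._failures >= self.threshold:
            self._opened_at = self._clock()
```

A caller would check `allow_request()` before each gateway call and report the outcome with `record_success()` or `record_failure()`. A failed trial call after the timeout reopens the breaker for another full recovery window.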
2 changes: 2 additions & 0 deletions backend/main.py
@@ -38,6 +38,7 @@
from rag_solution.router.conversation_router import router as conversation_router
from rag_solution.router.dashboard_router import router as dashboard_router
from rag_solution.router.health_router import router as health_router
from rag_solution.router.mcp_router import router as mcp_router
from rag_solution.router.podcast_router import router as podcast_router
from rag_solution.router.runtime_config_router import router as runtime_config_router
from rag_solution.router.search_router import router as search_router
@@ -248,6 +249,7 @@ async def lifespan(_app: FastAPI) -> AsyncGenerator[None, None]:
app.include_router(auth_router)
app.include_router(chat_router)
app.include_router(conversation_router)
app.include_router(mcp_router)
app.include_router(dashboard_router)
app.include_router(health_router)
app.include_router(collection_router)