🚀 Architecture: Enable Serverless Deployment on IBM Cloud Code Engine
Context
RAG Modulo currently uses a container-orchestration architecture (Docker Compose locally, OpenShift for production) that includes self-managed stateful services:
- PostgreSQL - Metadata and user data storage
- Milvus - Vector database for embeddings
- etcd - Milvus dependency
- MinIO - Object storage (Milvus dependency)
Problem: IBM Cloud Code Engine (serverless platform) is not suited to running stateful services such as databases. It is designed for stateless containers that scale to zero, so anything written to an instance's local filesystem is lost when the instance stops.
Goal: Architect a solution that enables RAG Modulo to deploy on IBM Cloud Code Engine while maintaining full functionality.
Current Architecture
┌─────────────────────────────────────────┐
│        Docker/OpenShift Cluster         │
├─────────────────────────────────────────┤
│  ┌──────────┐   ┌────────────────────┐  │
│  │ Backend  │   │  Frontend (WebUI)  │  │
│  │ (FastAPI)│   │      (React)       │  │
│  └────┬─────┘   └────────────────────┘  │
│       │                                 │
│  ┌────┴──────────┐   ┌─────────────┐    │
│  │  PostgreSQL   │   │   Milvus    │    │
│  │ (StatefulSet) │   │(StatefulSet)│    │
│  └───────────────┘   └──────┬──────┘    │
│                             │           │
│                      ┌──────┴──────┐    │
│                      │ etcd│ MinIO │    │
│                      └─────────────┘    │
└─────────────────────────────────────────┘
Proposed Serverless Architecture Options
Option A: IBM Cloud Managed Services (Recommended)
Replace self-hosted databases with IBM Cloud managed services:
Services Mapping:
- PostgreSQL → IBM Cloud Databases for PostgreSQL
  - Fully managed, auto-scaling
  - Built-in backup and HA
  - Cost: ~$150-300/month (1GB RAM, 10GB storage)
- Milvus → Alternatives needed (Milvus not available as managed service on IBM Cloud):
  - Option A1: IBM watsonx.data with vector extensions
  - Option A2: Elasticsearch on IBM Cloud with vector search (kNN)
  - Option A3: External managed Milvus (Zilliz Cloud)
  - Option A4: Use existing supported vector DBs - Pinecone, Weaviate Cloud (already in codebase)
Architecture:
┌─────────────────────────────────────────────────┐
│              IBM Cloud Code Engine              │
├─────────────────────────────────────────────────┤
│  ┌──────────┐   ┌────────────────────┐          │
│  │ Backend  │   │  Frontend (WebUI)  │          │
│  │ (FastAPI)│   │      (React)       │          │
│  └────┬─────┘   └────────────────────┘          │
└───────┼─────────────────────────────────────────┘
        │
        │ (Private Endpoints)
        │
   ┌────┴────────────────────────────────┐
   │     IBM Cloud Managed Services      │
   ├─────────────────────────────────────┤
   │  ┌────────────────────────────┐     │
   │  │  Databases for PostgreSQL  │     │
   │  └────────────────────────────┘     │
   │  ┌────────────────────────────┐     │
   │  │  Elasticsearch / watsonx   │     │
   │  │      (Vector Search)       │     │
   │  └────────────────────────────┘     │
   └─────────────────────────────────────┘
Pros:
- ✅ Fully managed, auto-scaling
- ✅ Built-in backup, HA, monitoring
- ✅ Native IBM Cloud integration
- ✅ No container orchestration needed
- ✅ Application tier scales to zero for cost savings (managed databases still bill continuously)
Cons:
- ❌ Migration effort required
- ❌ Vendor lock-in
- ❌ Monthly costs even at low usage
- ❌ Milvus not directly available
Option B: Hybrid - Code Engine + External Managed Databases
Keep application serverless but use external managed vector databases:
Services:
- PostgreSQL → IBM Cloud Databases for PostgreSQL
- Milvus → Zilliz Cloud (managed Milvus) OR Pinecone/Weaviate
Pros:
- ✅ Keep Milvus functionality
- ✅ Minimal code changes
- ✅ Serverless application tier combined with a purpose-built vector database
Cons:
- ❌ Data egress costs (cross-cloud if using external services)
- ❌ Multiple vendors to manage
- ❌ Latency if not in same region
Option C: Keep OpenShift (Current Approach)
Continue using OpenShift for full control over infrastructure:
Pros:
- ✅ Full control over databases
- ✅ Existing Helm charts (PR #261, "feat: Add production-ready Kubernetes/OpenShift deployment")
- ✅ No migration needed
- ✅ Self-hosted = predictable costs
Cons:
- ❌ Higher baseline cost (~$200-400/month for cluster)
- ❌ No scale-to-zero
- ❌ More operational overhead
Recommended Migration Plan (Option A)
Phase 1: Database Migration (Week 1-2)
- Create IBM Cloud Databases for PostgreSQL instance:
    ibmcloud resource service-instance-create rag-modulo-postgres \
      databases-for-postgresql standard ca-tor \
      -p '{"members_memory_allocation_mb": "1024", "members_disk_allocation_mb": "10240"}'
- Update backend configuration to use managed PostgreSQL:
  - Update COLLECTIONDB_HOST to PostgreSQL service endpoint
  - Configure SSL/TLS certificates
  - Test connection from Code Engine
- Migrate existing data using pg_dump and pg_restore
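As a rough illustration of the configuration step, the sketch below assembles a TLS-enforcing PostgreSQL URL from environment variables. Only COLLECTIONDB_HOST comes from this issue; the other variable names, the default database name, and the `build_postgres_url` helper are hypothetical and would need to match whatever backend/core/config.py actually defines.

```python
import os
from urllib.parse import quote


def build_postgres_url(env=os.environ):
    """Assemble a PostgreSQL URL that enforces TLS for a managed instance."""
    host = env["COLLECTIONDB_HOST"]
    # Every name below except COLLECTIONDB_HOST is a hypothetical placeholder.
    port = env.get("COLLECTIONDB_PORT", "5432")
    user = quote(env["COLLECTIONDB_USER"], safe="")
    password = quote(env["COLLECTIONDB_PASS"], safe="")
    name = env.get("COLLECTIONDB_NAME", "rag_modulo")
    # "require" encrypts the connection; "verify-full" also validates the
    # server certificate and hostname, and needs the CA file set below.
    sslmode = env.get("COLLECTIONDB_SSLMODE", "require")
    url = f"postgresql://{user}:{password}@{host}:{port}/{name}?sslmode={sslmode}"
    ca_path = env.get("COLLECTIONDB_SSLROOTCERT")
    if ca_path:
        url += f"&sslrootcert={quote(ca_path, safe='')}"
    return url
```

Percent-encoding the credentials keeps characters such as `@` or `/` in a generated password from breaking the URL.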
Phase 2: Vector Database Strategy (Week 2-3)
Decision Required: Choose vector database approach:
Option 2A: Use Elasticsearch on IBM Cloud
- Leverage existing Elasticsearch support in codebase
- Create Databases for Elasticsearch instance
- Migrate Milvus collections to Elasticsearch indices
- Update vector search queries
Option 2B: Use Zilliz Cloud (Managed Milvus)
- Minimal code changes (same Milvus API)
- Create Zilliz Cloud account and cluster
- Configure private connectivity if needed
- Migrate collections
Option 2C: Use Pinecone/Weaviate
- Already supported in codebase (vectordbs/ providers)
- Sign up for Pinecone or Weaviate Cloud
- Update configuration
- Migrate vectors
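Whichever option is chosen, the "migrate collections/vectors" step usually comes down to reading records from Milvus and writing them to the target in batches. The following is a minimal sketch under that assumption; the `Record` shape, the `migrate` helper, and the `upsert` callback are hypothetical stand-ins, not APIs of Milvus, Elasticsearch, Zilliz, Pinecone, or Weaviate.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence, Tuple

# Hypothetical record shape: (id, embedding, metadata)
Record = Tuple[str, Sequence[float], dict]


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Yield lists of at most `size` records without loading everything in memory."""
    it = iter(records)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def migrate(
    source: Iterable[Record],
    upsert: Callable[[List[Record]], None],
    batch_size: int = 500,
) -> int:
    """Copy every record from `source` to the target via `upsert`; return the count."""
    moved = 0
    for batch in batched(source, batch_size):
        upsert(batch)  # the target-specific write goes behind this callback
        moved += len(batch)
    return moved
```

Comparing the returned count with the source collection's row count gives a cheap first check that nothing was dropped.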
Phase 3: Code Engine Deployment (Week 3-4)
- Update environment configuration:
  - Create .env.codeengine with managed service endpoints
  - Store secrets in Code Engine secret manager
- Deploy backend to Code Engine:
    ibmcloud ce application create \
      --name rag-modulo-backend \
      --image ghcr.io/manavgup/rag_modulo/backend:latest \
      --env-from-secret rag-modulo-secrets \
      --min-scale 0 --max-scale 10 \
      --cpu 1 --memory 2G
- Deploy frontend to Code Engine:
    ibmcloud ce application create \
      --name rag-modulo-frontend \
      --image ghcr.io/manavgup/rag_modulo/frontend:latest \
      --port 3000 \
      --min-scale 0 --max-scale 5
- Configure routing with Code Engine ingress
Phase 4: Testing and Validation (Week 4)
- Test all API endpoints
- Verify RAG search functionality
- Performance testing
- Load testing with scale-to-zero
Cost Comparison
Current OpenShift (Self-Hosted)
- Cluster: ~$200-400/month (2 workers, bx2.4x16)
- Storage: ~$50-100/month (persistent volumes)
- Total: ~$250-500/month (always running)
Proposed Code Engine + Managed Services
- Code Engine: ~$0-50/month (scales to zero, pay per use)
- PostgreSQL: ~$150-300/month (1GB RAM, 10GB storage)
- Elasticsearch/Vector DB: ~$100-300/month (depends on choice)
- Total: ~$250-650/month
Break-even: Code Engine is cost-effective if:
- Application has significant idle time (nights, weekends)
- Scale-to-zero reduces compute costs by >50%
- Development/staging environments can scale to zero
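The totals in this comparison can be reproduced with a few lines of arithmetic. The sketch below uses only the estimated ranges listed in this section (they are estimates, not quotes); the variable and function names are illustrative.

```python
# Monthly cost ranges in USD (low, high), copied from the estimates in this issue.
OPENSHIFT = {"cluster": (200, 400), "storage": (50, 100)}
CODE_ENGINE = {
    "code_engine": (0, 50),
    "postgresql": (150, 300),
    "vector_db": (100, 300),
}


def total(items):
    """Sum the low ends and the high ends of each cost range."""
    low = sum(lo for lo, _ in items.values())
    high = sum(hi for _, hi in items.values())
    return low, high


openshift_total = total(OPENSHIFT)      # (250, 500)
code_engine_total = total(CODE_ENGINE)  # (250, 650)

# Managed databases bill even when the application is scaled to zero,
# so they form a fixed floor under the Code Engine option.
fixed_floor = total({k: v for k, v in CODE_ENGINE.items() if k != "code_engine"})  # (250, 600)
```

Under these estimates the managed-database floor alone matches the low end of the OpenShift total, so scale-to-zero savings on compute are what decide the comparison.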
Required Code Changes
1. Configuration Management
File: backend/core/config.py
- Add Code Engine environment detection
- Support managed database connection strings
- Handle SSL/TLS certificates for managed services
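One possible shape for the environment detection is sketched below. It assumes Code Engine injects `CE_APP` into applications and `CE_JOB` into jobs, and that Kubernetes pods see `KUBERNETES_SERVICE_HOST`; the exact variable set should be verified against the Code Engine documentation, and both function names are hypothetical.

```python
import os


def running_on_code_engine(env=os.environ) -> bool:
    """Best-effort check for a Code Engine runtime.

    Assumption: Code Engine sets CE_APP for applications and CE_JOB for jobs.
    """
    return bool(env.get("CE_APP") or env.get("CE_JOB"))


def deployment_target(env=os.environ) -> str:
    """Classify the runtime so config can pick matching defaults."""
    # Check Code Engine first: it runs on Kubernetes, so the generic
    # Kubernetes variable may be present there as well.
    if running_on_code_engine(env):
        return "code-engine"
    if env.get("KUBERNETES_SERVICE_HOST"):
        return "kubernetes"
    return "local"
```

An explicit override variable would be a sensible addition if auto-detection ever proves ambiguous.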
2. Database Initialization
Files: backend/rag_solution/models/, database migration scripts
- Update connection pooling for managed PostgreSQL
- Handle SSL mode (sslmode=require)
- Configure connection limits appropriately
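Connection limits matter more once the backend can scale out to several instances, because each instance opens its own pool against the same managed database. The sketch below divides a connection budget across instances and returns keyword arguments in the shape SQLAlchemy's `create_engine` accepts. It assumes the backend uses SQLAlchemy with a psycopg2-style driver; `pool_settings`, the `reserved` headroom, and the example limit are hypothetical.

```python
def pool_settings(connection_limit: int, max_instances: int, reserved: int = 10) -> dict:
    """Split a database connection budget evenly across app instances.

    connection_limit: max connections the managed database allows (check the plan).
    max_instances:    upper bound on app instances, e.g. the --max-scale value.
    reserved:         headroom kept for migrations and admin sessions.
    """
    usable = connection_limit - reserved
    if usable < max_instances:
        raise ValueError("connection limit too low for the requested scale")
    per_instance = usable // max_instances
    pool_size = max(1, per_instance // 2)
    return {
        "pool_size": pool_size,
        "max_overflow": per_instance - pool_size,
        "pool_pre_ping": True,   # detect connections dropped while idle
        "pool_recycle": 1800,    # seconds before a connection is replaced
        "connect_args": {"sslmode": "require"},
    }
```

For example, a limit of 110 connections with `--max-scale 10` leaves 10 connections per instance, split into a pool of 5 plus 5 overflow.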
3. Vector Database Abstraction
Files: backend/vectordbs/
- Ensure abstraction layer works with chosen managed service
- Update Elasticsearch provider if using managed Elasticsearch
- Test Pinecone/Weaviate providers
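To make "abstraction layer" concrete, here is a minimal sketch of a provider interface with a registry-based factory. It is not the interface in backend/vectordbs/; `VectorStore`, `InMemoryStore`, `PROVIDERS`, and `get_store` are hypothetical names, and the in-memory store exists only so the example runs without any external service.

```python
from typing import Callable, Dict, List, Protocol, Sequence, Tuple


class VectorStore(Protocol):
    """Hypothetical minimal contract every provider would satisfy."""

    def upsert(self, ids: Sequence[str], vectors: Sequence[Sequence[float]]) -> None: ...

    def search(self, query: Sequence[float], top_k: int) -> List[Tuple[str, float]]: ...


class InMemoryStore:
    """Stand-in provider that ranks by dot product; useful for tests only."""

    def __init__(self) -> None:
        self._rows: Dict[str, Sequence[float]] = {}

    def upsert(self, ids, vectors) -> None:
        self._rows.update(zip(ids, vectors))

    def search(self, query, top_k):
        scored = [
            (row_id, sum(q * v for q, v in zip(query, vec)))
            for row_id, vec in self._rows.items()
        ]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Managed providers would register their own factories here.
PROVIDERS: Dict[str, Callable[[], VectorStore]] = {"memory": InMemoryStore}


def get_store(name: str) -> VectorStore:
    """Build the provider selected by configuration."""
    try:
        return PROVIDERS[name]()
    except KeyError:
        raise ValueError(f"unknown vector store {name!r}; known: {sorted(PROVIDERS)}") from None
```

Running the same test suite against every registered provider is one way to confirm the chosen managed service behaves like the current Milvus setup.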
4. Health Checks
File: backend/rag_solution/router/health.py
- Update health checks for managed services
- Add connection validation
- Monitor service availability
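A health endpoint for managed services could aggregate one small probe per dependency. The sketch below shows only the aggregation logic, independent of FastAPI; `run_health_checks` and the probe callables are hypothetical, and real probes would run something like `SELECT 1` or a vector-store ping.

```python
import time
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], None]]) -> dict:
    """Run each probe, recording status and latency; a probe fails by raising."""
    services = {}
    for name, check in checks.items():
        started = time.perf_counter()
        try:
            check()
            services[name] = {"status": "ok"}
        except Exception as exc:  # report the failure instead of crashing the endpoint
            services[name] = {"status": "error", "detail": str(exc)}
        services[name]["latency_ms"] = round((time.perf_counter() - started) * 1000, 1)
    healthy = all(entry["status"] == "ok" for entry in services.values())
    return {"status": "healthy" if healthy else "degraded", "services": services}
```

The recorded latency is worth watching after a scale-from-zero start, when the first connection to each managed service is established.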
5. CI/CD Pipeline
File: .github/workflows/
- Create new workflow for Code Engine deployment
- Add managed service provisioning steps
- Update secrets management
Decision Matrix
Criteria | OpenShift | Code Engine + Managed | Code Engine + External |
---|---|---|---|
Setup Complexity | High | Medium | Medium |
Migration Effort | None (current) | High | Medium |
Monthly Cost | $250-500 | $250-650 | $300-700 |
Scale to Zero | No | Yes | Yes |
Vendor Lock-in | Low | High | Medium |
Operational Overhead | High | Low | Medium |
Milvus Support | Yes | No (alternatives) | Yes (Zilliz) |
Development Parity | High | Medium | Medium |
Next Steps
- Decision: Choose vector database strategy (Elasticsearch, Zilliz, Pinecone, Weaviate)
- Prototype: Create proof-of-concept with managed PostgreSQL + chosen vector DB
- Cost Analysis: Get actual quotes for managed services
- Migration Plan: Detailed step-by-step migration guide
- Documentation: Update deployment docs for Code Engine
Related Issues
- PR #261, "feat: Add production-ready Kubernetes/OpenShift deployment" - OpenShift/Kubernetes Helm Charts (current approach)
- Issue #272, "Optimize Docker build performance for CI/CD" - Docker Build Optimization
- Need to create: "Vector Database Provider Evaluation"
Labels
enhancement, architecture, deployment, ibm-cloud, serverless, code-engine
Questions for Discussion
- Vector Database: Which option for vector search? (Elasticsearch, Zilliz, Pinecone, Weaviate)
- Cost Tolerance: What's the acceptable monthly cost range?
- Migration Timeline: How urgent is serverless deployment?
- Development Parity: How important is local dev matching production?
- Hybrid Approach: Could we use Code Engine for dev/staging and OpenShift for production?