🚀 Architecture: Enable Serverless Deployment on IBM Cloud Code Engine #297

@manavgup

Context

RAG Modulo currently uses a container-orchestration architecture (Docker Compose locally, OpenShift for production) that includes self-managed stateful services:

  • PostgreSQL - Metadata and user data storage
  • Milvus - Vector database for embeddings
  • etcd - Milvus dependency
  • MinIO - Object storage (Milvus dependency)

Problem: IBM Cloud Code Engine (serverless platform) is not suited to hosting stateful services such as databases. It is designed for stateless containers that scale to zero, so anything stored inside an instance is lost when that instance is removed.

Goal: Architect a solution that enables RAG Modulo to deploy on IBM Cloud Code Engine while maintaining full functionality.


Current Architecture

┌─────────────────────────────────────────┐
│         Docker/OpenShift Cluster        │
├─────────────────────────────────────────┤
│  ┌──────────┐  ┌────────────────────┐  │
│  │  Backend │  │  Frontend (WebUI)  │  │
│  │ (FastAPI)│  │     (React)        │  │
│  └────┬─────┘  └─────────────────────┘  │
│       │                                  │
│  ┌────┴──────────┐  ┌─────────────┐    │
│  │  PostgreSQL   │  │   Milvus    │    │
│  │ (StatefulSet) │  │(StatefulSet)│    │
│  └───────────────┘  └──────┬──────┘    │
│                            │            │
│                     ┌──────┴──────┐    │
│                     │ etcd│ MinIO │    │
│                     └─────────────┘    │
└─────────────────────────────────────────┘

Proposed Serverless Architecture Options

Option A: IBM Cloud Managed Services (Recommended)

Replace self-hosted databases with IBM Cloud managed services:

Services Mapping:

  • PostgreSQL → IBM Cloud Databases for PostgreSQL

    • Fully managed, auto-scaling
    • Built-in backup and HA
    • Cost: ~$150-300/month (1GB RAM, 10GB storage)
  • Milvus → Alternative needed (Milvus is not available as a managed service on IBM Cloud); the architecture below uses Elasticsearch or watsonx for vector search

Architecture:

┌─────────────────────────────────────────────────┐
│          IBM Cloud Code Engine                  │
├─────────────────────────────────────────────────┤
│  ┌──────────┐  ┌────────────────────┐          │
│  │  Backend │  │  Frontend (WebUI)  │          │
│  │ (FastAPI)│  │     (React)        │          │
│  └────┬─────┘  └─────────────────────┘          │
└───────┼─────────────────────────────────────────┘
        │
        │ (Private Endpoints)
        │
   ┌────┴────────────────────────────────┐
   │   IBM Cloud Managed Services        │
   ├─────────────────────────────────────┤
   │  ┌────────────────────────────┐    │
   │  │  Databases for PostgreSQL  │    │
   │  └────────────────────────────┘    │
   │  ┌────────────────────────────┐    │
   │  │  Elasticsearch / watsonx   │    │
   │  │  (Vector Search)           │    │
   │  └────────────────────────────┘    │
   └─────────────────────────────────────┘

Pros:

  • ✅ Fully managed, auto-scaling
  • ✅ Built-in backup, HA, monitoring
  • ✅ Native IBM Cloud integration
  • ✅ No container orchestration needed
  • ✅ Scale to zero for cost savings

Cons:

  • ❌ Migration effort required
  • ❌ Vendor lock-in
  • ❌ Monthly costs even at low usage
  • ❌ Milvus not directly available

Option B: Hybrid - Code Engine + External Managed Databases

Keep application serverless but use external managed vector databases:

Services:

  • PostgreSQL → IBM Cloud Databases for PostgreSQL
  • Milvus → Zilliz Cloud (managed Milvus) OR Pinecone/Weaviate

Pros:

  • ✅ Keep Milvus functionality
  • ✅ Minimal code changes
  • ✅ Serverless application tier combined with a managed, Milvus-compatible vector store

Cons:

  • ❌ Data egress costs (cross-cloud if using external services)
  • ❌ Multiple vendors to manage
  • ❌ Latency if not in same region

Option C: Keep OpenShift (Current Approach)

Continue using OpenShift for full control over infrastructure.

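Pros:

  • ✅ No migration effort (current setup)
  • ✅ Milvus remains supported as-is
  • ✅ Low vendor lock-in
  • ✅ High development parity

Cons: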

  • ❌ Higher baseline cost (~$200-400/month for cluster)
  • ❌ No scale-to-zero
  • ❌ More operational overhead

Recommended Migration Plan (Option A)

Phase 1: Database Migration (Week 1-2)

  1. Create IBM Cloud Databases for PostgreSQL instance

    ibmcloud resource service-instance-create rag-modulo-postgres \
      databases-for-postgresql standard ca-tor \
      -p '{"members_memory_allocation_mb": "1024", "members_disk_allocation_mb": "10240"}'
  2. Update backend configuration to use managed PostgreSQL:

    • Update COLLECTIONDB_HOST to PostgreSQL service endpoint
    • Configure SSL/TLS certificates
    • Test connection from Code Engine
  3. Migrate existing data using pg_dump and pg_restore
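
The configuration update in step 2 could look roughly like the sketch below. It is not the project's actual code: only COLLECTIONDB_HOST is named in this issue, and the other variable names (COLLECTIONDB_PORT, COLLECTIONDB_USER, COLLECTIONDB_PASS, COLLECTIONDB_NAME, COLLECTIONDB_SSLMODE, COLLECTIONDB_SSLROOTCERT) and the function name are assumptions for illustration. sslmode and sslrootcert are standard libpq connection parameters.

```python
import os
from urllib.parse import quote, urlencode


def build_postgres_url(env=None):
    """Build a libpq-style connection URL for a managed PostgreSQL instance."""
    env = os.environ if env is None else env
    host = env["COLLECTIONDB_HOST"]  # the only variable named in this issue
    # All other variable names below are hypothetical.
    port = env.get("COLLECTIONDB_PORT", "5432")
    user = quote(env["COLLECTIONDB_USER"], safe="")
    password = quote(env["COLLECTIONDB_PASS"], safe="")
    dbname = quote(env.get("COLLECTIONDB_NAME", "postgres"), safe="")

    params = {"sslmode": env.get("COLLECTIONDB_SSLMODE", "require")}
    ca_path = env.get("COLLECTIONDB_SSLROOTCERT")
    if ca_path:
        # With a CA file available, verify-full also checks the host name.
        params["sslmode"] = "verify-full"
        params["sslrootcert"] = ca_path
    return f"postgresql://{user}:{password}@{host}:{port}/{dbname}?{urlencode(params)}"
```

Percent-encoding the user name and password keeps characters such as "@" or "/" from breaking the URL. The port default is only a fallback; the managed instance's real port should come from its service credentials.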

Phase 2: Vector Database Strategy (Week 2-3)

Decision Required: Choose vector database approach:

Option 2A: Use Elasticsearch on IBM Cloud

  • Leverage existing Elasticsearch support in codebase
  • Create Databases for Elasticsearch instance
  • Migrate Milvus collections to Elasticsearch indices
  • Update vector search queries

Option 2B: Use Zilliz Cloud (Managed Milvus)

  • Minimal code changes (same Milvus API)
  • Create Zilliz Cloud account and cluster
  • Configure private connectivity if needed
  • Migrate collections

Option 2C: Use Pinecone/Weaviate

  • Already supported in codebase (vectordbs/ providers)
  • Sign up for Pinecone or Weaviate Cloud
  • Update configuration
  • Migrate vectors

Phase 3: Code Engine Deployment (Week 3-4)

  1. Update environment configuration:

    • Create .env.codeengine with managed service endpoints
    • Store secrets in a Code Engine secret (referenced via --env-from-secret in the deploy command)
  2. Deploy backend to Code Engine:

    ibmcloud ce application create \
      --name rag-modulo-backend \
      --image ghcr.io/manavgup/rag_modulo/backend:latest \
      --env-from-secret rag-modulo-secrets \
      --min-scale 0 --max-scale 10 \
      --cpu 1 --memory 2G
  3. Deploy frontend to Code Engine:

    ibmcloud ce application create \
      --name rag-modulo-frontend \
      --image ghcr.io/manavgup/rag_modulo/frontend:latest \
      --port 3000 \
      --min-scale 0 --max-scale 5
  4. Configure routing with Code Engine ingress

Phase 4: Testing and Validation (Week 4)

  • Test all API endpoints
  • Verify RAG search functionality
  • Performance testing
  • Load testing, including cold-start latency after scale-to-zero

Cost Comparison

Current OpenShift (Self-Hosted)

  • Cluster: ~$200-400/month (2 workers, bx2.4x16)
  • Storage: ~$50-100/month (persistent volumes)
  • Total: ~$250-500/month (always running)

Proposed Code Engine + Managed Services

  • Code Engine: ~$0-50/month (scales to zero, pay per use)
  • PostgreSQL: ~$150-300/month (1GB RAM, 10GB storage)
  • Elasticsearch/Vector DB: ~$100-300/month (depends on choice)
  • Total: ~$250-650/month

Break-even: Code Engine is cost-effective if:

  • Application has significant idle time (nights, weekends)
  • Scale-to-zero reduces compute costs by >50%
  • Development/staging environments can scale to zero
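
The totals above can be checked with a few lines of arithmetic. The sketch below only re-adds the estimates quoted in this section; the helper names are made up for illustration. It also computes the fixed floor of the Code Engine option, which is the bill when the application is fully idle and only the managed databases are charged.

```python
# (low, high) monthly estimates in USD, copied from the cost comparison above.
OPENSHIFT = {"cluster": (200, 400), "storage": (50, 100)}
CODE_ENGINE = {"code_engine": (0, 50), "postgresql": (150, 300), "vector_db": (100, 300)}


def total(costs):
    """Sum the low ends and the high ends of every line item."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


def fixed_floor(costs, usage_based=("code_engine",)):
    """Lowest monthly bill when usage-based items cost nothing (fully idle)."""
    return sum(lo for name, (lo, _) in costs.items() if name not in usage_based)


print(total(OPENSHIFT))          # (250, 500)
print(total(CODE_ENGINE))        # (250, 650)
print(fixed_floor(CODE_ENGINE))  # 250
```

Because the managed databases alone account for the 250 floor, scale-to-zero only reduces the Code Engine line item. Whether the move saves money depends mainly on where the database costs fall within their ranges.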

Required Code Changes

1. Configuration Management

File: backend/core/config.py

  • Add Code Engine environment detection
  • Support managed database connection strings
  • Handle SSL/TLS certificates for managed services
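
A minimal sketch of the environment detection, under stated assumptions: it assumes Code Engine injects CE_APP into applications and CE_JOB into jobs, which should be confirmed against the Code Engine documentation before relying on it. COLLECTIONDB_SSLMODE and both function names are hypothetical; only COLLECTIONDB_HOST appears in this issue.

```python
import os


def detect_platform(env=None):
    """Return 'code-engine' when Code Engine markers are present, else 'self-hosted'."""
    env = os.environ if env is None else env
    # Assumption: Code Engine sets CE_APP for applications and CE_JOB for jobs.
    if env.get("CE_APP") or env.get("CE_JOB"):
        return "code-engine"
    return "self-hosted"


def load_settings(env=None):
    """Pick database settings based on where the backend is running."""
    env = os.environ if env is None else env
    platform = detect_platform(env)
    on_code_engine = platform == "code-engine"
    return {
        "platform": platform,
        "db_host": env.get("COLLECTIONDB_HOST", "localhost"),
        # Managed PostgreSQL is always reached over TLS; local Compose may not use it.
        "db_sslmode": "require" if on_code_engine else env.get("COLLECTIONDB_SSLMODE", "disable"),
    }
```

Passing the environment as a dictionary keeps the logic testable without touching os.environ.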

2. Database Initialization

Files: backend/rag_solution/models/, database migration scripts

  • Update connection pooling for managed PostgreSQL
  • Handle SSL mode (sslmode=require)
  • Configure connection limits appropriately
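
One way to reason about connection limits: every Code Engine instance holds its own pool, so the per-instance pool size multiplied by the maximum instance count must stay under the database's connection limit. The sketch below is illustrative only. The function name and the limit of 100 are placeholders rather than a quoted IBM Cloud figure, and 10 matches the backend's --max-scale in Phase 3.

```python
def pool_size_per_instance(db_max_connections, max_instances, reserved=10):
    """Largest per-instance pool that cannot exhaust the database at full scale-out.

    `reserved` keeps some connections free for migrations and admin access.
    """
    if max_instances < 1:
        raise ValueError("max_instances must be at least 1")
    usable = db_max_connections - reserved
    if usable < max_instances:
        raise ValueError("connection limit too low for this max scale")
    return usable // max_instances


# 100 is a placeholder limit; 10 is the backend's --max-scale value.
print(pool_size_per_instance(db_max_connections=100, max_instances=10))  # 9
```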

3. Vector Database Abstraction

Files: backend/vectordbs/

  • Ensure abstraction layer works with chosen managed service
  • Update Elasticsearch provider if using managed Elasticsearch
  • Test Pinecone/Weaviate providers
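
To make the abstraction requirement concrete, here is a hypothetical minimal interface with an in-memory stand-in and a batched copy helper. None of these names come from backend/vectordbs/ and the real provider interface may differ. The point is that a migration written against such an interface does not care whether the target is Elasticsearch, Zilliz, Pinecone, or Weaviate.

```python
import math
from typing import Iterable, Protocol, Sequence


class VectorStore(Protocol):
    """Hypothetical minimal interface; the real one lives in backend/vectordbs/."""

    def upsert(self, ids: Sequence[str], vectors: Sequence[Sequence[float]]) -> None: ...

    def search(self, query: Sequence[float], top_k: int) -> list[tuple[str, float]]: ...


class InMemoryStore:
    """Stand-in backend for tests: brute-force cosine similarity."""

    def __init__(self) -> None:
        self._rows: dict[str, list[float]] = {}

    def upsert(self, ids, vectors):
        for id_, vec in zip(ids, vectors):
            self._rows[id_] = list(vec)

    def search(self, query, top_k):
        def cosine(a, b):
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

        scored = [(id_, cosine(query, vec)) for id_, vec in self._rows.items()]
        return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


def migrate(rows: Iterable[tuple[str, Sequence[float]]], target: VectorStore,
            batch_size: int = 100) -> int:
    """Copy (id, vector) rows into `target` in batches; return rows written."""
    batch, written = [], 0
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            target.upsert([r[0] for r in batch], [r[1] for r in batch])
            written += len(batch)
            batch = []
    if batch:  # flush the final partial batch
        target.upsert([r[0] for r in batch], [r[1] for r in batch])
        written += len(batch)
    return written
```

Running the same search against the old and new stores after a migration is a cheap way to spot-check that the top results still match.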

4. Health Checks

File: backend/rag_solution/router/health.py

  • Update health checks for managed services
  • Add connection validation
  • Monitor service availability
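
A possible shape for the connection validation, sketched without any framework code: each managed service gets a small check callable that raises on failure, and the results are aggregated into one report. The function name and the report layout are assumptions, not the contents of health.py.

```python
from typing import Callable, Dict


def run_health_checks(checks: Dict[str, Callable[[], None]]) -> dict:
    """Run each dependency check; a check passes if it returns without raising."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # report the failure instead of crashing the endpoint
            results[name] = f"error: {exc}"
    healthy = all(status == "ok" for status in results.values())
    return {"status": "healthy" if healthy else "degraded", "services": results}
```

In the backend, the callables would wrap something like a trivial query against PostgreSQL and a ping against the chosen vector database, and the endpoint would return this dictionary as JSON.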

5. CI/CD Pipeline

Files: .github/workflows/

  • Create new workflow for Code Engine deployment
  • Add managed service provisioning steps
  • Update secrets management

Decision Matrix

Criteria             | OpenShift      | Code Engine + Managed | Code Engine + External
---------------------|----------------|-----------------------|-----------------------
Setup Complexity     | High           | Medium                | Medium
Migration Effort     | None (current) | High                  | Medium
Monthly Cost         | $250-500       | $250-650              | $300-700
Scale to Zero        | No             | Yes                   | Yes
Vendor Lock-in       | Low            | High                  | Medium
Operational Overhead | High           | Low                   | Medium
Milvus Support       | Yes            | No (alternatives)     | Yes (Zilliz)
Development Parity   | High           | Medium                | Medium

Next Steps

  1. Decision: Choose vector database strategy (Elasticsearch, Zilliz, Pinecone, Weaviate)
  2. Prototype: Create proof-of-concept with managed PostgreSQL + chosen vector DB
  3. Cost Analysis: Get actual quotes for managed services
  4. Migration Plan: Detailed step-by-step migration guide
  5. Documentation: Update deployment docs for Code Engine

Related Issues


Labels

enhancement architecture deployment ibm-cloud serverless code-engine


Questions for Discussion

  1. Vector Database: Which option for vector search? (Elasticsearch, Zilliz, Pinecone, Weaviate)
  2. Cost Tolerance: What's the acceptable monthly cost range?
  3. Migration Timeline: How urgent is serverless deployment?
  4. Development Parity: How important is local dev matching production?
  5. Hybrid Approach: Could we use Code Engine for dev/staging and OpenShift for production?
