Optimize CI/CD: Build containers on merge to main and weekly, not on every PR #349

@manavgup

name: Optimize CI/CD Container Build Strategy
about: Reduce CI/CD time by building containers only when needed
title: 'Optimize CI/CD: Build containers on merge to main and weekly, not on every PR'
labels: ['enhancement', 'ci-cd', 'performance', 'cost-optimization']
assignees: ''

🎯 Objective

Optimize the CI/CD pipeline to cut execution time and GitHub Actions costs by building containers only when necessary, while maintaining full security coverage.

📊 Current State

Current Behavior (Every PR)

  • ✅ Linting checks (~30s)
  • ✅ Build backend container (~5-7 min)
  • ✅ Build frontend container (~3-5 min)
  • ✅ Security scans (Trivy, Dockle, Hadolint) (~2-3 min)
  • ✅ Run tests
  • Total: ~10-15 minutes per PR

Issues

  • ❌ Containers are rebuilt on every PR, although most PRs change only Python/TypeScript code and leave the Dockerfiles untouched
  • ❌ Slow feedback for developers (10-15 min wait)
  • ❌ High GitHub Actions minute consumption

🎯 Proposed Strategy

Pull Request Workflow (Fast Feedback: ~2-3 min)

Focus: Fast code validation and feedback

on:
  pull_request:
    branches: [main]

jobs:
  # ALWAYS RUN (Fast)
  lint:
    - Ruff (lint + format)
    - MyPy, Pylint, Pydocstyle (informational)
    - YAML, JSON, TOML validation

  code-security:
    - Gitleaks (secret scanning)
    - TruffleHog (secret scanning)
    - Bandit (Python security)

  unit-tests:
    - pytest unit tests (no containers)
    - Coverage report

  # SKIP (Save time)
  skip:
    - ❌ Container builds (unless Dockerfile changed)
    - ❌ Container security scans
    - ❌ Integration tests (optional)

Push to Main Workflow (Production Ready: ~10-15 min)

Focus: Build and secure production artifacts

on:
  push:
    branches: [main]

jobs:
  # Everything from PR checks +

  build-and-push:
    - Build backend container
    - Build frontend container
    - Push to GHCR
    - Tag with version

  container-security:
    - Trivy scan (vulnerabilities)
    - Dockle (best practices)
    - Hadolint (Dockerfile linting)

  integration-tests:
    - Full integration test suite
    - E2E tests
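
The "Push to GHCR" and "Tag with version" steps above could be sketched as follows. This is a minimal illustration, not the project's actual workflow: the image name, the build context, the exact Dockerfile path, and the tag scheme (short commit SHA plus `latest`) are all assumptions, since the issue does not define what "version" means here.

```yaml
# Sketch only: image name, paths, and tag scheme are assumptions.
name: Main - Build and Push (sketch)

on:
  push:
    branches: [main]

permissions:
  contents: read
  packages: write   # lets GITHUB_TOKEN push to GHCR

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: docker/setup-buildx-action@v3

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Compute image tags
        id: meta
        uses: docker/metadata-action@v5
        with:
          # Hypothetical image name; substitute the real one.
          images: ghcr.io/${{ github.repository }}/backend
          tags: |
            type=sha
            type=raw,value=latest

      - name: Build and push backend
        uses: docker/build-push-action@v6
        with:
          context: backend            # assumed build context
          file: backend/Dockerfile    # assumed path; the issue only names backend/Dockerfile*
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
```

The frontend image would need an equivalent pair of metadata and build steps. If releases are cut from git tags instead, the tag scheme would have to change accordingly.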

Weekly Security Audit (Scheduled: ~15-20 min)

Focus: Deep security analysis and dependency updates

on:
  schedule:
    - cron: '0 2 * * 1'  # 02:00 UTC every Monday (GitHub Actions evaluates cron in UTC)

jobs:
  rebuild-and-scan:
    - Rebuild all containers (fresh base images)
    - Full Trivy scan (all severities)
    - Dockle best practices check
    - Hadolint Dockerfile audit
    - SBOM generation
    - Dependency vulnerability report

  report:
    - Generate security summary
    - Create issue if vulnerabilities found
    - Update security dashboard
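
One way the rebuild, scan, and "create issue if vulnerabilities found" steps might fit together is sketched below. The local image tag `backend:audit`, the Dockerfile path, the build context, and the issue text are hypothetical. The scan here gates only on HIGH and CRITICAL findings, which is an assumption; reporting all severities as the outline asks would likely need a second, non-failing scan step.

```yaml
# Sketch only: image tag, paths, severity gate, and issue text are assumptions.
name: Weekly Security Audit (sketch)

on:
  schedule:
    - cron: '0 2 * * 1'   # Mondays 02:00 UTC
  workflow_dispatch:

permissions:
  contents: read
  issues: write   # lets GITHUB_TOKEN open an issue

jobs:
  rebuild-and-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Rebuild backend on a fresh base image
        # --pull re-fetches the base image; --no-cache ignores cached layers.
        run: docker build --pull --no-cache -t backend:audit -f backend/Dockerfile backend

      - name: Trivy scan
        id: trivy
        uses: aquasecurity/trivy-action@master   # pin to a release tag or SHA in practice
        with:
          image-ref: backend:audit
          format: table
          severity: CRITICAL,HIGH
          exit-code: '1'   # fail this step when findings exist

      - name: Open issue on findings
        # Runs only when the scan step itself failed, not on build errors.
        if: failure() && steps.trivy.outcome == 'failure'
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          gh issue create \
            --title "Weekly security audit: vulnerabilities found" \
            --body "See run ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
```

Scheduled workflows run only from the default branch, so this file has to be merged to main before the Monday trigger takes effect.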

Conditional Container Builds (Smart)

Focus: Build only when Dockerfile changes

on:
  pull_request:
    paths:
      - 'backend/Dockerfile*'
      - 'frontend/Dockerfile*'
      - 'docker-compose*.yml'

jobs:
  build-containers:
    - Build changed containers only
    - Run security scans
    - Test container startup
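
"Build changed containers only" needs a per-container change check on top of the workflow-level path filter. A minimal sketch, assuming the third-party `dorny/paths-filter` action is acceptable and that `docker-compose*.yml` changes should rebuild both images; `make build-frontend` is a hypothetical target (only `make build-backend` appears in this issue):

```yaml
# Sketch only: filter groupings and the frontend make target are assumptions.
name: PR - Container Checks (sketch)

on:
  pull_request:
    branches: [main]
    paths:
      - 'backend/Dockerfile*'
      - 'frontend/Dockerfile*'
      - 'docker-compose*.yml'

jobs:
  changes:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: read   # paths-filter reads the PR file list via the API
    outputs:
      backend: ${{ steps.filter.outputs.backend }}
      frontend: ${{ steps.filter.outputs.frontend }}
    steps:
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            backend:
              - 'backend/Dockerfile*'
              - 'docker-compose*.yml'
            frontend:
              - 'frontend/Dockerfile*'
              - 'docker-compose*.yml'

  build-backend:
    needs: changes
    if: needs.changes.outputs.backend == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build-backend

  build-frontend:
    needs: changes
    if: needs.changes.outputs.frontend == 'true'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make build-frontend   # hypothetical target
```

One caveat worth checking before adopting this: when a workflow is skipped by a `paths` filter, its checks stay in a pending state, so this workflow should probably not be marked as a required status check in branch protection.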

📋 Implementation Plan

Phase 1: Analysis (30 min)

  • Review current CI/CD workflows
  • Identify which jobs run on PR vs push
  • Measure current execution times
  • Document current Actions minute usage

Phase 2: Create New Workflows (2 hours)

  • Create .github/workflows/pr-fast-check.yml
    • Linting
    • Unit tests
    • Code-level security scans
  • Create .github/workflows/main-build-deploy.yml
    • Everything from the PR workflow
    • Container builds
    • Push to GHCR
    • Container security scans
  • Create .github/workflows/weekly-security-audit.yml
    • Scheduled rebuild
    • Deep security analysis
    • Vulnerability reporting

Phase 3: Update Existing Workflows (1 hour)

  • Keep 01-lint.yml as-is for PRs (no changes needed)
  • Update 03-build-secure.yml - Move to main-only or weekly
  • Update 02-ci.yml - Optimize test execution
  • Add workflow dispatch for manual container builds
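
The manual build trigger in the last item could look roughly like this. It is a sketch under assumptions: the input name `target` is made up for illustration, and of the three make targets only `build-backend` and `build-all` appear elsewhere in this issue; `build-frontend` is assumed to exist.

```yaml
# Sketch only: the input name and the build-frontend target are assumptions.
name: Manual Container Build (sketch)

on:
  workflow_dispatch:
    inputs:
      target:
        description: Which container(s) to build
        type: choice
        required: true
        default: all
        options:
          - backend
          - frontend
          - all

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build selected target
        env:
          TARGET: ${{ inputs.target }}   # passed via env to avoid inlining input into the script
        run: make "build-$TARGET"
```

The "Run workflow" button only appears in the Actions tab once the workflow file exists on the default branch.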

Phase 4: Add Smart Conditional Builds (1 hour)

  • Create workflow that triggers only on Dockerfile changes
  • Add path filters to existing workflows
  • Test with dummy PR

Phase 5: Testing & Documentation (1 hour)

  • Test PR workflow (should skip container builds)
  • Test push to main (should build containers)
  • Verify the weekly scheduled run (or trigger it manually via workflow_dispatch instead of waiting)
  • Update docs/development/ci-cd-security.md
  • Add workflow badges to README

🎯 Success Criteria

Performance Metrics

  • PR feedback time: < 3 minutes (down from 10-15 min)
  • Main branch build: < 15 minutes
  • GitHub Actions minutes: ~85% reduction on PRs (930s → 135s per run)

Security Coverage

  • Code security scans: 100% on every PR
  • Container scans: 100% on main + weekly
  • Vulnerability reports: Weekly automated
  • No security regression

Developer Experience

  • Faster PR feedback → more iterations
  • Clear workflow separation (PR vs Main vs Weekly)
  • Manual build trigger available
  • Security dashboard updated

📊 Expected Impact

Before (Every PR)

Linting:           30s
Backend Build:    420s (7 min)
Frontend Build:   240s (4 min)
Security Scans:   180s (3 min)
Tests:             60s (1 min)
Total:           ~930s (15.5 min)

After (Every PR)

Linting:           30s
Unit Tests:        60s
Code Security:     45s
Total:           ~135s (2.25 min)

Time Saved Per PR: ~13 minutes
Time Saved Per Day (10 PRs): ~130 minutes
Actions Minutes Saved Monthly: ~3,900 minutes (130 min/day × 30 days)
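
For reference, the figures above work out as follows. The per-PR saving is rounded down to 13 minutes before scaling, and the monthly number assumes 10 PRs on each of 30 days; counting only about 22 working days would give roughly 2,860 minutes instead.

```latex
\begin{align*}
T_{\text{before}} &= 30 + 420 + 240 + 180 + 60 = 930\ \text{s} \\
T_{\text{after}}  &= 30 + 60 + 45 = 135\ \text{s} \\
\Delta T          &= 930 - 135 = 795\ \text{s} \approx 13\ \text{min} \\
\text{reduction}  &= 795 / 930 \approx 85\% \\
\text{daily}      &= 10 \times 13\ \text{min} = 130\ \text{min} \\
\text{monthly}    &= 30 \times 130\ \text{min} = 3{,}900\ \text{min}
\end{align*}
```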

📝 Workflow Examples

PR Workflow (Fast)

name: PR - Fast Checks

on:
  pull_request:
    branches: [main]

jobs:
  lint-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Lint
        run: make lint
      - name: Unit Tests
        run: make test-unit-fast
      - name: Code Security
        run: make security-scan-code
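
The example above runs everything in one job, while the strategy outline lists lint, code-security, and unit-tests as separate jobs. A sketch of the split version, reusing the same three make targets and assuming each can run on its own without shared setup:

```yaml
# Sketch only: assumes the three make targets are independent of each other.
name: PR - Fast Checks (parallel sketch)

on:
  pull_request:
    branches: [main]

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make lint

  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make test-unit-fast

  code-security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: make security-scan-code
```

Jobs without `needs` run in parallel, so wall-clock feedback is bounded by the slowest job. The trade-off is billing: minutes are counted per job and rounded up to the next whole minute, so splitting shortens the wait but does not necessarily reduce Actions minutes.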

Main Workflow (Build & Deploy)

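name: Main - Build & Deploy

on:
  push:
    branches: [main]

permissions:
  contents: read
  packages: write  # allows GITHUB_TOKEN to push to GHCR

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4  # make targets need the repository contents
      - name: Build Backend
        run: make build-backend
      - name: Security Scan
        run: make security-scan-containers
      - name: Log in to GHCR
        run: echo "${{ secrets.GITHUB_TOKEN }}" | docker login ghcr.io -u "${{ github.actor }}" --password-stdin
      - name: Push to GHCR
        run: docker push ghcr.io/...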

Weekly Security Audit

name: Weekly Security Audit

on:
  schedule:
    - cron: '0 2 * * 1'  # 02:00 UTC every Monday
  workflow_dispatch:  # Manual trigger

jobs:
  security-audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4  # make targets need the repository contents
      - name: Rebuild Containers
        run: make build-all
      - name: Deep Security Scan
        run: make security-scan-deep
      - name: Generate Report
        run: make security-report

🔗 Related Issues

  • Part of Developer Experience improvement initiative
  • Follows containerless development paradigm
  • Complements Makefile streamlining (#TBD)

Estimated Effort: 5-6 hours
Priority: High (Developer velocity + cost savings)
Impact: Very High (All developers + monthly cost reduction)
