Phase 3: Comprehensive Testing Infrastructure #331

@manavgup

Overview

Phase 3 of the CI/CD pipeline implementation focuses on building comprehensive test coverage and infrastructure to ensure code quality and reliability.

Goals

  • ✅ Create smoke test suite for fast validation
  • ✅ Implement integration test workflow
  • ✅ Increase test coverage from the current ~60% to 70%
  • ✅ Establish test best practices and patterns

Background

Current State (After Phase 1 & 2)

Completed:

  • ✅ Phase 1: Lint matrix, duplicate build removal, disk optimization
  • ✅ Phase 2: Security scanning (Hadolint, Dockle, Trivy, Syft)

Current Testing:

  • Unit tests: Run in ci.yml with ~60% coverage
  • Integration tests: Exist but not in CI workflow
  • Smoke tests: Not implemented
  • E2E tests: Not implemented

Gaps:

  1. No smoke test suite for quick validation
  2. Integration tests not running in CI
  3. Coverage is below the 70% target (currently ~60%)
  4. No systematic test categorization

Phase 3 Tasks

1. Create Smoke Test Suite

Goal: Fast (<2 min) critical path validation

Tasks:

  • Identify critical API endpoints (health, auth, search)
  • Create tests/smoke/ directory structure
  • Implement smoke tests for:
    • Health check endpoints
    • Authentication flow
    • Basic CRUD operations
    • Database connectivity
    • Vector DB connectivity
  • Add pytest marker: @pytest.mark.smoke
  • Document smoke test patterns
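
One way the marker and directory tasks above could fit together is a small conftest.py that registers the markers and tags tests by directory. This is a minimal sketch, not the project's actual configuration: the MARKERS mapping and the marker_for_path helper are hypothetical, and it assumes pytest 7 or later (for item.path).

```python
# conftest.py (sketch): register markers and tag tests by their directory.
# MARKERS and marker_for_path are illustrative assumptions, not project code.
from pathlib import PurePath
from typing import Optional

MARKERS = {
    "smoke": "fast critical-path checks (target: under 2 minutes)",
    "integration": "tests that need Postgres, Milvus, or other services",
}


def marker_for_path(path: str) -> Optional[str]:
    """Return the marker implied by a test file's directory, if any."""
    parts = PurePath(path).parts
    for name in MARKERS:
        if name in parts:
            return name
    return None


def pytest_configure(config):
    # Registering markers avoids "unknown marker" warnings.
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")


def pytest_collection_modifyitems(config, items):
    # Tag every collected test that lives under tests/smoke/ or tests/integration/.
    for item in items:
        marker = marker_for_path(str(item.path))
        if marker is not None:
            item.add_marker(marker)
```

With this in place, `pytest -m smoke` would select only the smoke suite, and an explicit @pytest.mark.smoke decorator remains available for tests that live elsewhere.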

Success Criteria:

  • Smoke tests complete in <2 minutes
  • Cover 80% of critical paths
  • Can run without full infrastructure
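
The connectivity checks and the "run without full infrastructure" criterion pull in different directions. One possible way to reconcile them is a cheap TCP probe, so a smoke test can skip when a service is not running. The sketch below uses only the standard library; the SERVICES mapping assumes the default Postgres and Milvus ports on localhost, which may not match the project's setup.

```python
import socket
from typing import Dict, Tuple

# Assumed defaults: Postgres on 5432, Milvus on 19530, both on localhost.
SERVICES: Dict[str, Tuple[str, int]] = {
    "postgres": ("localhost", 5432),
    "milvus": ("localhost", 19530),
}


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def probe_services(services: Dict[str, Tuple[str, int]] = SERVICES,
                   timeout: float = 0.5) -> Dict[str, bool]:
    """Map each service name to whether its port is currently reachable."""
    return {
        name: can_connect(host, port, timeout)
        for name, (host, port) in services.items()
    }
```

A smoke test could call probe_services() once and skip the database checks for any service reported as unreachable, which keeps the suite fast on a laptop while still exercising connectivity in CI.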

2. Create Integration Test Workflow (04-integration.yml)

Goal: Automated integration testing in CI

Workflow Structure:

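name: Integration Tests

on:
  pull_request:
    branches: [main]
    paths:
      - 'backend/**'
      - 'tests/integration/**'
  push:
    branches: [main]

jobs:
  smoke-tests:
    name: 🚀 Smoke Tests (Fast)
    runs-on: ubuntu-latest
    steps:
      # Run smoke tests without full infrastructure
      # Should complete in under 2 minutes
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          # Assumption: replace with the project's Python version
          python-version: '3.12'
      # Assumption: placeholder install command; use the project's own
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run smoke tests
        run: pytest tests/smoke -m smoke

  integration-tests:
    name: 🔗 Integration Tests (Full Stack)
    runs-on: ubuntu-latest
    needs: smoke-tests
    services:
      postgres:
        image: postgres:15
        env:
          # The postgres image will not start without a password
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
      milvus:
        # Consider pinning a version instead of latest for reproducible runs.
        # Milvus standalone usually needs a start command plus etcd and MinIO,
        # so it may have to be started with Docker Compose in a step instead.
        image: milvusdb/milvus:latest
    steps:
      # Run full integration tests with all services
      # Should complete in ~5-7 minutes
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install dependencies
        run: pip install -r requirements.txt
      - name: Run integration tests
        run: pytest tests/integration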

Tasks:

  • Create .github/workflows/04-integration.yml
  • Configure service containers (Postgres, Milvus, etc.)
  • Set up test database migrations
  • Configure test environment variables
  • Add integration test reporting
  • Document test infrastructure setup
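
For the "configure test environment variables" task, one option is to read all service settings in a single place so local runs and CI differ only in their environment. The sketch below is an assumption-heavy illustration: the variable names (TEST_DATABASE_URL, TEST_MILVUS_HOST, TEST_MILVUS_PORT), the defaults, and the IntegrationSettings class are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class IntegrationSettings:
    """Connection settings for the integration suite (hypothetical names)."""
    database_url: str
    milvus_host: str
    milvus_port: int


def load_settings(env: Mapping[str, str] = os.environ) -> IntegrationSettings:
    """Build settings from environment variables, falling back to local defaults."""
    return IntegrationSettings(
        database_url=env.get(
            "TEST_DATABASE_URL",
            "postgresql://postgres:postgres@localhost:5432/postgres",
        ),
        milvus_host=env.get("TEST_MILVUS_HOST", "localhost"),
        # Environment values are strings, so the port is converted explicitly.
        milvus_port=int(env.get("TEST_MILVUS_PORT", "19530")),
    )
```

Passing the environment as a parameter makes the loader itself easy to unit test, for example `load_settings({"TEST_MILVUS_PORT": "29530"})`.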

3. Enhance Integration Tests

Current State:

  • Integration tests exist in tests/integration/
  • Not run as part of the CI workflow
  • Some tests may be flaky

Tasks:

  • Audit existing integration tests
  • Fix flaky tests (identify and stabilize)
  • Add missing integration tests:
    • Pipeline execution flow
    • Document ingestion pipeline
    • Search with different vector DBs
    • Multi-user scenarios
    • Configuration management
  • Add test data fixtures
  • Implement test isolation (cleanup between tests)
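
Test isolation is often implemented by running each test inside a transaction that is always rolled back. The sketch below shows the idea with the standard library's sqlite3 as a stand-in; the project uses PostgreSQL, so the real fixture would wrap its own database session instead. The helper names are hypothetical.

```python
import sqlite3
from contextlib import contextmanager


def make_test_db() -> sqlite3.Connection:
    """Create an in-memory database with a sample table (stand-in schema)."""
    # isolation_level=None disables implicit transactions so BEGIN/ROLLBACK
    # are fully under the helper's control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE users (name TEXT)")
    return conn


@contextmanager
def isolated(conn: sqlite3.Connection):
    """Run a block inside a transaction and discard its writes afterwards."""
    conn.execute("BEGIN")
    try:
        yield conn
    finally:
        # Roll back even if the test body raised, so the next test starts clean.
        if conn.in_transaction:
            conn.execute("ROLLBACK")
```

In a pytest suite, the same pattern would typically live in a fixture in tests/conftest.py that yields the connection and rolls back on teardown.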

4. Increase Test Coverage to 70%

Current Coverage: ~60% (unit tests only)
Target Coverage: 70% (unit + integration)

Focus Areas:

  1. Services Layer (highest priority)

    • search_service.py
    • pipeline_service.py
    • document_service.py
    • user_service.py
  2. Repository Layer

    • Database operations
    • Vector DB operations
  3. Router Layer

    • API endpoint handlers
    • Request validation
    • Error handling

Tasks:

  • Generate coverage report: make coverage-html
  • Identify untested modules
  • Write tests for uncovered code
  • Add coverage badges to README
  • Set up coverage reporting in CI
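
To help with identifying untested modules, a short script can rank files by coverage. This sketch assumes the JSON report layout produced by coverage.py's `coverage json` command (a top-level "totals" entry and a "files" mapping, each with "percent_covered"); the function names are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def files_below(report: Dict[str, Any], threshold: float) -> List[Tuple[str, float]]:
    """List (path, percent) pairs for files under the threshold, lowest first."""
    rows = [
        (path, data["summary"]["percent_covered"])
        for path, data in report.get("files", {}).items()
    ]
    return sorted((row for row in rows if row[1] < threshold), key=lambda row: row[1])


def meets_target(report: Dict[str, Any], target: float = 70.0) -> bool:
    """Check the overall percentage against the Phase 3 target."""
    return report["totals"]["percent_covered"] >= target
```

The report could be loaded with `json.load` from the generated coverage.json. For a hard CI gate, pytest-cov's `--cov-fail-under=70` option fails the run when total coverage drops below the target.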

5. Test Best Practices & Documentation

Tasks:

  • Create docs/testing/best-practices.md
  • Document test patterns:
    • Unit test structure
    • Integration test setup
    • Mocking strategies
    • Fixture usage
  • Create test templates
  • Add testing guidelines to CONTRIBUTING.md
  • Document how to run tests locally

Implementation Plan

Week 1: Smoke Tests

  • Day 1-2: Design smoke test suite
  • Day 3-4: Implement smoke tests
  • Day 5: Create smoke test workflow

Week 2: Integration Workflow

  • Day 1-2: Create 04-integration.yml
  • Day 3-4: Configure service containers
  • Day 5: Test and debug workflow

Week 3: Coverage & Polish

  • Day 1-3: Write missing tests (target 70%)
  • Day 4: Documentation
  • Day 5: Review and refinement

Success Metrics

Metric | Current | Target | Measurement
Test Coverage | ~60% | 70% | pytest --cov
Smoke Test Time | N/A | <2 min | CI workflow time
Integration Test Time | N/A | <7 min | CI workflow time
Test Reliability | ~85% | >95% | Pass rate over 10 runs
Flaky Tests | Unknown | 0 | Manual tracking
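
The reliability and flaky-test metrics could be computed from recorded outcomes rather than tracked by hand. The sketch below assumes a hypothetical history mapping of test name to a list of pass/fail results, one entry per run of the same commit, and treats "pass rate" as the share of runs in which every test passed.

```python
from typing import Dict, List


def suite_pass_rate(history: Dict[str, List[bool]]) -> float:
    """Fraction of runs in which every test passed."""
    runs = list(zip(*history.values()))  # one tuple of outcomes per run
    if not runs:
        return 0.0
    return sum(all(run) for run in runs) / len(runs)


def flaky_tests(history: Dict[str, List[bool]]) -> List[str]:
    """Tests that both passed and failed across runs of the same code."""
    return sorted(
        name for name, outcomes in history.items()
        if True in outcomes and False in outcomes
    )
```

A test that fails on every run is broken rather than flaky, so it lowers the pass rate without appearing in the flaky list.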

Testing Infrastructure

Service Dependencies

Required Services:

  • PostgreSQL (metadata)
  • Milvus (vector storage)
  • MinIO (optional - object storage)
  • MLflow (optional - model tracking)

Strategy:

  • Use Docker Compose for local testing
  • Use GitHub Actions service containers for CI
  • Mock external APIs (WatsonX, OpenAI, etc.)
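
Mocking external APIs can be done with the standard library's unittest.mock, provided the service receives its client as a dependency. The sketch below is illustrative only: SearchService here is a hypothetical stand-in (not the project's search_service.py), and the embed method is an assumed client interface.

```python
from unittest.mock import Mock


class SearchService:
    """Hypothetical service that delegates embedding to an external client."""

    def __init__(self, embedder):
        self.embedder = embedder

    def embed_query(self, text: str) -> list:
        if not text.strip():
            raise ValueError("query must not be empty")
        return self.embedder.embed(text)


def make_fake_embedder(vector=(0.1, 0.2, 0.3)) -> Mock:
    """Build a mock client that returns a fixed vector instead of calling out."""
    # spec=["embed"] makes any other attribute access raise AttributeError,
    # which catches typos in the code under test.
    fake = Mock(spec=["embed"])
    fake.embed.return_value = list(vector)
    return fake
```

A test can then assert on both the result and the interaction, for example `fake.embed.assert_called_once_with("hello")`, without network access or API keys.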

Test Data

Requirements:

  • Sample documents (PDF, DOCX, TXT)
  • Test embeddings
  • Mock user data
  • Mock pipeline configurations

Location:

  • tests/fixtures/ - Test data files
  • tests/conftest.py - Shared fixtures
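
A small helper can keep access to the files under tests/fixtures/ consistent across tests. This is a sketch with a hypothetical fixture_path function; the root directory is passed in explicitly so the helper does not assume where conftest.py lives.

```python
from pathlib import Path


def fixture_path(name: str, root: Path) -> Path:
    """Resolve a test data file under the fixtures root, failing loudly if absent."""
    root = root.resolve()
    path = (root / name).resolve()
    # Reject names such as "../secrets.txt" that escape the fixtures directory.
    if root not in path.parents:
        raise ValueError(f"{name!r} is outside the fixtures directory")
    if not path.is_file():
        raise FileNotFoundError(f"missing test fixture: {path}")
    return path
```

In tests/conftest.py the root would typically be `Path(__file__).parent / "fixtures"`, and a shared fixture could expose the helper to every test.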

Related Issues & PRs


Dependencies

Blocked By:

  • None (can start immediately)

Blocks:

  • Phase 4 (Advanced features)
  • E2E test implementation

Testing Checklist

  • Smoke tests created and passing
  • 04-integration.yml workflow created
  • Integration tests running in CI
  • Coverage increased to 70%
  • Test documentation written
  • Flaky tests identified and fixed
  • Test best practices documented
  • Coverage reports in CI

Resources


Notes

  • Smoke tests should be lightweight and fast
  • Integration tests need full infrastructure
  • Focus on critical paths first
  • Document test patterns for consistency
  • Consider test execution time in design

Priority: Medium
Estimated Time: 2-3 weeks


Labels: ci-cd, enhancement, testing
