@manavgup
Owner

Summary

Implements a comprehensive CLI tool for testing and diagnosing RAG search quality as specified in #131.

Changes

  • ✅ Created CLI module with three main commands for search testing
  • ✅ Added quality metrics calculation and reporting utilities
  • ✅ Integrated with Makefile for easy execution
  • ✅ Created comprehensive documentation
  • ✅ Added unit tests for CLI functionality

Features

🔧 CLI Commands

  • search test - Test single queries with detailed metrics and verbose output
  • search batch-test - Run quality tests on multiple queries with aggregated reporting
  • search test-components - Test individual RAG pipeline components for debugging

📊 Quality Metrics

  • Answer completeness scoring (0-100%)
  • Keyword coverage analysis
  • Retrieval precision and similarity scores
  • Component-level performance timing
  • Batch quality scoring with statistical summaries
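As an illustration of how two of these metrics could be computed, here is a minimal sketch. The function names `keyword_coverage` and `retrieval_precision` are hypothetical and are not taken from the code in `backend/cli/`; substring matching and set-based relevance are simplifying assumptions.

```python
def keyword_coverage(answer: str, expected_keywords: list[str]) -> float:
    """Fraction of expected keywords that appear in the answer (0.0 to 1.0).

    Assumption: case-insensitive substring matching; the actual CLI may
    tokenize or stem differently.
    """
    if not expected_keywords:
        return 0.0
    text = answer.lower()
    hits = sum(1 for kw in expected_keywords if kw.lower() in text)
    return hits / len(expected_keywords)


def retrieval_precision(retrieved_ids: list[str], relevant_ids: set[str]) -> float:
    """Fraction of retrieved documents that are in the known-relevant set."""
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids if doc_id in relevant_ids)
    return hits / len(retrieved_ids)


if __name__ == "__main__":
    answer = "Machine learning uses data to train models"
    print(keyword_coverage(answer, ["data", "models", "gradient", "training"]))
    print(retrieval_precision(["a", "b", "c", "d"], {"a", "c"}))
```

Multiplying either value by 100 gives a percentage comparable to the 0-100% completeness score listed above.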

🎯 Additional Features

  • Rich console output with tables and progress indicators
  • JSON export for further analysis
  • Component isolation for debugging specific pipeline stages
  • Configurable query rewriting strategies (simple, hypothetical)
  • Comprehensive error handling and reporting
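The batch summary and JSON export could look roughly like the sketch below. The result shape (`query` and `score` keys) and the names `summarize_batch` and `build_report` are assumptions made for illustration, not the module's actual interface.

```python
import json
from statistics import mean, median


def summarize_batch(results: list[dict]) -> dict:
    """Aggregate per-query scores into a small statistical summary.

    Assumption: each result is a dict with a numeric "score" key.
    """
    scores = [r["score"] for r in results]
    if not scores:
        return {"count": 0}
    return {
        "count": len(scores),
        "mean": mean(scores),
        "median": median(scores),
        "min": min(scores),
        "max": max(scores),
    }


def build_report(results: list[dict]) -> str:
    """Serialize the summary and the raw results as a JSON string."""
    return json.dumps(
        {"summary": summarize_batch(results), "results": results},
        indent=2,
    )


if __name__ == "__main__":
    sample = [
        {"query": "What is machine learning?", "score": 0.5},
        {"query": "What is RAG?", "score": 1.0},
    ]
    print(build_report(sample))
```

Returning a string rather than writing a file keeps the sketch easy to test; a caller can write it to whatever path the export option specifies.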

Files Added/Modified

  • backend/cli/ - Core CLI implementation modules
  • backend/search_cli.py - Entry point script
  • backend/test_data/search_queries.json - Sample test queries
  • backend/tests/cli/ - Unit tests for CLI functionality
  • Makefile - Added search-test, search-batch, search-components targets

Testing

✅ CLI help commands validated
✅ Utility functions tested (metrics calculation, quality evaluation)
✅ Makefile integration verified
✅ Lazy loading prevents configuration errors for help commands
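
The lazy-loading idea can be sketched as follows: configuration is loaded inside the command handler rather than at import time, so building the parser and printing help never touches it. This sketch assumes `argparse`; the PR does not state which CLI framework is used, and `load_settings` and `run_test` are hypothetical stand-ins.

```python
import argparse


def load_settings() -> dict:
    """Hypothetical stand-in for a configuration load that can fail
    (for example, when required environment variables are missing)."""
    raise RuntimeError("configuration missing")


def run_test(args: argparse.Namespace) -> None:
    # Settings are loaded only when the command actually runs,
    # never while the parser is built or help text is rendered.
    settings = load_settings()
    print(f"Testing query {args.query!r} with {settings}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="search")
    subparsers = parser.add_subparsers(dest="command")
    test = subparsers.add_parser("test", help="Test a single query")
    test.add_argument("query")
    test.set_defaults(handler=run_test)
    return parser


if __name__ == "__main__":
    # Rendering help succeeds even though load_settings() would raise.
    print(build_parser().format_help())
```

With this layout, a broken configuration surfaces only when a command is executed, not when a user asks for help.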

Usage Examples

# Test single query
make search-test QUERY="What is machine learning?" COLLECTION_ID="uuid" USER_ID="uuid"

# Run batch tests
make search-batch COLLECTION_ID="uuid" USER_ID="uuid"

# Test components
make search-components QUERY="test query" COLLECTION_ID="uuid"

Documentation

Comprehensive documentation added in backend/cli/README.md with:

  • Installation instructions
  • Usage examples for all commands
  • Output format descriptions
  • Quality metrics explanations
  • Troubleshooting guide

Closes #131

Completes Phase 3 of #139 with comprehensive dependency management, documentation
generation, and UV evaluation capabilities.

## New Features Added ✅

### Dependency Management
- Add `check-deps` target - identifies outdated dependencies
- Add `check-deps-tree` target - visualizes dependency hierarchy
- Add `export-requirements` target - exports to requirements.txt files
- Testing found 83 outdated packages, including security-critical updates

### Documentation Generation
- Add `docs-generate` target - creates API documentation with pydoc
- Add `docs-serve` target - serves docs locally on port 8080
- Provides quick access to module documentation

### UV Integration (Experimental)
- Add `uv-install` target - installs UV package manager
- Add `uv-sync` target - syncs dependencies with UV
- Add `uv-export` target - exports requirements with UV
- Provides migration path for future UV adoption

## Developer Workflow

New commands available:
- `make check-deps` - Check for outdated dependencies
- `make check-deps-tree` - View dependency tree
- `make export-requirements` - Export requirements files
- `make docs-generate` - Generate API documentation
- `make docs-serve` - Serve docs locally
- `make uv-install` - Install UV (experimental)

## UV vs Poetry Analysis

Evaluated UV as Poetry alternative:
- **Pros**: 10-100x faster installs (per UV's own benchmarks against pip), unified tooling, simpler config
- **Cons**: Migration effort, pyproject.toml restructuring needed
- **Recommendation**: Keep Poetry for stability, UV available for experimentation

## Phase 3 Complete - Issue Can Be Closed

**✅ All Phase 3 Requirements Met:**
- Dependency checking implemented
- Requirements export functional
- Documentation generation working
- UV support added as experimental option

**📊 Real Impact:**
- Identified 83 outdated dependencies
- Critical security updates available (urllib3, cryptography, etc.)
- Documentation generation successful
- UV integration path established

Addresses and completes Phase 3 of #139.

Implement a comprehensive CLI tool for testing and diagnosing RAG search quality with:

🔧 CLI Commands:
- search test: Test single queries with detailed metrics
- search batch-test: Run quality tests on multiple queries
- search test-components: Test individual RAG pipeline components

📊 Quality Metrics:
- Answer completeness and keyword coverage
- Retrieval precision and similarity scores
- Component-level performance timing
- Batch quality scoring and reporting

🎯 Features:
- Rich console output with tables and progress indicators
- JSON export for further analysis
- Component isolation for debugging
- Configurable query rewriting strategies
- Makefile integration for easy execution

📁 Files Added:
- backend/cli/: Core CLI implementation modules
- backend/search_cli.py: Entry point script
- backend/test_data/search_queries.json: Sample test queries
- backend/tests/cli/: Unit tests for CLI functionality
- Updated Makefile with search-test, search-batch, search-components targets

✅ Addresses issue #131 requirements for systematic RAG search quality testing
@manavgup manavgup merged commit d1fb57e into main Aug 30, 2025
4 checks passed
@manavgup manavgup deleted the feat/cli-search-quality-testing branch September 12, 2025 18:17

Development

Successfully merging this pull request may close these issues.

🔍 Create CLI-based Search Quality Testing Framework
