Organize Dev/Test Scripts and Create Documentation #550

@manavgup

Type: Documentation + Cleanup
Priority: P3 - Low (Housekeeping)
Effort: 1-2 hours
Labels: documentation, cleanup, good-first-issue


📋 Overview

Goal: Organize development test scripts and create comprehensive documentation for the backend/dev_tests/ directory.

Context: The repository currently has at least six untracked scripts used for manual testing and debugging (the six listed below: five in the project root, one in backend/). These scripts are valuable for development but should be:

  1. Moved to a dedicated directory (backend/dev_tests/)
  2. Documented under docs/
  3. Made discoverable for other developers

🎯 Objectives

1. Organize Test Scripts

  • Create backend/dev_tests/ directory structure
  • Move the 6 development scripts (5 from the project root, 1 from backend/) to backend/dev_tests/
  • Add README.md in backend/dev_tests/ explaining purpose
  • Add .gitignore entry for backend/dev_tests/ output files

2. Create Documentation

  • Create docs/development/dev-test-scripts.md documenting all scripts
  • Include usage examples, prerequisites, and expected output
  • Link from main development documentation

3. Archive Project Roadmap

  • Move MASTER_ISSUES_ROADMAP.md → docs/planning/master-roadmap.md
  • Keep historical record of performance improvements
  • Update any references to new location

📦 Test Scripts to Organize

Scripts Currently in Root

test_docling_config.py           # Docling configuration testing
test_embedding_direct.py         # Direct embedding API tests
test_embedding_retrieval.py      # Embedding retrieval validation
test_query_enhancement_demo.py   # Query enhancement demonstration
test_search_no_cot.py            # Search without Chain of Thought

Scripts Currently in backend/

backend/debug_rag_failure.py    # RAG pipeline debugging

Proposed Structure

backend/dev_tests/
├── README.md                    # Overview of dev test scripts
├── test_docling_config.py
├── test_embedding_direct.py
├── test_embedding_retrieval.py
├── test_query_enhancement_demo.py
├── test_search_no_cot.py
└── debug_rag_failure.py

📝 Documentation to Create

File: docs/development/dev-test-scripts.md

Content Structure:

# Development Test Scripts

## Overview
Development test scripts for manual testing, debugging, and validation.

## Location
`backend/dev_tests/`

## Scripts

### test_docling_config.py
**Purpose**: Test Docling document processing configuration
**Usage**: 
```bash
cd backend
poetry run python dev_tests/test_docling_config.py
```

Prerequisites: Docling dependencies installed
Output: Configuration validation results

test_embedding_direct.py

Purpose: Test embedding service directly without full pipeline
Usage:

cd backend
poetry run python dev_tests/test_embedding_direct.py

Prerequisites: Embedding service running
Output: Embedding vectors and similarity scores

test_embedding_retrieval.py

Purpose: Test embedding retrieval from vector store
Usage:

cd backend
poetry run python dev_tests/test_embedding_retrieval.py

Prerequisites:

  • Vector store (Milvus) running
  • Collection with embeddings

Output: Retrieved documents with scores

test_query_enhancement_demo.py

Purpose: Demonstrate query enhancement features
Usage:

cd backend
poetry run python dev_tests/test_query_enhancement_demo.py

Prerequisites: None (self-contained demo)
Output: Enhanced queries with entity extraction

test_search_no_cot.py

Purpose: Test search without Chain of Thought reasoning
Usage:

cd backend
poetry run python dev_tests/test_search_no_cot.py

Prerequisites:

  • All services running (make local-dev-infra)
  • Collection with documents

Output: Search results and timing

debug_rag_failure.py

Purpose: Debug RAG pipeline failures with detailed logging
Usage:

cd backend
poetry run python dev_tests/debug_rag_failure.py <collection_id> <query>

Prerequisites: RAG services running
Output: Detailed pipeline execution trace

Best Practices

  1. Run from backend/ directory: All scripts assume backend/ as working directory
  2. Check prerequisites: Ensure required services are running
  3. Use for debugging: These scripts are for development, not production
  4. Add new scripts: Follow naming convention test_<component>_<purpose>.py
  5. Document changes: Update this file when adding/modifying scripts

Related Documentation


---

## 📂 Archive Project Roadmap

### File: `docs/planning/master-roadmap.md`

Move `MASTER_ISSUES_ROADMAP.md` to `docs/planning/` directory:

**Why**:
- Historical record of performance journey (75-100s → 8-22s)
- Documents decision points and architecture choices
- Reference for future work (Phase 2-4 plans)
- Too valuable to delete, but clutters root

**Structure**:

docs/planning/
├── master-roadmap.md    # Moved from root
└── README.md            # Index of planning docs


---

## 🛠️ Implementation Steps

### Step 1: Create Directory Structure
```bash
# Create directories
mkdir -p backend/dev_tests
mkdir -p docs/development
mkdir -p docs/planning

# Create README files
touch backend/dev_tests/README.md
touch docs/planning/README.md
```

Step 2: Move Test Scripts

# Move test scripts
mv test_docling_config.py backend/dev_tests/
mv test_embedding_direct.py backend/dev_tests/
mv test_embedding_retrieval.py backend/dev_tests/
mv test_query_enhancement_demo.py backend/dev_tests/
mv test_search_no_cot.py backend/dev_tests/
mv backend/debug_rag_failure.py backend/dev_tests/
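
The same move can be made safe to re-run. The sketch below is one possible approach, not the required one: `move_dev_scripts` is a hypothetical helper name, and it uses plain `mv` on the assumption that the scripts are still untracked, as stated in the Context above. It skips files that are missing or already present in the destination instead of failing or overwriting.

```bash
# Sketch: move the six scripts listed in this issue into backend/dev_tests/.
# move_dev_scripts is a hypothetical helper, not part of the repository.
move_dev_scripts() {
  local dest="backend/dev_tests"
  local script
  mkdir -p "$dest"
  for script in \
    test_docling_config.py \
    test_embedding_direct.py \
    test_embedding_retrieval.py \
    test_query_enhancement_demo.py \
    test_search_no_cot.py \
    backend/debug_rag_failure.py
  do
    if [ ! -f "$script" ]; then
      # Nothing to move (already moved, or never existed in this checkout)
      echo "skip (not found): $script"
    elif [ -e "$dest/$(basename "$script")" ]; then
      # Never overwrite a file that is already in the destination
      echo "skip (already in $dest): $script"
    else
      mv "$script" "$dest/" && echo "moved: $script"
    fi
  done
}
```

Call `move_dev_scripts` from the repository root. Running it a second time only prints skip messages.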

Step 3: Create Documentation

# Create dev test scripts documentation
touch docs/development/dev-test-scripts.md
# (Fill with content from above)

Step 4: Archive Roadmap

# Move roadmap to docs
mv MASTER_ISSUES_ROADMAP.md docs/planning/master-roadmap.md

# Update references (if any)
# - Search for "MASTER_ISSUES_ROADMAP.md" in codebase
# - Update links to "docs/planning/master-roadmap.md"
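
The reference search mentioned in the comments could look like the sketch below. `find_roadmap_refs` is a hypothetical helper name. It uses `grep -F` so that the dots in the filename are matched literally, and it skips the `.git` directory.

```bash
# Sketch: list every file that still mentions the old roadmap filename.
# find_roadmap_refs is a hypothetical helper, not part of the repository.
find_roadmap_refs() {
  # $1: directory to search (defaults to the current directory)
  # -r recurse, -n show line numbers, -F match the name as a literal string
  grep -rn --exclude-dir=.git -F "MASTER_ISSUES_ROADMAP.md" "${1:-.}"
}
```

Each hit is printed as `path:line:text`. The function returns a non-zero status when nothing matches, so `find_roadmap_refs || echo "no remaining references"` works as a quick check after updating the links.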

Step 5: Update .gitignore

# Add to .gitignore
echo "" >> .gitignore
echo "# Dev test outputs" >> .gitignore
echo "backend/dev_tests/*.log" >> .gitignore
echo "backend/dev_tests/*.json" >> .gitignore
echo "backend/dev_tests/*.txt" >> .gitignore
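
The `echo >>` lines above append duplicates if they are run twice. A minimal idempotent variant, assuming a hypothetical helper named `add_ignore_pattern`, appends a pattern only when that exact line is not already present:

```bash
# Sketch: append a pattern to an ignore file only if it is not already listed.
# add_ignore_pattern is a hypothetical helper, not part of the repository.
add_ignore_pattern() {
  # $1: pattern to add, $2: ignore file (defaults to .gitignore)
  local pattern="$1"
  local file="${2:-.gitignore}"
  touch "$file"
  # If the file is non-empty and does not end with a newline, add one first
  if [ -s "$file" ] && [ -n "$(tail -c1 "$file")" ]; then
    echo >> "$file"
  fi
  # -x matches whole lines only, -F treats the pattern as a literal string
  grep -qxF -- "$pattern" "$file" || printf '%s\n' "$pattern" >> "$file"
}
```

For example, `add_ignore_pattern "backend/dev_tests/*.log"` can be repeated for the `*.json` and `*.txt` patterns and re-run without growing the file.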

Step 6: Create README files

backend/dev_tests/README.md:

# Development Test Scripts

Manual testing and debugging scripts for RAG Modulo development.

## Usage
```bash
cd backend
poetry run python dev_tests/<script_name>.py
```

Documentation

See the Dev Test Scripts Guide (docs/development/dev-test-scripts.md) for detailed usage.

Scripts

  • test_docling_config.py - Docling configuration testing
  • test_embedding_direct.py - Direct embedding tests
  • test_embedding_retrieval.py - Embedding retrieval validation
  • test_query_enhancement_demo.py - Query enhancement demo
  • test_search_no_cot.py - Search without CoT
  • debug_rag_failure.py - RAG pipeline debugging

**docs/planning/README.md**:
```markdown
# Planning Documents

Historical planning documents and roadmaps for RAG Modulo.

## Current Plans
- See GitHub Issues for active epics and user stories

## Historical Plans
- [Master Roadmap](master-roadmap.md) - Performance journey and architecture decisions (Oct 2025)
```

✅ Acceptance Criteria

Must Complete:

  • All test scripts moved to backend/dev_tests/
  • docs/development/dev-test-scripts.md created with full documentation
  • backend/dev_tests/README.md created
  • MASTER_ISSUES_ROADMAP.md moved to docs/planning/master-roadmap.md
  • docs/planning/README.md created
  • .gitignore updated for dev test outputs
  • All scripts still runnable from new location
  • Documentation links updated

Nice to Have:

  • Add usage examples to each script (docstrings)
  • Create dev_tests/__init__.py for import support
  • Link from main docs/index.md
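
The first Must Complete item can be checked mechanically. The sketch below uses a hypothetical helper, `verify_dev_tests_layout`, and only confirms that the six files exist in the new location. It does not confirm that each script still runs, which still needs a manual run with its prerequisites in place.

```bash
# Sketch: report any of the six scripts that is not yet in backend/dev_tests/.
# verify_dev_tests_layout is a hypothetical helper, not part of the repository.
verify_dev_tests_layout() {
  # $1: repository root (defaults to the current directory)
  local root="${1:-.}"
  local missing=0
  local name
  for name in \
    test_docling_config.py \
    test_embedding_direct.py \
    test_embedding_retrieval.py \
    test_query_enhancement_demo.py \
    test_search_no_cot.py \
    debug_rag_failure.py
  do
    if [ ! -f "$root/backend/dev_tests/$name" ]; then
      echo "missing: backend/dev_tests/$name"
      missing=1
    fi
  done
  # Non-zero status if at least one file is missing
  return "$missing"
}
```

Run from the repository root, `verify_dev_tests_layout && echo "layout OK"` prints the confirmation only when all six files are in place.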

📚 Related Files

Will Be Created:

  • backend/dev_tests/README.md
  • docs/development/dev-test-scripts.md
  • docs/planning/master-roadmap.md
  • docs/planning/README.md

Will Be Moved:

  • test_*.py (5 files) and backend/debug_rag_failure.py → backend/dev_tests/
  • MASTER_ISSUES_ROADMAP.md → docs/planning/master-roadmap.md

Will Be Updated:

  • .gitignore (add dev_tests outputs)

🎯 Benefits

  1. Cleaner Root Directory: No more scattered test scripts
  2. Better Discoverability: New developers can find test utilities
  3. Documentation: Clear usage examples and prerequisites
  4. Preserved History: MASTER_ISSUES_ROADMAP.md safely archived
  5. Maintainability: Centralized location for dev tools

🤔 Questions / Decisions

  1. Should dev_tests be tracked in git?
    • Recommendation: Yes, track scripts but ignore outputs
  2. Should we create more dev test scripts?
    • Recommendation: Yes, as needed for specific debugging
  3. Should MASTER_ISSUES_ROADMAP.md stay in root?
    • Recommendation: No, move to docs/planning/ for organization

👥 Labels

documentation cleanup good-first-issue dev-experience
