🚨 CRITICAL: Test Isolation Violations - Unit Tests Depend on Real Environment Variables

# 🚨 CRITICAL: Test Isolation Violations - Unit Tests Depend on Real Environment Variables

## Problem Summary
Our atomic/unit tests (marked with `@pytest.mark.atomic`) are violating test isolation principles by depending on real environment variables at module import time. This causes CI failures and prevents proper test isolation.

## Root Cause Analysis

### Current Issues
1. **Module-level environment access**: Tests access `settings.wx_api_key`, `settings.jwt_secret_key` etc. when the module loads
2. **Global settings object**: Many tests import and use the global `settings` object instead of mocking
3. **API calls at import time**: Some tests call `get_embeddings()` at module level, making real API calls
4. **Conditional test skips**: Tests use `@pytest.mark.skipif(not settings.wx_api_key, ...)` at module level

### Files Affected (Confirmed)
- `backend/tests/unit/test_core_config.py` - Checks real env vars exist
- `backend/tests/data_ingestion/test_ingestion.py` - Calls `get_embeddings()` at import 
- `backend/tests/service/test_question_service_providers.py` - Module-level settings access
- `backend/tests/service/test_search_service.py` - Uses `settings.milvus_host` in tests
- `backend/tests/services/test_search_service.py` - Uses `settings.embedding_model`

## How to Mass Identify the Problem

### Search Commands
```bash
# Find atomic tests that import settings
grep -l "pytest.mark.atomic" backend/tests/**/*.py | xargs grep -l "from core.config import settings"

# Find module-level settings access
for file in $(find backend/tests -name "*.py"); do
  if grep -q "@pytest.mark.atomic" "$file" && grep -q "from core.config import settings" "$file"; then
    echo "=== $file ===" 
    grep -n "settings\." "$file" | head -3
  fi
done

# Find conditional skips based on real settings
grep -r "@pytest.mark.skipif.*settings\." backend/tests/ --include="*.py"

# Find API calls at module level
grep -r "get_embeddings\|get_datastore" backend/tests/ --include="*.py" | grep -v "def test"
```

### Expected Behavior
**Unit/Atomic tests should:**
- ✅ Run without any real environment variables
- ✅ Use mocking/dependency injection for all external dependencies  
- ✅ Be completely isolated and deterministic
- ✅ Execute instantly without network calls or real services

## How to Fix

### Pattern 1: Mock settings at test level
```python
# BEFORE (bad)
from core.config import settings

@pytest.mark.atomic
def test_something():
    assert settings.jwt_secret_key is not None  # Requires real env var!

# AFTER (good) 
import os
from unittest.mock import patch
from core.config import Settings

@pytest.mark.atomic
@patch.dict(os.environ, {'JWT_SECRET_KEY': 'test-secret'})
def test_something():
    test_settings = Settings()  # Fresh instance with mocked env
    assert test_settings.jwt_secret_key == 'test-secret'
```

### Pattern 2: Dependency injection
```python
# BEFORE (bad)
from core.config import settings

@pytest.mark.atomic  
def test_service():
    service = SearchService()  # Uses global settings internally
    
# AFTER (good)
@pytest.mark.atomic
def test_service():
    mock_settings = Mock()
    mock_settings.milvus_host = 'test-host'
    service = SearchService(settings=mock_settings)  # Inject mock
```

### Pattern 3: Move API calls into test functions
```python
# BEFORE (bad - at module level)
sample_document = Document(
    vectors=get_embeddings(text),  # Real API call at import!
)

# AFTER (good - in test function)
@pytest.mark.atomic
def test_document():
    with patch('vectordbs.utils.watsonx.get_embeddings') as mock_embeddings:
        mock_embeddings.return_value = [0.1, 0.2, 0.3]
        sample_document = Document(vectors=get_embeddings(text))
```

## How to Validate the Fix

### 1. Local Testing
```bash
# Unit tests should run without .env file
cd backend
rm .env 2>/dev/null || true
poetry run pytest tests/ -m atomic -v

# Should pass without any real environment variables
```

### 2. CI Testing
```bash
# CI lint-and-unit job should not need environment variables
# Remove env vars from CI and tests should still pass
```

### 3. Test Isolation Verification
```bash
# Tests should be deterministic - multiple runs should have same result
for i in {1..3}; do 
  echo "Run $i:"
  poetry run pytest tests/ -m atomic --tb=no -q
done
```

## Implementation Plan

### Phase 1: Audit and Prioritize (1-2 days)
1. Run mass identification commands above
2. Categorize issues by severity:
   - **Critical**: Tests that fail CI without env vars
   - **High**: Tests making real API calls  
   - **Medium**: Tests with global settings access
3. Create fix priority list

### Phase 2: Fix Critical Issues (2-3 days)
1. Fix `test_core_config.py` first (blocks CI)
2. Fix tests with module-level API calls
3. Fix conditional skips

### Phase 3: Comprehensive Fix (1 week)
1. Refactor remaining atomic tests
2. Create reusable test fixtures for mocked settings
3. Add linting rules to prevent regression

### Phase 4: Prevention (ongoing)
1. Pre-commit hooks to catch violations
2. CI job that runs atomic tests without env vars
3. Developer documentation on test isolation

## Success Criteria
- [ ] All atomic tests pass without any environment variables
- [ ] No real API calls during test imports
- [ ] CI lint-and-unit job requires no secrets/env vars
- [ ] Test execution time improved (no network waits)
- [ ] Pre-commit hooks prevent regression

## Labels
- `critical`: Blocks CI/CD pipeline
- `testing`: Test framework improvements  
- `technical-debt`: Code quality issue
- `good-first-issue`: Clear patterns for contributors

## Related Issues
- #167 - CI/CD Pipeline fixes (blocked by this)

---
**Priority**: P0 - Critical (blocks development workflow)
**Effort**: 1-2 weeks  
**Impact**: Enables proper CI/CD, improves test reliability

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

🚨 CRITICAL: Test Isolation Violations - Unit Tests Depend on Real Environment Variables #172

🚨 CRITICAL: Test Isolation Violations - Unit Tests Depend on Real Environment Variables

Problem Summary

Root Cause Analysis

Current Issues

Files Affected (Confirmed)

How to Mass Identify the Problem

Search Commands

Expected Behavior

How to Fix

Pattern 1: Mock settings at test level

Pattern 2: Dependency injection

Pattern 3: Move API calls into test functions

How to Validate the Fix

1. Local Testing

2. CI Testing

3. Test Isolation Verification

Implementation Plan

Phase 1: Audit and Prioritize (1-2 days)

Phase 2: Fix Critical Issues (2-3 days)

Phase 3: Comprehensive Fix (1 week)

Phase 4: Prevention (ongoing)

Success Criteria

Labels

Related Issues

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

🚨 CRITICAL: Test Isolation Violations - Unit Tests Depend on Real Environment Variables #172

Description

🚨 CRITICAL: Test Isolation Violations - Unit Tests Depend on Real Environment Variables

Problem Summary

Root Cause Analysis

Current Issues

Files Affected (Confirmed)

How to Mass Identify the Problem

Search Commands

Expected Behavior

How to Fix

Pattern 1: Mock settings at test level

Pattern 2: Dependency injection

Pattern 3: Move API calls into test functions

How to Validate the Fix

1. Local Testing

2. CI Testing

3. Test Isolation Verification

Implementation Plan

Phase 1: Audit and Prioritize (1-2 days)

Phase 2: Fix Critical Issues (2-3 days)

Phase 3: Comprehensive Fix (1 week)

Phase 4: Prevention (ongoing)

Success Criteria

Labels

Related Issues

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions