Skip to content

Commit 53475fc

Browse files
manavgupclaude
andauthored
fix: Comprehensive test fixes after PYTHONPATH removal from Makefile (#506)
* refactor: Move Poetry configuration to project root for cleaner monorepo structure ## Summary Move pyproject.toml and poetry.lock from backend/ to project root to centralize Python dependency management and fix virtual environment accessibility issues. ## Changes ### Poetry Configuration (Moved) - backend/pyproject.toml → pyproject.toml - backend/poetry.lock → poetry.lock ### Makefile (100+ lines across 20+ targets) - Changed VENV_DIR from backend/.venv to .venv - Updated all Poetry commands to run from project root with PYTHONPATH=backend - Added venv dependency to local-dev-backend and local-dev-all targets - Updated build targets to use project root as Docker build context - Updated all test targets (atomic, unit, integration, e2e) - Updated code quality targets (lint, format, security-check, coverage) - Fixed clean target to reference root-level paths ### CI/CD Workflows (5 files) - poetry-lock-check.yml: Updated paths and removed cd backend commands - 01-lint.yml: Removed working-directory, updated all tool paths - 04-pytest.yml: Updated cache keys and test commands - 05-ci.yml: Updated dependency installation commands - makefile-testing.yml: Updated test execution paths ### Docker Configuration - backend/Dockerfile.backend: Updated COPY commands for new build context - docker-compose.dev.yml: Changed context from ./backend to . + fixed indentation ## Benefits - Single source of truth for Python dependencies at project root - Simplified virtual environment management (.venv/ at root) - Consistent build context across all tools (Makefile, docker-compose, CI/CD) - Better monorepo structure for future frontend/backend separation - Fixes dependency accessibility issues (e.g., docling import errors) ## Breaking Changes Developers need to: 1. Remove old venv: rm -rf backend/.venv 2. Create new venv: make venv 3. Update IDE Python interpreter from backend/.venv to .venv 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(docker): add cache-bust ARG to invalidate stale Docker layers When poetry files were moved from backend/ to project root, Docker cached layers still referenced the old file structure. Adding an ARG before the COPY command forces Docker to invalidate the cache at this layer. Fixes CI build failure in PR #501. * fix(docker): Update Dockerfiles and workflows for Poetry root migration Fixes #502 - Update all Docker and CI/CD references after moving Poetry config from backend/ to project root (Issue #501). Changes: 1. **Dockerfiles** (backend/Dockerfile.backend, Dockerfile.codeengine): - Add POETRY_ROOT_MIGRATION cache-bust ARG to both stages - Update COPY commands to reference pyproject.toml and poetry.lock from project root - Move poetry.lock copy alongside pyproject.toml for consistency - Add explanatory comments about Issue #501 migration 2. **GitHub Actions Workflows**: - Update 05-ci.yml: Fix poetry cache key to use 'poetry.lock' instead of 'backend/poetry.lock' - Update 03-build-secure.yml: Change backend context from 'backend' to '.' for correct file resolution 3. **PyTorch Version Update**: - Upgrade torch from 2.5.0+cpu to 2.6.0+cpu - Upgrade torchvision from 0.20.0+cpu to 0.21.0+cpu - Reason: 2.5.0+cpu not available for ARM64 architecture - New versions are compatible with both ARM64 and x86_64 4. **Secret Management**: - Add pragma: allowlist secret comments to test secrets in 05-ci.yml - Prevents false positives in detect-secrets pre-commit hook Impact: - Fixes failing CI/CD test: TestMakefileTargetsDirect.test_make_build_backend_minimal - Docker builds now correctly find pyproject.toml and poetry.lock at project root - Maintains compatibility with both local development (ARM64) and CI (x86_64) - GitHub Actions workflows correctly cache Poetry dependencies Testing: - Docker build context validated - All references to backend/pyproject.toml and backend/poetry.lock removed - Cache keys updated to match new file locations 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Comprehensive test fixes after PYTHONPATH removal from Makefile This commit addresses all test failures that occurred after removing PYTHONPATH from the Makefile, ensuring clean separation between backend and root directories. Test Results: 1,508 unit + 177 atomic + 126 integration = 1,811 passing tests Changes: - Import path fixes (relative imports after PYTHONPATH removal) - Test logic fixes (3 files updated to match service behavior) - Pydantic V2 migration (removed deprecated json_encoders) - Atomic test configuration (pytest-atomic.ini moved to root) - Integration test fixes (removed backend. prefix from patches) - Configuration updates (pre-commit, markdownlint, secrets baseline) Related: Poetry root migration (branch: refactor/poetry-to-root-clean) * fix(ci): Exclude playwright tests from pytest collection Playwright tests require the playwright package which is not in project dependencies. Added norecursedirs to exclude tests/playwright from test collection. This fixes CI failure: ModuleNotFoundError: No module named 'playwright' * fix(tests): Fix 5 failing unit tests in CI Fixed test issues after PYTHONPATH removal: 1. test_audio_storage.py - Removed Path mocking, test actual behavior - Test now verifies default path creation instead of mock interaction 2-5. test_conversation_message_repository.py - Fixed schema mocking - Patched from_db_message at schema module level, not repository - Added proper refresh mock to set required fields (id, created_at) - All 4 failing tests now pass Tests passing locally: - test_initialization_with_default_path - test_create_message_success - test_create_message_integrity_error - test_get_by_id_success - test_get_by_id_not_found 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(tests): Fix Docker build test after Poetry root migration Updated test to copy pyproject.toml and poetry.lock from project root instead of backend/ directory. Changes: - Added pyproject.toml and poetry.lock to root files_to_copy list - Removed these files from backend_files list - Added comment explaining Poetry root migration (Issue #501) This fixes the Docker build failure: ERROR: failed to compute cache key: "/pyproject.toml": not found Root cause: Makefile test was copying Poetry files from backend/ but they've been moved to project root in the Poetry migration. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix: Comprehensive CI/CD fixes for PR #506 - Remove PYTHONPATH from GitHub Actions workflows (pyproject.toml handles it) - Fix pytest collection to use tests/unit/ instead of marker filtering (1508 tests) - Fix Dockerfile torchvision installation (use PyPI version, not +cpu) - Install PyTorch CPU wheels BEFORE Poetry to prevent CUDA builds (saves 6GB disk space) - Normalize import paths in vectordb stores and service layer - Remove obsolete test_docling_processor.py (644 lines deleted) - Update tests to use correct package paths Fixes: - GitHub Actions pytest workflow now runs all 1508 unit tests - Docker build no longer runs out of disk space - Makefile direct tests pass - All local tests pass (verified with poetry run pytest) * fix(docker): Update Dockerfile comments for clarity Updated comments in Dockerfile to better explain the PyTorch CPU-only installation process. No functional changes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(docker): Use Docling's approach for CPU-only PyTorch installation Simplified PyTorch CPU-only installation by using PIP_EXTRA_INDEX_URL environment variable, matching Docling's official Docker approach. Changes: - Removed complex multi-step PyTorch installation - Use PIP_EXTRA_INDEX_URL=https://download.pytorch.org/whl/cpu - Single Poetry install command installs all deps with CPU-only PyTorch - Saves ~6GB vs CUDA version This approach is officially recommended by Docling project: https://github.com/docling-project/docling/blob/main/Dockerfile Root cause: poetry.lock has PyTorch 2.8.0 (CUDA). Setting extra-index-url during poetry install ensures CPU-only wheels are used instead. Fixes: https://github.com/manavgup/rag_modulo/actions/runs/18861121839 Issue: #506 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(docker): Bust Docker cache to force CPU-only PyTorch rebuild Updated CACHE_BUST ARG from 20251027 to 20251028 to invalidate Docker layer cache and force rebuild with PIP_EXTRA_INDEX_URL for CPU-only PyTorch. Issue: Even though Dockerfile was fixed to use CPU-only PyTorch, Docker was using cached layers from previous builds that had CUDA PyTorch. Solution: Change CACHE_BUST ARG value to force all layers after it to rebuild, ensuring the poetry install step uses the CPU-only index. Related: #506 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(docker): Actually use CACHE_BUST ARG to invalidate cache Added RUN command that references CACHE_BUST ARG to force Docker to invalidate cache and rebuild subsequent layers. Issue: ARG was declared but never used, so Docker continued using cached layers with CUDA PyTorch. Fix: Added 'RUN echo "Cache bust: $CACHE_BUST"' which forces Docker to execute this layer whenever CACHE_BUST value changes, invalidating all subsequent cached layers including poetry install. Related: #506 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(docker): Use pip with CPU-only PyTorch index to bypass poetry.lock **Problem**: PR #506 CI failing with "no space left on device" due to NVIDIA CUDA libraries (~6-8GB) being installed from poetry.lock. **Root Cause**: poetry.lock has torch==2.8.0 (CUDA version) with NVIDIA dependencies as transitive deps for Linux systems. Even with PIP_EXTRA_INDEX_URL set, `poetry install` installs exactly what's in poetry.lock, ignoring the extra index. **Solution**: Use pip install directly to bypass poetry.lock and install dependencies from pyproject.toml with CPU-only PyTorch index. This matches Docling's official Docker approach. **Changes**: - Copy backend/ directory before pip install (needed for -e .) - Use pip install -e . with --extra-index-url for CPU-only PyTorch - Bypasses poetry.lock entirely, resolving deps from pyproject.toml - Reduces image size by ~6-8GB (NVIDIA libs not installed) **Testing**: Will validate in CI that NVIDIA libraries are not installed. Related: #506, #507 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> * fix(docker): Install dependencies from pyproject.toml bypassing poetry.lock **Problem**: Previous fix failed because pyproject.toml has package-mode=false, which prevents editable install with `pip install -e .`. **Solution**: Extract dependencies from pyproject.toml and install them directly with pip using CPU-only PyTorch index. This approach: - Bypasses poetry.lock completely - Installs CPU-only PyTorch from https://download.pytorch.org/whl/cpu - Works with package-mode=false - Matches Docling's Docker approach **Changes**: - Extract dependencies using tomllib from pyproject.toml - Install each dependency with pip --extra-index-url - Removed editable install (-e .) which doesn't work with package-mode=false Related: #506 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> Signed-off-by: manavgup <manavg@gmail.com> * fix(docker): Normalize dependency strings to handle spaces and parentheses **Problem**: Dependencies like "psutil (>=7.0.0,<8.0.0)" were being split by xargs into separate arguments, causing pip to fail with "Invalid requirement". **Root Cause**: pyproject.toml uses format with spaces before parentheses (e.g., "psutil (>=7.0.0,<8.0.0)"). When piped through xargs, the space causes splitting into "psutil" and "(>=7.0.0,<8.0.0)", which pip treats as invalid. **Solution**: Normalize dependency strings by removing spaces and parentheses: - "psutil (>=7.0.0,<8.0.0)" -> "psutil>=7.0.0,<8.0.0" - "docling (>=2.0.0)" -> "docling>=2.0.0" **Benefits**: - Maintains all version constraints correctly - Works with xargs without quoting issues - Still bypasses poetry.lock for CPU-only PyTorch Related: #506 Signed-off-by: Manav Gupta <manavgup@ca.ibm.com> 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com> --------- Signed-off-by: manavgup <manavg@gmail.com> Co-authored-by: Claude <noreply@anthropic.com>
1 parent 98572db commit 53475fc

File tree

104 files changed

+1738
-1003
lines changed

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

104 files changed

+1738
-1003
lines changed

.github/workflows/01-lint.yml

Lines changed: 7 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
name: Lint & Static Analysis
22

3-
# This workflow uses tool versions and configurations from backend/pyproject.toml
3+
# This workflow uses tool versions and configurations from pyproject.toml (project root)
44
# All linting tools (Ruff, MyPy, Pylint, Pydocstyle) reference pyproject.toml as single source of truth
55
# Tool versions: Ruff ^0.14.0, MyPy ^1.15.0, Pylint ^3.3.8, Pydocstyle ^6.3.0
66

@@ -119,48 +119,44 @@ jobs:
119119
blocking: true
120120
cmd: |
121121
pip install toml
122-
python -c "import toml; toml.load(open('backend/pyproject.toml'))"
122+
python -c "import toml; toml.load(open('pyproject.toml'))"
123123
124124
# Python backend linting (blocking) - Check ALL backend Python files
125125
- id: ruff
126126
name: "Ruff (Lint + Format)"
127-
working-directory: backend
128127
blocking: true
129128
cmd: |
130129
pip install poetry
131130
poetry install --only dev
132131
echo "Running Ruff linting..."
133-
poetry run ruff check . --config pyproject.toml
132+
poetry run ruff check backend --config pyproject.toml
134133
echo "Running Ruff formatting check..."
135-
poetry run ruff format --check . --config pyproject.toml
134+
poetry run ruff format --check backend --config pyproject.toml
136135
137136
# Python type/quality checking (non-blocking / informational)
138137
- id: mypy
139138
name: "MyPy Type Check (Informational)"
140-
working-directory: backend
141139
blocking: false
142140
cmd: |
143141
pip install poetry
144142
poetry install --only dev
145-
poetry run mypy . --config-file pyproject.toml --ignore-missing-imports --show-error-codes || true
143+
poetry run mypy backend --config-file pyproject.toml --ignore-missing-imports --show-error-codes || true
146144
147145
- id: pylint
148146
name: "Pylint Quality (Informational)"
149-
working-directory: backend
150147
blocking: false
151148
cmd: |
152149
pip install poetry
153150
poetry install --only dev
154-
poetry run pylint rag_solution/ vectordbs/ core/ auth/ --rcfile=pyproject.toml || true
151+
poetry run pylint backend/rag_solution/ backend/vectordbs/ backend/core/ backend/auth/ --rcfile=pyproject.toml || true
155152
156153
- id: pydocstyle
157154
name: "Docstring Style (Informational)"
158-
working-directory: backend
159155
blocking: false
160156
cmd: |
161157
pip install poetry
162158
poetry install --only dev
163-
poetry run pydocstyle rag_solution/ vectordbs/ core/ auth/ --config=pyproject.toml --count || true
159+
poetry run pydocstyle backend/rag_solution/ backend/vectordbs/ backend/core/ backend/auth/ --config=pyproject.toml --count || true
164160
165161
name: ${{ matrix.name }}
166162

.github/workflows/03-build-secure.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -34,7 +34,7 @@ jobs:
3434
include:
3535
- service: backend
3636
dockerfile: backend/Dockerfile.backend
37-
context: backend
37+
context: .
3838
image_name: rag-modulo-backend
3939
ghcr_image: ghcr.io/manavgup/rag_modulo/backend
4040
- service: frontend

.github/workflows/04-pytest.yml

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -77,20 +77,20 @@ jobs:
7777
uses: actions/cache@v4
7878
with:
7979
path: ~/.cache/pypoetry
80-
key: ${{ runner.os }}-poetry-${{ hashFiles('backend/poetry.lock') }}
80+
key: ${{ runner.os }}-poetry-${{ hashFiles('poetry.lock') }}
8181
restore-keys: |
8282
${{ runner.os }}-poetry-
8383
8484
# 5️⃣ Install Python dependencies
8585
- name: 📥 Install dependencies
86-
run: cd backend && poetry install --with dev,test
86+
run: poetry install --with dev,test
8787

8888
# 6️⃣ Run unit/atomic tests with coverage
8989
- name: 🧪 Run unit tests with coverage
9090
run: |
91-
# Run from backend directory using poetry, but test from project root
92-
cd backend && poetry run pytest ../tests/ -m "unit or atomic" \
93-
--cov=rag_solution \
91+
# Run from project root using poetry
92+
poetry run pytest tests/unit/ \
93+
--cov=backend/rag_solution \
9494
--cov-report=term-missing \
9595
--cov-report=html \
9696
--tb=short \

.github/workflows/05-ci.yml

Lines changed: 4 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -49,7 +49,7 @@ jobs:
4949
timeout_minutes: 10
5050
max_attempts: 3
5151
retry_wait_seconds: 30
52-
command: cd backend && poetry install --with dev,test
52+
command: poetry install --with dev,test
5353

5454
- name: Run test isolation checker
5555
run: |
@@ -72,10 +72,10 @@ jobs:
7272
env:
7373
# Essential environment variables for current atomic tests
7474
# TODO: Remove these once issue #172 (test isolation) is fixed
75-
JWT_SECRET_KEY: test-secret-key-for-ci
75+
JWT_SECRET_KEY: test-secret-key-for-ci # pragma: allowlist secret
7676
RAG_LLM: openai
7777
WATSONX_INSTANCE_ID: test-instance-id
78-
WATSONX_APIKEY: test-api-key
78+
WATSONX_APIKEY: test-api-key # pragma: allowlist secret
7979
WATSONX_URL: https://test.watsonx.com
8080
# Additional variables needed by tests
8181
VECTOR_DB: milvus
@@ -97,7 +97,7 @@ jobs:
9797
path: |
9898
~/.cache/pypoetry
9999
backend/.venv
100-
key: ${{ runner.os }}-poetry-${{ hashFiles('backend/poetry.lock') }}
100+
key: ${{ runner.os }}-poetry-${{ hashFiles('poetry.lock') }}
101101
restore-keys: |
102102
${{ runner.os }}-poetry-
103103
@@ -108,7 +108,6 @@ jobs:
108108
max_attempts: 3
109109
retry_wait_seconds: 30
110110
command: |
111-
cd backend
112111
pip install poetry
113112
poetry config virtualenvs.in-project true
114113
# Regenerate lock file to ensure sync

.github/workflows/makefile-testing.yml

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,6 @@ jobs:
3232

3333
- name: Install dependencies
3434
run: |
35-
cd backend
3635
poetry install --with test
3736
3837
- name: Set up Docker Buildx
@@ -41,7 +40,7 @@ jobs:
4140
- name: Test Makefile targets directly
4241
run: |
4342
echo "Running direct Makefile tests..."
44-
cd backend && poetry run pytest ../tests/test_makefile_targets_direct.py -v
43+
poetry run pytest tests/test_makefile_targets_direct.py -v
4544
4645
- name: Test make help
4746
run: |

.github/workflows/poetry-lock-check.yml

Lines changed: 8 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -6,13 +6,13 @@ name: Poetry Lock Validation
66
on:
77
pull_request:
88
paths:
9-
- 'backend/pyproject.toml'
10-
- 'backend/poetry.lock'
9+
- 'pyproject.toml'
10+
- 'poetry.lock'
1111
push:
1212
branches: [main, develop]
1313
paths:
14-
- 'backend/pyproject.toml'
15-
- 'backend/poetry.lock'
14+
- 'pyproject.toml'
15+
- 'poetry.lock'
1616

1717
permissions:
1818
contents: read
@@ -37,7 +37,6 @@ jobs:
3737

3838
- name: Validate poetry.lock is in sync
3939
run: |
40-
cd backend
4140
echo "🔍 Checking if poetry.lock is in sync with pyproject.toml..."
4241
4342
if poetry check --lock; then
@@ -50,18 +49,16 @@ jobs:
5049
echo "but poetry.lock was not regenerated."
5150
echo ""
5251
echo "To fix this:"
53-
echo " 1. cd backend"
54-
echo " 2. poetry lock"
55-
echo " 3. git add poetry.lock"
56-
echo " 4. git commit -m 'chore: update poetry.lock'"
57-
echo " 5. git push"
52+
echo " 1. poetry lock"
53+
echo " 2. git add poetry.lock"
54+
echo " 3. git commit -m 'chore: update poetry.lock'"
55+
echo " 4. git push"
5856
echo ""
5957
exit 1
6058
fi
6159
6260
- name: Check for dependency conflicts
6361
run: |
64-
cd backend
6562
echo "🔍 Checking for dependency conflicts..."
6663
poetry check
6764
echo "✅ No dependency conflicts detected"

.markdownlint.json

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,15 @@
1+
{
2+
"default": true,
3+
"MD013": {
4+
"line_length": 120,
5+
"code_blocks": false,
6+
"tables": false
7+
},
8+
"MD024": {
9+
"siblings_only": true
10+
},
11+
"MD025": false,
12+
"MD033": false,
13+
"MD036": false,
14+
"MD040": false
15+
}

.pre-commit-config.yaml

Lines changed: 11 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -52,28 +52,31 @@ repos:
5252
args: [--baseline .secrets.baseline]
5353

5454
# Python hooks - must match CI configuration exactly
55-
# CI runs from backend/ with: poetry run ruff check . --config pyproject.toml
56-
# CI runs from backend/ with: poetry run ruff format --check . --config pyproject.toml
55+
# Poetry moved to root (October 2025) - pyproject.toml now at root level
56+
# CI runs from root with: poetry run ruff check backend/ --config pyproject.toml
5757
- repo: https://github.com/astral-sh/ruff-pre-commit
5858
rev: v0.14.0
5959
hooks:
6060
- id: ruff
6161
name: Ruff linting (matches CI)
6262
files: ^backend/
63-
args: [--fix, --config=backend/pyproject.toml]
63+
args: [--fix, --config=pyproject.toml]
6464
- id: ruff-format
6565
name: Ruff formatting (matches CI)
6666
files: ^backend/
67-
args: [--config=backend/pyproject.toml]
67+
args: [--config=pyproject.toml]
6868

69-
# Poetry lock file validation
69+
# Poetry lock file validation (Poetry moved to root in October 2025)
7070
- repo: local
7171
hooks:
7272
- id: poetry-lock-check
7373
name: Check poetry.lock is in sync with pyproject.toml
74-
entry: bash -c 'cd backend && poetry check --lock || (echo "❌ poetry.lock is out of sync. Run - cd backend && poetry lock" && exit 1)'
74+
entry: bash
75+
args:
76+
- -c
77+
- "poetry check --lock || (echo 'poetry.lock is out of sync. Run: poetry lock' && exit 1)"
7578
language: system
76-
files: ^backend/(pyproject\.toml|poetry\.lock)$
79+
files: ^(pyproject\.toml|poetry\.lock)$
7780
pass_filenames: false
7881

7982
# Python test hooks (run on push for velocity optimization)
@@ -110,7 +113,7 @@ repos:
110113
rev: v0.35.0
111114
hooks:
112115
- id: markdownlint
113-
args: [--fix]
116+
args: [--fix, --config, .markdownlint.json]
114117

115118
# Commit message hooks
116119
- repo: https://github.com/commitizen-tools/commitizen

0 commit comments

Comments
 (0)