-
Notifications
You must be signed in to change notification settings - Fork 4
Fix Backend Docker Build Caching Issue #685
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…failures Disables the cache for the backend Docker build in the `03-build-secure.yml` GitHub Actions workflow. This change is necessary to resolve an issue where the security scans are failing due to a `failed to calculate checksum` error when copying the `backend/vectordbs` directory. This error is likely caused by a corrupted cache layer in the Docker build process. Disabling the cache forces a clean build, which should prevent this error from occurring. Pre-commit checks failed due to a large number of pre-existing mypy errors that are unrelated to this change.
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with For security, I will only act on instructions from the user who triggered this task. New to Jules? Learn more at jules.google/docs. |
🚀 Development Environment OptionsThis repository supports Dev Containers for a consistent development environment. Option 1: GitHub Codespaces (Recommended)Create a cloud-based development environment:
Option 2: VS Code Dev Containers (Local)Use Dev Containers on your local machine:
Option 3: Traditional Local SetupSet up the development environment manually: # Clone the repository
git clone https://github.com/manavgup/rag_modulo.git
cd rag_modulo
git checkout fix/backend-docker-build-caching
# Initialize development environment
make dev-init
make dev-build
make dev-up
make dev-validateAvailable CommandsOnce in your development environment: make help # Show all available commands
make dev-validate # Validate environment setup
make test-atomic # Run atomic tests
make test-unit # Run unit tests
make lint # Run lintingServices AvailableWhen running
This automated message helps reviewers quickly set up the development environment. |
Code Review: Fix Backend Docker Build Caching IssueSummaryThis PR disables Docker build caching for the backend service to work around a ✅ What Works Well
|
Implements a targeted cache invalidation strategy to resolve the failing security scans. This change addresses the `failed to calculate checksum` error by adding a `CACHE_BUST` build argument to the Docker build process in the `03-build-secure.yml` GitHub Actions workflow. This argument is then used in the `backend/Dockerfile.backend` to invalidate the cache before the `COPY` commands, forcing a fresh copy of the application files. This approach is more efficient than disabling the cache entirely and addresses the feedback from the code review. Pre-commit checks failed due to a large number of pre-existing mypy errors that are unrelated to this change.
Code Review: Fix Backend Docker Build Caching IssueSummaryThis PR adds a CACHE_BUST argument to the backend Docker build workflow to invalidate caching issues. While this addresses the immediate symptoms, there are several concerns about this approach. Critical Issues1. Duplicate CACHE_BUST Declaration (Bug)The CACHE_BUST argument is declared THREE times in Dockerfile.backend:
Problem: The third declaration (line 91) overrides line 79 and will be null if not passed as a build-arg. This creates inconsistency. Recommendation: Remove lines 90-92 entirely. Lines 77-79 already handle the runtime stage CACHE_BUST. 2. Root Cause Not AddressedThe PR description states this is a temporary workaround but doesn't explain:
Recommendation: Before merging, document the specific error/symptom that triggered this fix and create a follow-up issue to investigate the root cause. Design Concerns3. Cache Invalidation Strategy MismatchThere are now two different cache invalidation strategies:
Problem: Using github.run_number defeats the purpose of Docker layer caching entirely. Every build will rebuild all layers after line 91, even if nothing changed. Impact on CI Performance:
Recommendation: Use content-based hashing instead of run number to preserve caching benefits while ensuring fresh builds when content changes. 4. Inconsistent with Existing PatternsLooking at other files, CACHE_BUST uses static dates for intentional invalidation. This PR introduces a completely different pattern (auto-incrementing on every build) without explanation. RecommendationsImmediate (Before Merge):
Follow-up (Create Issues):
VerdictRecommend changes before merge While this may fix the immediate symptom, the implementation has bugs (duplicate ARG) and design issues (cache invalidation every build) that could hurt CI performance. The lack of root cause analysis makes it difficult to assess if this is the right fix. Suggested path forward:
Review based on CLAUDE.md guidelines - Focus on code quality, performance, and maintainability |
…ilures Implements a content-based cache invalidation strategy to resolve the failing security scans. This change addresses the `failed to calculate checksum` error by using a hash of the `backend` directory contents as a `CACHE_BUST` build argument in the `03-build-secure.yml` GitHub Actions workflow. This ensures that the cache is invalidated only when the contents of the backend directory change. This approach is more efficient than disabling the cache entirely or using the run number for cache busting, as it preserves the benefits of layer caching when no files have changed. This change also removes the duplicate `ARG CACHE_BUST` declaration from the `backend/Dockerfile.backend` file. Pre-commit checks failed due to a large number of pre-existing mypy errors that are unrelated to this change.
…r builds
This PR fixes the Docker build cache issue that was causing security scans to fail
with 'failed to calculate checksum' errors when copying backend/vectordbs directory.
Root Cause:
-----------
The Docker build was failing due to corrupted cache layers when copying backend
files (particularly backend/vectordbs). The error occurred during the COPY step
when Docker couldn't calculate the checksum, likely due to cache inconsistency
between builds.
Solution:
---------
Implemented content-based cache invalidation using hashFiles('backend/**') instead
of static dates or run numbers. This ensures:
1. Cache is invalidated ONLY when backend source files actually change
2. Cache is preserved when backend files are unchanged (maintains build speed)
3. Consistent cache behavior across CI runs
Changes:
--------
- Separated cache invalidation concerns:
* Builder stage: CACHE_BUST=20251119 (static, for PyTorch CPU-only migration)
* Runtime stage: BACKEND_CACHE_BUST (dynamic, content-based hash)
- Renamed runtime ARG from CACHE_BUST to BACKEND_CACHE_BUST to avoid confusion
with builder stage and make purpose explicit
- Added content-based hash via workflow: BACKEND_CACHE_BUST=${{ hashFiles('backend/**') }}
- Added RUN command using BACKEND_CACHE_BUST before COPY commands to ensure
cache invalidation happens at the right layer boundary
Benefits:
---------
- Fixes 'failed to calculate checksum' errors in security scans
- Preserves Docker layer caching when backend files are unchanged
- Maintains CI/CD optimization (Issue #349) - builds remain fast
- Clear separation of concerns between builder and runtime cache invalidation
Follow-up:
----------
The corrupted cache issue may indicate a deeper Docker BuildKit cache problem
that should be investigated if it recurs. Consider monitoring cache hit rates
and investigating BuildKit cache corruption scenarios.
Fixes: Security scan failures in 03-build-secure.yml workflow
Related: Issue #349 (CI/CD pipeline speedup)
Code Review - PR 685: Fix Backend Docker Build Caching IssueSummaryThis PR implements content-based cache invalidation to resolve Docker build failures (failed to calculate checksum errors) in security scans using hashFiles to invalidate cache only when backend source files change. StrengthsRoot Cause Analysis:
Solution Design:
CRITICAL: No Default Value for BACKEND_CACHE_BUSTProblem: ARG BACKEND_CACHE_BUST has no default value in Dockerfile (line 82) Risk:
Recommendation: Add default value in Dockerfile: ARG BACKEND_CACHE_BUST=local-buildThis ensures:
MEDIUM: hashFiles Pattern Too BroadCurrent: Issue: Includes pycache, .pyc, .log files and may invalidate cache unnecessarily Recommendation: Use more targeted pattern: hashFiles('backend/**/*.py', 'backend/Dockerfile.backend', 'pyproject.toml', 'poetry.lock')MEDIUM: Insufficient Testing DocumentationMissing:
Recommendation: Test before merge:
MINOR: Documentation Updates NeededFiles to update:
Performance Impact
CI/CD Impact:
Security ConsiderationsPositive:
Note: Ensure no secrets in backend/ directory (should be in .env which is gitignored) Checklist Before Merge
Verdict: APPROVE WITH CHANGESOverall: Correct approach, well-documented, aligns with CI/CD goals Must Fix: Add default value for BACKEND_CACHE_BUST before merge Should Fix: Refine hashFiles pattern and add local build tests Confidence: High - Solution is sound, needs local build compatibility fix Once the default ARG value is added, this is ready to merge! Generated by Claude Code Review |
CRITICAL: Add default value for BACKEND_CACHE_BUST
- Added default value 'local-build' to support local Docker builds
- Local builds (make build-backend, docker build) now work without explicit ARG
- CI builds override with content hash automatically
MEDIUM: Refine hashFiles pattern for better cache efficiency
- Changed from 'backend/**' to targeted patterns excluding cache/log files
- New pattern: hashFiles('backend/**/*.py', 'backend/Dockerfile.backend', 'pyproject.toml', 'poetry.lock')
- Prevents unnecessary cache invalidation from .pyc, __pycache__, .log files
- More precise cache invalidation only when source code changes
MINOR: Update documentation
- Updated docs/troubleshooting/docker.md with BACKEND_CACHE_BUST guidance
- Added Docker build cache invalidation section to CLAUDE.md
- Added comment to Makefile documenting default value usage
- Documented local vs CI build strategies
Benefits:
- Local builds work seamlessly without build-arg
- CI builds have optimized cache invalidation (faster builds)
- Clear documentation for developers and CI/CD processes
- Better cache hit rates by excluding irrelevant files
Fixes code review feedback on PR #685
Code Review: Fix Backend Docker Build Caching IssueThank you for this PR! This is a well-thought-out solution to the Docker cache invalidation problem. Here's my detailed review: ✅ Strengths1. Smart Cache Invalidation StrategyThe content-based cache invalidation using
2. Clear Two-Tier ApproachThe distinction between builder and runtime stages is well-documented:
3. Local vs CI Differentiation
4. Comprehensive Documentation
🔍 Issues & Concerns1. Critical: Workflow Syntax Error
|
This pull request fixes the failing security scans by disabling the cache for the backend Docker build. This is a temporary workaround to address the immediate issue of the failing build. The root cause of the caching issue should be investigated further.
PR created automatically by Jules for task 5180592018382084440 started by @manavgup