-
Notifications
You must be signed in to change notification settings - Fork 4
chore: optimize Docker image and enforce system browser usage #20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Optimization points covered: - Docker Image Aggressive Size Reduction: - Combined RUN layers and implemented manual cleanup of apt cache/lists. - Purged build tools (wget, gnupg, curl) after installation. - Skipped rfbrowser/Playwright internal browser downloads (~1GB saving) by using system-installed Chrome. - Performance Enhancements: - Switched to 'uv' package manager for 10-100x faster dependency installation. - Implemented BuildKit caching for system packages and Python dependencies to accelerate rebuilds. - Used lightweight 'sh' entrypoint for virtual display management. - Code Generation Refinements: - Updated LibraryContext, Agents, and Tasks to enforce 'channel=chrome' in Browser Library initialization. - Standardized 'New Context viewport=None' in templates to ensure reliable element detection. - Updated all code generation examples to follow the optimized browser configuration.
WalkthroughThis pull request adds Chrome browser channel support to the system. The Dockerfile is updated to install system Chrome, configure environment variables (CHROME_BIN, PLAYWRIGHT_BROWSERS_PATH), and optimize the Docker image build. Backend code is modified to accept and propagate a new Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
Dockerfile (2)
71-79: Consider verifying Xvfb startup before proceeding.The
sleep 1approach works but is timing-dependent. For more robustness, consider waiting for the display to be ready. However, this is a minor improvement and the current approach is acceptable for most use cases.🔎 More robust Xvfb check (optional)
RUN printf '#!/bin/sh\n\ if [ "$HEALING_SESSION" = "true" ]; then\n\ export DISPLAY=:99\n\ Xvfb :99 -screen 0 1920x1080x24 -ac +extension GLX +render -noreset &\n\ - sleep 1\n\ + timeout 5 sh -c "while ! xdpyinfo -display :99 >/dev/null 2>&1; do sleep 0.1; done" || sleep 1\n\ fi\n\ exec "$@"\n' > /usr/local/bin/start-with-display.sh && \ chmod +x /usr/local/bin/start-with-display.shNote: This requires
x11-utilspackage forxdpyinfo. The currentsleep 1approach is simpler and works for most scenarios.
81-86: Consider adding a non-root user for improved security posture.The container runs as root, which is flagged by static analysis (CKV_DOCKER_3). While this may be acceptable for a test runner, running browser automation as a non-root user reduces risk if the container interacts with untrusted web content.
🔎 Add non-root user (optional)
+# Create non-root user for running tests +RUN useradd -m -s /bin/sh testrunner && \ + mkdir -p /home/testrunner/.cache && \ + chown -R testrunner:testrunner /app /home/testrunner/.cache + +USER testrunner + # Set Chrome options for healing sessions ENV CHROME_OPTIONS="--headless=new --no-sandbox --disable-dev-shm-usage --disable-gpu --window-size=1920,1080"Note: This would require adjusting
PLAYWRIGHT_BROWSERS_PATHand ensuring Chrome can run without--no-sandbox(or keeping it with the user change).src/backend/crew_ai/tasks.py (1)
700-709: Minor inconsistency:${headless}variable missing from final example.The final example output shows
${browser}and${channel}but omits${headless}, while earlier examples (lines 582-584) include all three. Consider adding${headless}for consistency.🔎 Suggested fix for consistency
"Example of correct output:\n" "*** Settings ***\n" "Library Browser\n\n" "*** Variables ***\n" "${browser} chromium\n" -"${channel} chrome\n\n" +"${channel} chrome\n" +"${headless} True\n\n" "*** Test Cases ***\n" "Generated Test\n" -" New Browser ${browser} channel=${channel}\n" +" New Browser ${browser} channel=${channel} headless=${headless}\n" " Close Browser"
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
Dockerfile(5 hunks)src/backend/crew_ai/agents.py(1 hunks)src/backend/crew_ai/library_context/browser_context.py(6 hunks)src/backend/crew_ai/tasks.py(4 hunks)
🧰 Additional context used
🪛 Checkov (3.2.334)
Dockerfile
[low] 1-87: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
[low] 1-87: Ensure that a user for the container has been created
(CKV_DOCKER_3)
🪛 Hadolint (2.14.0)
Dockerfile
[warning] 15-15: Pin versions in pip. Instead of pip install <package> use pip install <package>==<version> or pip install --requirement <requirements file>
(DL3013)
🔇 Additional comments (8)
Dockerfile (2)
17-58: LGTM - Combined layer with BuildKit caching is well-structured.The combined apt-get layer with Chrome installation and cleanup is well-optimized. The BuildKit cache mounts ensure fast rebuilds while the cleanup commands (
rm -rf /var/lib/apt/lists/*) ensure no stale data persists in the mounted directories during the build. Purgingwget,gnupg, andcurlafter installation is a good practice for reducing image size.Note: With
--mount=type=cache, the cache directories are not written to the final image layer, so therm -rfcommands primarily clear any residual state from the current build step.
67-69:rfbrowser init --skip-browsersis a supported and correctly used flag.The
--skip-browsersflag is supported byrfbrowser initto skip Playwright browser downloads, making the user responsible for browser binary installation. This approach is appropriate for the Dockerfile since the command installs all NodeJS dependencies except Playwright browser binaries, allowing the use of system-installed Google Chrome instead. The implementation aligns with the documented optimization pattern for containerized environments.src/backend/crew_ai/agents.py (1)
227-232: LGTM - Channel parameter added to code generation example.The example in the Code Assembler agent's backstory now correctly demonstrates:
- Declaring
${channel} chromein the Variables section- Using
channel=${channel}in theNew Browserkeyword callThis is consistent with the browser_context.py changes and will guide the LLM to generate correct Robot Framework code.
src/backend/crew_ai/tasks.py (2)
83-88: LGTM - Browser initialization instructions updated for channel support.The example JSON for Browser Library initialization now correctly includes the
channelparameter alongsidebrowserandheadless. This guides the planner agent to include channel configuration in browser initialization steps.
559-585: LGTM - Variable extraction examples are clear and comprehensive.The updated documentation clearly shows:
- Variable naming convention including
${channel}- How to extract values from step JSON (browser, channel, headless)
- Corresponding variable declarations in Robot Framework format
This provides clear guidance for the code assembler agent.
src/backend/crew_ai/library_context/browser_context.py (3)
35-41: LGTM - Browser initialization parameters updated with channel.The
browser_init_paramsproperty now correctly returns thechannelparameter set to'chrome', which will be used throughout the code generation pipeline to leverage the system-installed Chrome browser.
140-158: LGTM - Code structure template comprehensively updated.The mandatory structure example now correctly shows:
${channel} chromevariable declarationNew Browser ${browser} channel=${channel} headless=${headless}call- Proper sequence: New Browser → New Context → New Page
This provides clear guidance for the code assembler agent.
206-214: LGTM - Critical rules clearly document channel requirement.The critical rules section effectively communicates:
- The mandatory Browser → Context → Page sequence
- The viewport=None requirement
- The channel=chrome requirement for Docker optimization (line 210)
This provides clear, actionable guidance for code generation.
| # Install UV package manager (10-100x faster than pip) | ||
| RUN pip install --no-cache-dir uv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pin the uv package version for reproducible builds.
The uv package is installed without version pinning, which can lead to inconsistent builds if a breaking change is released. Consider pinning to a specific version. Current version 0.9.18 is recommended or a more recent stable release.
-RUN pip install --no-cache-dir uv
+RUN pip install --no-cache-dir uv==0.9.18📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Install UV package manager (10-100x faster than pip) | |
| RUN pip install --no-cache-dir uv | |
| # Install UV package manager (10-100x faster than pip) | |
| RUN pip install --no-cache-dir uv==0.9.18 |
🧰 Tools
🪛 Hadolint (2.14.0)
[warning] 15-15: Pin versions in pip. Instead of pip install <package> use pip install <package>==<version> or pip install --requirement <requirements file>
(DL3013)
🤖 Prompt for AI Agents
In Dockerfile around lines 14 to 15, the pip installation of the uv package is
unpinned which can cause non-reproducible builds; update the RUN pip install
--no-cache-dir uv command to pin the package to a specific, stable version
(recommended uv==0.9.18) so builds are reproducible and deterministic.
| 3. **PARAMETER RULES:** | ||
| - Browser Library uses: browser=chromium, headless=True | ||
| - Browser Library uses: browser=chromium, channel=chrome, headless=True | ||
| - MUST usage channel=chrome to use system-installed browser | ||
| - NOT SeleniumLibrary syntax (no 'options' parameter) | ||
| - Valid browsers: chromium, firefox, webkit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "MUST usage" should be "MUST use".
Line 79 contains a grammatical error that should be corrected for clarity.
🔎 Suggested fix
3. **PARAMETER RULES:**
- Browser Library uses: browser=chromium, channel=chrome, headless=True
- - MUST usage channel=chrome to use system-installed browser
+ - MUST use channel=chrome to use system-installed browser
- NOT SeleniumLibrary syntax (no 'options' parameter)
- Valid browsers: chromium, firefox, webkit📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| 3. **PARAMETER RULES:** | |
| - Browser Library uses: browser=chromium, headless=True | |
| - Browser Library uses: browser=chromium, channel=chrome, headless=True | |
| - MUST usage channel=chrome to use system-installed browser | |
| - NOT SeleniumLibrary syntax (no 'options' parameter) | |
| - Valid browsers: chromium, firefox, webkit | |
| 3. **PARAMETER RULES:** | |
| - Browser Library uses: browser=chromium, channel=chrome, headless=True | |
| - MUST use channel=chrome to use system-installed browser | |
| - NOT SeleniumLibrary syntax (no 'options' parameter) | |
| - Valid browsers: chromium, firefox, webkit |
🤖 Prompt for AI Agents
In src/backend/crew_ai/library_context/browser_context.py around lines 77 to 81,
the comment line reads "MUST usage channel=chrome to use system-installed
browser" which is a grammatical typo; change it to "MUST use channel=chrome to
use system-installed browser" so the rule reads correctly. Only update that
phrase in the comment to preserve the rest of the parameter rules text.
|
@coderabbitai can we include a docker build workflow that builds docker image for the api server similar to what we are doing in run.sh, this will act as a sanity |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 2682 🏁 Script executed: Length of output: 135 @Devasy23, great idea! I'll create a GitHub Actions workflow that builds the Docker image as a sanity check. Based on the PR changes which optimize the Docker build with BuildKit caching, here's a comprehensive workflow: name: Docker Build Sanity Check
on:
push:
branches:
- main
pull_request:
branches:
- main
paths:
- 'Dockerfile'
- 'src/**'
- '.github/workflows/docker-build.yml'
env:
IMAGE_NAME: natural-language-to-robot-framework
jobs:
docker-build:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
with:
driver-opts: |
image=moby/buildkit:latest
- name: Build Docker image
uses: docker/build-push-action@v5
with:
context: .
file: ./Dockerfile
push: false
tags: ${{ env.IMAGE_NAME }}:test
cache-from: type=gha
cache-to: type=gha,mode=max
load: true
- name: Test Docker image starts
run: |
# Create a minimal .env file for testing
mkdir -p src/backend
echo "OPENAI_API_KEY=test_key_for_sanity_check" > src/backend/.env
echo "APP_PORT=5000" >> src/backend/.env
# Start container in background
docker run -d \
--name test-container \
-p 5000:5000 \
-v $(pwd)/src/backend/.env:/app/src/backend/.env \
${{ env.IMAGE_NAME }}:test
# Wait for container to start (max 30 seconds)
echo "Waiting for container to start..."
for i in {1..30}; do
if docker ps | grep -q test-container; then
echo "Container started successfully"
docker logs test-container
exit 0
fi
sleep 1
done
# If we get here, container failed to start
echo "Container failed to start"
docker logs test-container
exit 1
- name: Cleanup
if: always()
run: |
docker stop test-container || true
docker rm test-container || true
- name: Image size report
run: |
echo "### Docker Image Size" >> $GITHUB_STEP_SUMMARY
docker images ${{ env.IMAGE_NAME }}:test --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}" >> $GITHUB_STEP_SUMMARYThis workflow:
To add this workflow:
mkdir -p .github/workflows
Would you like me to create this workflow file for you? |



Optimization points covered:
Summary by CodeRabbit
New Features
Chores
✏️ Tip: You can customize this high-level summary in your review settings.