chore: optimize Docker image and enforce system browser usage #20

infoxica · 2025-12-18T18:49:19Z

Optimization points covered:

Docker Image Aggressive Size Reduction:
- Combined RUN layers and implemented manual cleanup of apt cache/lists.
- Purged build tools (wget, gnupg, curl) after installation.
- Skipped rfbrowser/Playwright internal browser downloads (~1GB saving) by using system-installed Chrome.
Performance Enhancements:
- Switched to 'uv' package manager for 10-100x faster dependency installation.
- Implemented BuildKit caching for system packages and Python dependencies to accelerate rebuilds.
- Used lightweight 'sh' entrypoint for virtual display management.
Code Generation Refinements:
- Updated LibraryContext, Agents, and Tasks to enforce 'channel=chrome' in Browser Library initialization.
- Standardized 'New Context viewport=None' in templates to ensure reliable element detection.
- Updated all code generation examples to follow the optimized browser configuration.

Summary by CodeRabbit

New Features
- Added support for Chrome browser with a configurable channel parameter for browser initialization.
Chores
- Optimized Docker image with improved caching and reduced build time and image size.
- Integrated system-installed Chrome for enhanced Docker environment efficiency.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Optimization points covered: - Docker Image Aggressive Size Reduction: - Combined RUN layers and implemented manual cleanup of apt cache/lists. - Purged build tools (wget, gnupg, curl) after installation. - Skipped rfbrowser/Playwright internal browser downloads (~1GB saving) by using system-installed Chrome. - Performance Enhancements: - Switched to 'uv' package manager for 10-100x faster dependency installation. - Implemented BuildKit caching for system packages and Python dependencies to accelerate rebuilds. - Used lightweight 'sh' entrypoint for virtual display management. - Code Generation Refinements: - Updated LibraryContext, Agents, and Tasks to enforce 'channel=chrome' in Browser Library initialization. - Standardized 'New Context viewport=None' in templates to ensure reliable element detection. - Updated all code generation examples to follow the optimized browser configuration.

coderabbitai · 2025-12-18T18:49:29Z

Walkthrough

This pull request adds Chrome browser channel support to the system. The Dockerfile is updated to install system Chrome, configure environment variables (CHROME_BIN, PLAYWRIGHT_BROWSERS_PATH), and optimize the Docker image build. Backend code is modified to accept and propagate a new channel parameter throughout Robot Framework code generation, defaulting to chrome for Docker optimization.

Changes

Cohort / File(s)	Summary
Docker Configuration `Dockerfile`	Adds CHROME_BIN and PLAYWRIGHT_BROWSERS_PATH environment variables; installs system dependencies and Google Chrome; refactors Playwright setup to use --skip-browsers flag; cleans up apt caches for smaller image size
Backend Browser Channel Integration `src/backend/crew_ai/agents.py`	Adds ${channel} variable declaration in Robot Framework template; updates New Browser keyword call to include channel parameter
	Browser Context Configuration `src/backend/crew_ai/library_context/browser_context.py`
	Task Generation `src/backend/crew_ai/tasks.py`

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Dockerfile: Verify Chrome installation steps work correctly and image layer caching is optimized
Parameter propagation: Confirm channel parameter is consistently applied across all Robot Framework code generation (agents.py, browser_context.py, tasks.py) with correct default value (chrome)
Template syntax: Ensure Robot Framework syntax for channel parameter in New Browser keyword is valid

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately reflects the main changes in the PR: Docker image optimization and enforcing system browser usage through channel=chrome parameter.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

sonarqubecloud · 2025-12-18T18:49:53Z

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)

Dockerfile (2)
71-79: Consider verifying Xvfb startup before proceeding.

The sleep 1 approach works but is timing-dependent. For more robustness, consider waiting for the display to be ready. However, this is a minor improvement and the current approach is acceptable for most use cases.
🔎 More robust Xvfb check (optional)
 RUN printf '#!/bin/sh\n\
     if [ "$HEALING_SESSION" = "true" ]; then\n\
     export DISPLAY=:99\n\
     Xvfb :99 -screen 0 1920x1080x24 -ac +extension GLX +render -noreset &\n\
-    sleep 1\n\
+    timeout 5 sh -c "while ! xdpyinfo -display :99 >/dev/null 2>&1; do sleep 0.1; done" || sleep 1\n\
     fi\n\
     exec "$@"\n' > /usr/local/bin/start-with-display.sh && \
     chmod +x /usr/local/bin/start-with-display.sh
Note: This requires x11-utils package for xdpyinfo. The current sleep 1 approach is simpler and works for most scenarios.

81-86: Consider adding a non-root user for improved security posture.

The container runs as root, which is flagged by static analysis (CKV_DOCKER_3). While this may be acceptable for a test runner, running browser automation as a non-root user reduces risk if the container interacts with untrusted web content.
🔎 Add non-root user (optional)
+# Create non-root user for running tests
+RUN useradd -m -s /bin/sh testrunner && \
+    mkdir -p /home/testrunner/.cache && \
+    chown -R testrunner:testrunner /app /home/testrunner/.cache
+
+USER testrunner
+
 # Set Chrome options for healing sessions
 ENV CHROME_OPTIONS="--headless=new --no-sandbox --disable-dev-shm-usage --disable-gpu --window-size=1920,1080"
Note: This would require adjusting PLAYWRIGHT_BROWSERS_PATH and ensuring Chrome can run without --no-sandbox (or keeping it with the user change).
src/backend/crew_ai/tasks.py (1)
700-709: Minor inconsistency: ${headless} variable missing from final example.

The final example output shows ${browser} and ${channel} but omits ${headless}, while earlier examples (lines 582-584) include all three. Consider adding ${headless} for consistency.
🔎 Suggested fix for consistency
 "Example of correct output:\n"
 "*** Settings ***\n"
 "Library    Browser\n\n"
 "*** Variables ***\n"
 "${browser}    chromium\n"
-"${channel}    chrome\n\n"
+"${channel}    chrome\n"
+"${headless}    True\n\n"
 "*** Test Cases ***\n"
 "Generated Test\n"
-"    New Browser    ${browser}    channel=${channel}\n"
+"    New Browser    ${browser}    channel=${channel}    headless=${headless}\n"
 "    Close Browser"

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 001a367 and e6886e7.

📒 Files selected for processing (4)

Dockerfile (5 hunks)
src/backend/crew_ai/agents.py (1 hunks)
src/backend/crew_ai/library_context/browser_context.py (6 hunks)
src/backend/crew_ai/tasks.py (4 hunks)

🧰 Additional context used

🪛 Checkov (3.2.334)

Dockerfile

[low] 1-87: Ensure that HEALTHCHECK instructions have been added to container images

(CKV_DOCKER_2)

[low] 1-87: Ensure that a user for the container has been created

(CKV_DOCKER_3)

🪛 Hadolint (2.14.0)

Dockerfile

[warning] 15-15: Pin versions in pip. Instead of pip install <package> use pip install <package>==<version> or pip install --requirement <requirements file>

(DL3013)

🔇 Additional comments (8)

Dockerfile (2)

17-58: LGTM - Combined layer with BuildKit caching is well-structured.

The combined apt-get layer with Chrome installation and cleanup is well-optimized. The BuildKit cache mounts ensure fast rebuilds while the cleanup commands (rm -rf /var/lib/apt/lists/*) ensure no stale data persists in the mounted directories during the build. Purging wget, gnupg, and curl after installation is a good practice for reducing image size.

Note: With --mount=type=cache, the cache directories are not written to the final image layer, so the rm -rf commands primarily clear any residual state from the current build step.

67-69: rfbrowser init --skip-browsers is a supported and correctly used flag.

The --skip-browsers flag is supported by rfbrowser init to skip Playwright browser downloads, making the user responsible for browser binary installation. This approach is appropriate for the Dockerfile since the command installs all NodeJS dependencies except Playwright browser binaries, allowing the use of system-installed Google Chrome instead. The implementation aligns with the documented optimization pattern for containerized environments.

src/backend/crew_ai/agents.py (1)

227-232: LGTM - Channel parameter added to code generation example.

The example in the Code Assembler agent's backstory now correctly demonstrates:

Declaring ${channel} chrome in the Variables section

Using channel=${channel} in the New Browser keyword call

This is consistent with the browser_context.py changes and will guide the LLM to generate correct Robot Framework code.

src/backend/crew_ai/tasks.py (2)

83-88: LGTM - Browser initialization instructions updated for channel support.

The example JSON for Browser Library initialization now correctly includes the channel parameter alongside browser and headless. This guides the planner agent to include channel configuration in browser initialization steps.

559-585: LGTM - Variable extraction examples are clear and comprehensive.

The updated documentation clearly shows:

Variable naming convention including ${channel}

How to extract values from step JSON (browser, channel, headless)

Corresponding variable declarations in Robot Framework format

This provides clear guidance for the code assembler agent.

src/backend/crew_ai/library_context/browser_context.py (3)

35-41: LGTM - Browser initialization parameters updated with channel.

The browser_init_params property now correctly returns the channel parameter set to 'chrome', which will be used throughout the code generation pipeline to leverage the system-installed Chrome browser.

140-158: LGTM - Code structure template comprehensively updated.

The mandatory structure example now correctly shows:

${channel} chrome variable declaration

New Browser ${browser} channel=${channel} headless=${headless} call

Proper sequence: New Browser → New Context → New Page

This provides clear guidance for the code assembler agent.

206-214: LGTM - Critical rules clearly document channel requirement.

The critical rules section effectively communicates:

The mandatory Browser → Context → Page sequence

The viewport=None requirement

The channel=chrome requirement for Docker optimization (line 210)

This provides clear, actionable guidance for code generation.

coderabbitai · 2025-12-18T18:52:25Z

Dockerfile

 # Install UV package manager (10-100x faster than pip)
 RUN pip install --no-cache-dir uv


⚠️ Potential issue | 🟡 Minor

Pin the uv package version for reproducible builds.

The uv package is installed without version pinning, which can lead to inconsistent builds if a breaking change is released. Consider pinning to a specific version. Current version 0.9.18 is recommended or a more recent stable release.

-RUN pip install --no-cache-dir uv +RUN pip install --no-cache-dir uv==0.9.18

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Install UV package manager (10-100x faster than pip)

RUN pip install --no-cache-dir uv

# Install UV package manager (10-100x faster than pip)

RUN pip install --no-cache-dir uv==0.9.18

🧰 Tools

🪛 Hadolint (2.14.0)

[warning] 15-15: Pin versions in pip. Instead of pip install <package> use pip install <package>==<version> or pip install --requirement <requirements file>

(DL3013)

🤖 Prompt for AI Agents

In Dockerfile around lines 14 to 15, the pip installation of the uv package is unpinned which can cause non-reproducible builds; update the RUN pip install --no-cache-dir uv command to pin the package to a specific, stable version (recommended uv==0.9.18) so builds are reproducible and deterministic.

coderabbitai · 2025-12-18T18:52:25Z

src/backend/crew_ai/library_context/browser_context.py

 3. **PARAMETER RULES:**
-   - Browser Library uses: browser=chromium, headless=True
+   - Browser Library uses: browser=chromium, channel=chrome, headless=True
+   - MUST usage channel=chrome to use system-installed browser
   - NOT SeleniumLibrary syntax (no 'options' parameter)
   - Valid browsers: chromium, firefox, webkit


⚠️ Potential issue | 🟡 Minor

Typo: "MUST usage" should be "MUST use".

Line 79 contains a grammatical error that should be corrected for clarity.

🔎 Suggested fix

3. **PARAMETER RULES:** - Browser Library uses: browser=chromium, channel=chrome, headless=True - - MUST usage channel=chrome to use system-installed browser + - MUST use channel=chrome to use system-installed browser - NOT SeleniumLibrary syntax (no 'options' parameter) - Valid browsers: chromium, firefox, webkit

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

3. **PARAMETER RULES:**

- Browser Library uses: browser=chromium, headless=True

- Browser Library uses: browser=chromium, channel=chrome, headless=True

- MUST usage channel=chrome to use system-installed browser

- NOT SeleniumLibrary syntax (no 'options' parameter)

- Valid browsers: chromium, firefox, webkit

3. **PARAMETER RULES:**

- Browser Library uses: browser=chromium, channel=chrome, headless=True

- MUST use channel=chrome to use system-installed browser

- NOT SeleniumLibrary syntax (no 'options' parameter)

- Valid browsers: chromium, firefox, webkit

🤖 Prompt for AI Agents

In src/backend/crew_ai/library_context/browser_context.py around lines 77 to 81, the comment line reads "MUST usage channel=chrome to use system-installed browser" which is a grammatical typo; change it to "MUST use channel=chrome to use system-installed browser" so the rule reads correctly. Only update that phrase in the comment to preserve the rest of the parameter rules text.

Devasy · 2025-12-22T07:30:08Z

@coderabbitai can we include a docker build workflow that builds docker image for the api server similar to what we are doing in run.sh, this will act as a sanity

coderabbitai · 2025-12-22T07:31:04Z

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check if run.sh exists and examine its contents
if [ -f "run.sh" ]; then
  echo "=== run.sh contents ==="
  cat run.sh
else
  echo "run.sh not found"
fi

echo -e "\n=== Checking for GitHub Actions workflows ==="
if [ -d ".github/workflows" ]; then
  fd . .github/workflows --type f
  echo -e "\n=== Listing workflow files ==="
  for file in .github/workflows/*; do
    if [ -f "$file" ]; then
      echo -e "\n--- $file ---"
      cat "$file"
    fi
  done
else
  echo "No .github/workflows directory found"
fi

echo -e "\n=== Checking Dockerfile location ==="
fd -t f "Dockerfile"

echo -e "\n=== Checking repository structure ==="
ls -la

Length of output: 2682

🏁 Script executed:

#!/bin/bash
# Check if there's a docker-compose file or any Docker-related configs
echo "=== Searching for Docker-related files ==="
fd -t f -e yml -e yaml | xargs grep -l "docker" 2>/dev/null || echo "No docker references in yml files"

echo -e "\n=== Checking for .dockerignore ==="
if [ -f ".dockerignore" ]; then
  cat .dockerignore
else
  echo "No .dockerignore found"
fi

Length of output: 135

@Devasy23, great idea! I'll create a GitHub Actions workflow that builds the Docker image as a sanity check. Based on the PR changes which optimize the Docker build with BuildKit caching, here's a comprehensive workflow:

name: Docker Build Sanity Check

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main
    paths:
      - 'Dockerfile'
      - 'src/**'
      - '.github/workflows/docker-build.yml'

env:
  IMAGE_NAME: natural-language-to-robot-framework

jobs:
  docker-build:
    runs-on: ubuntu-latest
    
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
        with:
          driver-opts: |
            image=moby/buildkit:latest

      - name: Build Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          file: ./Dockerfile
          push: false
          tags: ${{ env.IMAGE_NAME }}:test
          cache-from: type=gha
          cache-to: type=gha,mode=max
          load: true

      - name: Test Docker image starts
        run: |
          # Create a minimal .env file for testing
          mkdir -p src/backend
          echo "OPENAI_API_KEY=test_key_for_sanity_check" > src/backend/.env
          echo "APP_PORT=5000" >> src/backend/.env
          
          # Start container in background
          docker run -d \
            --name test-container \
            -p 5000:5000 \
            -v $(pwd)/src/backend/.env:/app/src/backend/.env \
            ${{ env.IMAGE_NAME }}:test
          
          # Wait for container to start (max 30 seconds)
          echo "Waiting for container to start..."
          for i in {1..30}; do
            if docker ps | grep -q test-container; then
              echo "Container started successfully"
              docker logs test-container
              exit 0
            fi
            sleep 1
          done
          
          # If we get here, container failed to start
          echo "Container failed to start"
          docker logs test-container
          exit 1

      - name: Cleanup
        if: always()
        run: |
          docker stop test-container || true
          docker rm test-container || true

      - name: Image size report
        run: |
          echo "### Docker Image Size" >> $GITHUB_STEP_SUMMARY
          docker images ${{ env.IMAGE_NAME }}:test --format "table {{.Repository}}\t{{.Tag}}\t{{.Size}}" >> $GITHUB_STEP_SUMMARY

This workflow:

Triggers on:
- Pushes to main branch
- Pull requests targeting main
- Changes to Dockerfile, src/, or the workflow itself
Build optimizations:
- Uses Docker Buildx for BuildKit support (matching your PR's optimizations)
- Implements GitHub Actions caching (cache-from/cache-to) to speed up builds
- Loads the image for testing
Sanity checks:
- Verifies the image builds successfully
- Tests that the container can start
- Reports the final image size (helpful for tracking optimization impact)
Cleanup:
- Always stops and removes test containers

To add this workflow:

Create the directory structure:

mkdir -p .github/workflows

Save the workflow as .github/workflows/docker-build.yml
Optionally, create a .dockerignore file to reduce build context size:

.git
.github
*.md
venv
__pycache__
*.pyc
*.pyo
*.pyd
.pytest_cache
.coverage
htmlcov
dist
build
*.egg-info
.env
.venv

Would you like me to create this workflow file for you?

coderabbitai bot reviewed Dec 18, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

chore: optimize Docker image and enforce system browser usage #20

chore: optimize Docker image and enforce system browser usage #20

Uh oh!

infoxica commented Dec 18, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 18, 2025 •

edited

Loading

Uh oh!

sonarqubecloud bot commented Dec 18, 2025

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 18, 2025

Uh oh!

coderabbitai bot Dec 18, 2025

Uh oh!

Devasy commented Dec 22, 2025

Uh oh!

coderabbitai bot commented Dec 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# Install UV package manager (10-100x faster than pip)
		RUN pip install --no-cache-dir uv

chore: optimize Docker image and enforce system browser usage #20

Are you sure you want to change the base?

chore: optimize Docker image and enforce system browser usage #20

Uh oh!

Conversation

infoxica commented Dec 18, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

sonarqubecloud bot commented Dec 18, 2025

Quality Gate passed

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 18, 2025

Choose a reason for hiding this comment

Uh oh!

Devasy commented Dec 22, 2025

Uh oh!

coderabbitai bot commented Dec 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

infoxica commented Dec 18, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 18, 2025 •

edited

Loading