Browser version mismatch between Windows development and AWS deployment #2715

Closed
1 task
wojtekKrol opened this issue Oct 15, 2024 · 2 comments
Assignees
Labels
bug Something isn't working. t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@wojtekKrol

Which package is this bug report for? If unsure which one to select, leave blank

@crawlee/playwright (PlaywrightCrawler)

Issue description

Description

I'm developing a Crawlee-based application on Windows and deploying it on AWS (Amazon Linux 2). I'm encountering an issue where the application is unable to locate the correct browser directory after deployment.

Environment

  • Development: Windows
  • Deployment: AWS Elastic Beanstalk (Amazon Linux 2, Fedora-based)
  • Crawlee version: 3.11.1
  • Playwright version: 1.44.1
  • Node.js version: >=18

Issue

When running the application on AWS, I receive the following error:

ERROR: Error processing batch: {
  "errorName": "BrowserLaunchError",
  "errorMessage": "Failed to launch browser. Please check the following:
  - Try installing the required dependencies by running `npx playwright install --with-deps` (https://playwright.dev/docs/browsers).

The original error is available in the `cause` property. Below is the error received when trying to launch a browser:
​",
  "stackTrace": "Failed to launch browser. Please check the following:
  - Try installing the required dependencies by running `npx playwright install --with-deps` (https://playwright.dev/docs/browsers).

The original error is available in the `cause` property. Below is the error received when trying to launch a browser:
​
browserType.launchPersistentContext: Executable doesn't exist at /home/webapp/.cache/ms-playwright/chromium-1117/chrome-linux/chrome

Details

  • On Windows (development environment), Playwright installs Chromium build 1117 (directory chromium-1117).
  • On Amazon Linux 2 (deployment environment), Playwright installs Chromium build 1140 (directory chromium-1140).
  • The application looks for the browser in the chromium-1117 directory, which doesn't exist on the deployment server.
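
To see which directory a given Playwright install will look for, the revision can be read from the browsers.json file that ships inside the playwright-core package. The sketch below is not part of the original report. The helper name expectedBrowserDir is hypothetical, and it assumes the file has the shape { browsers: [{ name, revision }] } and that the directory name is the browser name joined to the revision with a hyphen, as the path in the error above suggests.

```javascript
// Hypothetical helper: derive the browser directory name Playwright is
// expected to look for from the parsed contents of
// node_modules/playwright-core/browsers.json.
// Assumption: the file has the shape { browsers: [{ name, revision, ... }] }.
function expectedBrowserDir(browsersJson, browserName = 'chromium') {
    const entry = (browsersJson.browsers ?? []).find((b) => b.name === browserName);
    if (!entry) {
        throw new Error(`No "${browserName}" entry found in browsers.json`);
    }
    return `${browserName}-${entry.revision}`;
}

// Illustrative data only; the revisions mirror the two numbers from this report.
const builtAgainst = { browsers: [{ name: 'chromium', revision: '1117' }] };
const installedOnServer = { browsers: [{ name: 'chromium', revision: '1140' }] };

console.log(expectedBrowserDir(builtAgainst));      // chromium-1117
console.log(expectedBrowserDir(installedOnServer)); // chromium-1140
```

Running this against the browsers.json found in the deployed node_modules, and comparing the result with the directories under /home/webapp/.cache/ms-playwright, would show whether the installed package and the downloaded browser disagree.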

Installation Process

I've set up the following to install Chromium and its dependencies:

  1. In .ebextensions/chromium.config, I install necessary packages:
packages:
  yum:
    cups-libs: []
    dbus-glib: []
    libXrandr: []
    # ... (other packages)
    openssl-devel: []
    jq: []

commands:
  refresh-fonts:
    command: fc-cache -f -v
    
container_commands:
  enable_extra_packages:
    command: "sudo amazon-linux-extras install epel -y"
  install_chromium:
    command: "sudo yum install -y chromium" 
  2. In .platform/hooks/predeploy/chromium.sh, I run the following script:
#!/bin/bash
# ... (log setup)

sudo -iu webapp bash << 'EOF'
# ... (logging functions)

log_message "Remove package-lock & node_modules"
rm -rf node_modules package-lock.json

log_message "npm installation"
npm install   

log_message "Uninstall all playwright browsers"
npx playwright uninstall --all

log_message "Install chromium to webapp/.cache"
npx playwright install chromium

# ... (check if chromium is installed)
EOF

Attempted Solution

I tried renaming the directory on the deployment server from chromium-1140 to chromium-1117, but this did not resolve the issue.

Questions

  1. Why is my application trying to use chromium-1117? Is it because I'm deploying a prebuilt app? But I'm building in a Jenkins pipeline, so it is built on Linux as well.

Additional Information

  • Full package.json can be provided if needed.

Code sample

No response

Package version

3.11.1

Node.js version

18

Operating system

Amazon Linux 2 (Fedora underneath)

Apify platform

  • Tick me if you encountered this issue on the Apify platform

I have tested this on the next release

No response

Other context

No response

wojtekKrol added the bug label Oct 15, 2024
github-actions bot added the t-tooling label Oct 15, 2024
@barjin
Contributor

barjin commented Oct 30, 2024

Hello, and thank you for your interest in this project!

I'm not sure this is a Crawlee issue - I can imagine this is caused by the Amazon environment or Playwright itself. Crawlee really just runs Playwright with chromium.launch() etc.

If you're struggling with getting Playwright to run the right executable binary, you can pass launch options from Crawlee with:

const crawler = new PlaywrightCrawler({
    launchContext: {
        launchOptions: {
            executablePath: '/path/to/chrome',
        }
    }
});

This way, you can tell Playwright which browser binary to run. More about the executablePath launch option here.
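
A minimal sketch of passing the path more defensively, so that a wrong path fails with a clear message before the browser launch is attempted. The helper buildLaunchContext and the environment variable CHROME_EXECUTABLE_PATH are hypothetical names chosen for illustration; only the launchContext.launchOptions.executablePath shape comes from the snippet above.

```javascript
const fs = require('node:fs');

// Hypothetical helper: build the launchContext object for PlaywrightCrawler.
// CHROME_EXECUTABLE_PATH is an assumed environment variable name.
function buildLaunchContext(executablePath = process.env.CHROME_EXECUTABLE_PATH) {
    if (!executablePath) {
        // No override given: leave the lookup to Playwright's own defaults.
        return {};
    }
    if (!fs.existsSync(executablePath)) {
        throw new Error(`Browser executable not found at ${executablePath}`);
    }
    return { launchOptions: { executablePath } };
}

// The Node binary is used here only as a file that is known to exist.
const launchContext = buildLaunchContext(process.execPath);
console.log(launchContext.launchOptions.executablePath === process.execPath); // true
```

The returned object would then be passed as launchContext to new PlaywrightCrawler({ launchContext }).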

Does this solve your problem? I'll keep this issue open for a bit more, but as I mentioned earlier, I'm quite positive this has got nothing to do with Crawlee, but... stranger things have happened.

@wojtekKrol
Author

@barjin Hi, it was an issue with my pipelines: a previous developer ran "npm i --no-package-lock", which resulted in the crawlee version being updated, and therefore npx playwright install was installing a newer browser version. By the way, setting executablePath did not work either.

This issue is not related to crawlee so it can be closed.

Thanks
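
For readers hitting the same symptom, the kind of drift described above can be spotted by comparing the committed package-lock.json with the one produced on the server. The sketch below is an illustration, not something from the thread. The helper findVersionDrift is hypothetical, it assumes lockfileVersion 2 or 3 (entries under "packages" keyed by "node_modules/<name>"), and the 1.48.0 version is a made-up placeholder for "something newer".

```javascript
// Hypothetical helper: list packages whose resolved version differs between
// two parsed package-lock.json objects.
// Assumption: lockfileVersion 2 or 3, with entries under "packages".
function findVersionDrift(lockA, lockB, names) {
    const versionOf = (lock, name) =>
        lock.packages?.[`node_modules/${name}`]?.version ?? null;
    return names
        .map((name) => ({ name, before: versionOf(lockA, name), after: versionOf(lockB, name) }))
        .filter(({ before, after }) => before !== after);
}

// Illustrative data: 1.44.1 and 3.11.1 come from this report, 1.48.0 is a placeholder.
const committed = {
    packages: {
        'node_modules/playwright': { version: '1.44.1' },
        'node_modules/crawlee': { version: '3.11.1' },
    },
};
const onServer = {
    packages: {
        'node_modules/playwright': { version: '1.48.0' },
        'node_modules/crawlee': { version: '3.11.1' },
    },
};

console.log(JSON.stringify(findVersionDrift(committed, onServer, ['playwright', 'crawlee'])));
// [{"name":"playwright","before":"1.44.1","after":"1.48.0"}]
```

Installing with npm ci, which installs exactly what package-lock.json records and fails when the lockfile disagrees with package.json, avoids this kind of drift. Note that the predeploy script in the report also deletes package-lock.json before running npm install, which has a similar effect to --no-package-lock.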
