Docker: The display compositor is frequently crashing. Goodbye. #3774

Closed
franz101 opened this issue Jan 15, 2019 · 24 comments

@franz101

Hi,
I'm using a Docker container based on node:8-slim with Puppeteer 1.10.0. The container runs fine locally, but on my Debian server I get this error message:
gpu_data_manager_impl_private.cc(892)] The display compositor is frequently crashing. Goodbye.
I guess this might be Chrome-related and maybe caused by a lack of memory.

I couldn't find anyone who has experienced this error before.

@franz101
Author

Found a solution: Set chrome to stable

@nikhilo

nikhilo commented Mar 14, 2019

This seems to have broken again in latest stable. Google Chrome 73.0.3683.75

@sonnyt

sonnyt commented Mar 14, 2019

@nikhilo same here, did you find a solution?

@rafaelbnp

@nikhilo @sonnyt have been fighting this since yesterday. Any luck?

@sonnyt

sonnyt commented Mar 14, 2019

@rafael-paiva nope, just ended up downgrading to Chromium 72.0.3626.121.

@rafaelbnp

I'm using headless Firefox so my tests are not blocked. @franz101 can we re-open this issue?

@aslushnikov
Contributor

Can you guys share your Docker container so that we can see this happening locally?

@chrismllr

chrismllr commented Mar 15, 2019

This is happening for me on node:10-browsers and node:lts-browsers, pre-built docker images we're using in Circle: https://circleci.com/docs/2.0/circleci-images/#nodejs

These install Chrome @ 73.0.3683.75

Downgrading to node:10.14-browsers did the trick in the meantime (installed Chrome ~71)

@aslushnikov
Contributor

@chrismllr do I understand correctly that it crashes for the Chrome that's bundled with the docker image? Does the Chromium version we bundle with Puppeteer crash the same way?

@yhatt

yhatt commented Mar 18, 2019

Similar crashes have also occurred while running the tests for our project.
https://circleci.com/gh/marp-team/marp-cli/1193

Using Chromium@74.0.3723.0 (r637110, the version bundled with Puppeteer v1.13.0) instead of Chrome@73.0.3683.75 makes it work well.
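For anyone who wants their CI script to warn early, here is a minimal Python sketch that parses the output of `google-chrome --version` and checks the major version against the releases reported as crashing in this thread. The `REPORTED_BAD_MAJORS` set and both function names are illustrative assumptions based only on the comments above, not an official list of affected builds.

```python
import re

# Major versions that commenters in this thread reported as crashing in
# Docker/CI. This set is an assumption drawn from those reports only.
REPORTED_BAD_MAJORS = {73}


def parse_chrome_version(version_output):
    """Extract (major, minor, build, patch) from `google-chrome --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return tuple(int(part) for part in match.groups())


def is_reported_bad(version_output):
    """True if the major version is one reported as crashing in this thread."""
    return parse_chrome_version(version_output)[0] in REPORTED_BAD_MAJORS
```

A script could feed it the output of `subprocess.run(["google-chrome", "--version"], capture_output=True, text=True).stdout` and fall back to a different image when it returns `True`.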

jonathanperret added a commit to 1024pix/pix that referenced this issue Mar 18, 2019
Chrome 73 breaks headless tests on CircleCI (see
puppeteer/puppeteer#3774).

While it's being investigated, using an older image should get us
running tests again.
@chrismllr

@aslushnikov Correct, it was the chrome in the prebuilt docker image. I may have jumped in based on a problem I was having elsewhere, but the bug seemed to be identical.

benthorner pushed a commit to alphagov/publishing-e2e-tests that referenced this issue Mar 18, 2019
Chapter 1: The Long Day
=======================

Previously our E2E tests started failing with the following error.

    chrome not reachable
    12:41:37               (Session info: headless chrome=73.0.3683.75)

This issue was limited to the CI agent machines. In order to debug the
issue, we SSH'd to ci-agent-8, became the jenkins user and changed to
one of the workspace directories where the E2E tests are run. We then
ran the E2E tests manually in order to reproduce the issue, as follows.

    make clone
    make pull
    make start

    docker-compose run publishing-e2e-tests bash

Once inside the container where the tests are run, we were able to
reproduce the issue with 'bundle exec rspec'. In order to investigate
further, we then installed vim, in order to install the irb gem and
start an irb console using 'bundle exec irb -Ispec'. Then we did

    require 'spec_helper'
    driver = Capybara.drivers[Capybara.current_driver].call
    driver.visit('https://google.com')
    driver.visit('https://google.com')

Running the visit method twice yields the same error as when running the
tests. Using 'docker exec' to start another bash console and inspect the
running processes shows that Chrome itself is failing to start.

    root       252   245 11 17:26 pts/1    00:00:00 [chrome] <defunct>

Running chrome manually with the options from the spec_helper then
yields the following error, even though we hadn't changed these options.

    google-chrome-stable \
        --disable-dev-shm-usage \
        --disable-gpu \
        --disable-web-security \
        --disable-infobars \
        --disable-notifications \
        --headless \
        --no-sandbox \
        --window-size=1400,1400 \
        https://google.com

    [0318/173719.964171:FATAL:gpu_data_manager_impl_private.cc(892)] The display compositor is frequently crashing. Goodbye.

Searching online for this error indicates it's related to a new version
of Chrome, as per the following issue on the puppeteer repo.

    puppeteer/puppeteer#3774

Unfortunately it's not possible for us to downgrade Chrome, since Google
only provide the latest version in their package repo, and the E2E tests
are being run in transient containers, which have no older versions
available to downgrade to. This is the point where we lost all hope.

Chapter 2: The New Dawn
=======================

In Chapter 1 we experimented with running Chrome manually, based on
https://developers.google.com/web/updates/2017/04/headless-chrome.

    google-chrome-stable --headless --no-sandbox https://google.com

The success of this command indicated one of the options specified in
the spec_helper was causing Chrome to crash, and experimentation showed
this was '--disable-dev-shm-usage'. Removing this parameter fixed the
error, but caused Chrome to crash for a different reason.
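The experimentation described above amounts to dropping one option at a time and checking whether Chrome still crashes. A rough Python sketch of that leave-one-out search follows. `find_offending_flags` is a hypothetical helper, and `launches_ok` is a caller-supplied callable that you would implement yourself (for example by running Chrome with the given flags and checking its exit status); the stand-in below only mimics the behaviour reported in this commit message.

```python
def find_offending_flags(flags, launches_ok):
    """Return the flags whose removal, on its own, lets the browser start.

    `launches_ok` is a hypothetical callable taking a list of flags and
    returning True when Chrome starts without crashing.
    """
    if launches_ok(flags):
        return []  # the full set already works, nothing to find
    offenders = []
    for index, flag in enumerate(flags):
        remaining = flags[:index] + flags[index + 1:]
        if launches_ok(remaining):
            offenders.append(flag)
    return offenders


# Options taken from the spec_helper command line quoted in this message.
SPEC_HELPER_FLAGS = [
    "--disable-dev-shm-usage",
    "--disable-gpu",
    "--disable-web-security",
    "--disable-infobars",
    "--disable-notifications",
    "--headless",
    "--no-sandbox",
    "--window-size=1400,1400",
]


def fake_launches_ok(flags):
    """Stand-in predicate: pretends Chrome crashes only with the shm option."""
    return "--disable-dev-shm-usage" not in flags
```

This only finds a single culprit at a time; if two options each cause a crash independently, no single removal helps and the result is an empty list.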

/dev/shm is a tmpfs partition, but by default it is only 64M in size.
Previously, we had specified the '--disable-dev-shm-usage' option to use
/tmp instead, but the new release of Chrome makes this option unusable
for some reason. The obvious remedy is to increase the size of /dev/shm.

    publishing-e2e-tests:
      shm_size: 2G  # <<< added
      build: .

The combination of removing the faulty option and specifying a larger
size for /dev/shm meant we could then run the E2E tests successfully.

And they all lived happily ever after.
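To confirm that the `shm_size` setting took effect inside the container, a small Python check along these lines may help. It is a sketch only: the function names are made up for illustration, and the 64 MiB threshold is simply the default size quoted in the commit message above.

```python
import shutil

# Default /dev/shm size in a container, as stated in the commit message above.
DEFAULT_SHM_BYTES = 64 * 1024 * 1024


def shm_total_bytes(path="/dev/shm"):
    """Total size in bytes of the filesystem mounted at `path`."""
    return shutil.disk_usage(path).total


def shm_is_small(total_bytes, threshold_bytes=DEFAULT_SHM_BYTES):
    """True when the shared-memory size is at or below the threshold."""
    return total_bytes <= threshold_bytes
```

Running `shm_is_small(shm_total_bytes())` inside the test container before starting Chrome gives a quick signal that the larger `/dev/shm` is actually in place.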
fatso83 added a commit to fatso83/sinon that referenced this issue Mar 20, 2019
@franz101
Author

This seems to have broken again in latest stable. Google Chrome 73.0.3683.75

@franz101 franz101 reopened this Mar 21, 2019
@franz101
Author

`FROM node:8-slim
EXPOSE 3000 9229
#FROM node:latest

# See https://crbug.com/795759
RUN apt-get update && apt-get install -yq libgconf-2-4

# Install latest chrome dev package and fonts to support major charsets (Chinese, Japanese, Arabic, Hebrew, Thai and a few others)
# Note: this installs the necessary libs to make the bundled version of Chromium that Puppeteer
# installs, work.
RUN apt-get update && apt-get install -y wget --no-install-recommends \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install -y google-chrome-stable fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-kacst ttf-freefont \
       --no-install-recommends \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get purge --auto-remove -y curl \
    && rm -rf /src/*.deb

# It's a good idea to use dumb-init to help prevent zombie chrome processes.
ADD https://github.com/Yelp/dumb-init/releases/download/v1.2.0/dumb-init_1.2.0_amd64 /usr/local/bin/dumb-init
RUN chmod +x /usr/local/bin/dumb-init

# Uncomment to skip the chromium download when installing puppeteer. If you do,
# you'll need to launch puppeteer with:
#     browser.launch({executablePath: 'google-chrome-unstable'})
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true

RUN groupadd -r pptruser && useradd -r -g pptruser -G audio,video pptruser \
    && mkdir -p /home/pptruser/Downloads && mkdir -p /home/pptruser/app/img \
    && chown -R pptruser:pptruser /home/pptruser
RUN chown -R pptruser:pptruser /home/pptruser
RUN chmod -R 777 /home/pptruser/app/img

WORKDIR /home/pptruser/app
COPY . /home/pptruser/app

# Install puppeteer so it's available in the container.
RUN npm i puppeteer

# Add user so we don't need --no-sandbox.
RUN npm i -f
RUN npm install

# Run everything after as non-privileged user.
USER pptruser
CMD npm run docker-start
`

@aslushnikov

@sonnyt

sonnyt commented Mar 21, 2019

I switched over to google-chrome-unstable which is 74.0... it works!

@franz101
Author

> @rafael-paiva nope, just ended up downgrading to Chromium 72.0.3626.121.

How do you downgrade in the Dockerfile? It's working now using a different Docker image:
https://github.com/buildkite/docker-puppeteer

But how do you pin Chrome to a fixed version?

@skamenetskiy

Having the same issue when running in an Ubuntu container under Windows (not Docker).

google-chrome --version
Google Chrome 74.0.3729.22 dev

22 03 2019 12:09:12.354:ERROR [launcher]: ChromeHeadless crashed.
DevTools listening on ws://127.0.0.1:9222/devtools/browser/a9e949f6-bb7e-498f-a4d7-e2164c42fddc
[0322/120912.025646:FATAL:gpu_data_manager_impl_private.cc(897)] The display compositor is frequently crashing. Goodbye.
Failed to generate minidump.[0322/120912.337254:ERROR:broker_posix.cc(43)] Invalid node channel message

@mrodal

mrodal commented Mar 28, 2019

I'm having the same issue with Google Chrome 74.0.3729.40 beta on WSL. Is anyone working on this? Should we file a bug report somewhere?

@dpdudek

dpdudek commented Apr 5, 2019

In my case I resolved it by adding option --disable-features=VizDisplayCompositor.
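For reference, a hedged Python sketch of how that option might be added to an existing argument list and passed to Selenium. `with_workaround` and `make_driver` are hypothetical helper names; the merging logic assumes `--disable-features` takes a comma-separated list, so an existing value is extended rather than duplicated. Selenium is imported lazily so the helper itself runs without it installed.

```python
def with_workaround(args, feature="VizDisplayCompositor"):
    """Return a copy of `args` with `feature` listed under --disable-features."""
    prefix = "--disable-features="
    result = []
    merged = False
    for arg in args:
        if arg.startswith(prefix) and not merged:
            # Extend the existing comma-separated list instead of adding a second flag.
            features = [name for name in arg[len(prefix):].split(",") if name]
            if feature not in features:
                features.append(feature)
            result.append(prefix + ",".join(features))
            merged = True
        else:
            result.append(arg)
    if not merged:
        result.append(prefix + feature)
    return result


def make_driver(args):
    """Start Chrome through Selenium with the given arguments (needs selenium installed)."""
    from selenium import webdriver  # lazy import: only needed when actually launching

    options = webdriver.ChromeOptions()
    for arg in args:
        options.add_argument(arg)
    return webdriver.Chrome(options=options)
```

Typical use would be `make_driver(with_workaround(["--headless", "--disable-gpu", "--no-sandbox"]))`. Whether the flag helps seems to depend on the Chrome version, going by the other reports in this thread.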

@aslushnikov
Contributor

WSL-related issues are unrelated to Docker. Other than that, this seems to be fixed.

@MolloKhan

MolloKhan commented Jun 5, 2019

> In my case I resolved it by adding option --disable-features=VizDisplayCompositor.

@dpdudek Man, after a couple of hours of dealing with this problem, that option saved my day. Thanks!
I have a quick question: where can I check the list of features? I mean, how did you know that you had to disable that feature? It might be helpful in the future.

@dpdudek

dpdudek commented Jun 5, 2019

@larzuk91 When I enabled verbose logging with
driver = webdriver.Chrome(executable_path=CHROME_DRIVER_PATH, options=chrome_options, service_args=["--verbose", "--log-path=/var/log/driver.log"])
the message from the driver was "The display compositor is frequently crashing. Goodbye.". I then found that message in gpu_data_manager_impl_private.cc. Looking at the code, I found an inconsistency between disabling the GPU and the VizDisplayCompositor feature in the "if" expressions (I had disabled the GPU with the --disable-gpu parameter). Somewhere else I found how to disable VizDisplayCompositor, and it started working, so I stopped there.

The list of features is also available under chrome://flags, but VizDisplayCompositor is no longer visible there in my version of Chrome (75.0.3770.80).

https://chromium.googlesource.com/chromium/src/+/fd6ee1143ba55b99a627f158ef61bb80b898fe97/content/browser/gpu/gpu_data_manager_impl_private.cc
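If you want to pull the crash message and its source location out of a verbose log automatically, a sketch like the following could work. The regular expression is an assumption inferred from the log lines pasted in this thread (`[MMDD/HHMMSS.micros:LEVEL:file(line)] message`), and `fatal_messages` is a hypothetical helper name.

```python
import re

# Pattern inferred from the log lines quoted in this thread, e.g.
# [0322/120912.025646:FATAL:gpu_data_manager_impl_private.cc(897)] message
LOG_LINE = re.compile(
    r"\[(?P<stamp>\d{4}/\d{6}\.\d+):(?P<level>[A-Z]+):"
    r"(?P<source>[\w.]+)\((?P<line>\d+)\)\]\s*(?P<message>.*)"
)


def fatal_messages(log_text):
    """Return (source_file, line_number, message) for every FATAL log entry."""
    results = []
    for match in LOG_LINE.finditer(log_text):
        if match.group("level") == "FATAL":
            results.append(
                (
                    match.group("source"),
                    int(match.group("line")),
                    match.group("message").strip(),
                )
            )
    return results
```

The returned file name and line number point at the place in the Chromium source to start reading, which is how the workaround above was found.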

@MolloKhan

MolloKhan commented Jun 5, 2019

Ohh, it's clearer now. Thanks a lot for helping me understand what was happening. The funny part is that with that feature disabled, it only crashes 3 times, and after that it works:

DevTools listening on ws://127.0.0.1:9222/devtools/browser/02600cca-cb6a-478f-be44-43b17ca624e6
[0605/111803.924207:WARNING:gpu_process_host.cc(1205)] The GPU process has crashed 1 time(s)
[0605/111804.278884:WARNING:gpu_process_host.cc(1205)] The GPU process has crashed 2 time(s)
[0605/111804.633481:WARNING:gpu_process_host.cc(1205)] The GPU process has crashed 3 time(s)

Not sure why but I'm happy that I can run my tests locally :)

@cah-chase-spencer

> In my case I resolved it by adding option --disable-features=VizDisplayCompositor.

THANK YOU!!

@Bug-Reaper

Ran into the same error with an Electron app I've had installed forever that sporadically broke. Solved it with --no-sandbox, of all things, which seems super weird considering it's a compositor issue.
