Tasks going OOM on Taskcluster #12874

Closed
gsnedders opened this issue Sep 6, 2018 · 10 comments
Comments

@gsnedders
Member

For example, https://tools.taskcluster.net/groups/f7409aGOTa6OJLwvyw7igw/tasks/N0HriXt3TH62lxQhwftRlg/details went OOM (in the job for #12380).

The suite ends with:

9:39.10 TEST_START: /html/infrastructure/urls/resolving-urls/query-encoding/location.sub.html?encoding=utf8
9:59.22 TEST_END: TIMEOUT, expected OK
9:59.55 WARNING u'runner_teardown': ()
9:59.57 INFO STDERR: ['/home/test/web-platform-tests/_venv/bin/chromedriver', '--port=4444', '--url-base=/']
9:59.60 WARNING Failure during init Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.60 ERROR Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.60 WARNING Failure during init Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.60 ERROR Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.60 INFO STDERR: ['/home/test/web-platform-tests/_venv/bin/chromedriver', '--port=4444', '--url-base=/']
9:59.60 WARNING Failure during init Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.60 ERROR Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.60 INFO STDERR: ['/home/test/web-platform-tests/_venv/bin/chromedriver', '--port=4444', '--url-base=/']
9:59.61 WARNING Failure during init Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.61 INFO STDERR: ['/home/test/web-platform-tests/_venv/bin/chromedriver', '--port=4444', '--url-base=/']
9:59.61 ERROR Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.61 WARNING Failure during init Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.61 ERROR Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.61 INFO STDERR: ['/home/test/web-platform-tests/_venv/bin/chromedriver', '--port=4444', '--url-base=/']
9:59.62 WARNING Failure during init Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.62 ERROR Traceback (most recent call last):
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/testrunner.py", line 195, in init
    self.browser.start(group_metadata=group_metadata, **self.browser_settings)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/browsers/chrome.py", line 83, in start
    self.server.start(block=False)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 49, in start
    self._run(block)
  File "/home/test/web-platform-tests/tools/wptrunner/wptrunner/webdriver_server.py", line 63, in _run
    self._proc.run()
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 760, in run
    self.proc = self.Process([self.cmd] + self.args, **args)
  File "/home/test/web-platform-tests/_venv/lib/python2.7/site-packages/mozprocess/processhandler.py", line 114, in __init__
    universal_newlines, startupinfo, creationflags)
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1235, in _execute_child
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

9:59.62 ERROR Max restarts exceeded
9:59.62 INFO Got 655 unexpected results
9:59.62 SUITE_END

If this happens several times on one commit, it will likely happen semi-regularly, which makes Taskcluster hard to rely on for PRs or anything else.

@gsnedders gsnedders added the infra label Sep 6, 2018
@jgraham
Contributor

jgraham commented Sep 6, 2018

I would be interested to know if we see this on tasks run on wpt-docker-worker; we can possibly adjust the instance type there if OOM is an ongoing issue.

@gsnedders
Member Author

Where does the memory limit come from?

@jgraham
Contributor

jgraham commented Sep 6, 2018

Presumably from the AWS instance type.

@gsnedders
Member Author

According to the logs, we're on a c3.xlarge, which doesn't exist in the AWS documentation… Third-party documentation suggests that it has 7.5 GB of RAM, which should be plenty?
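One way to check what a worker actually reports, rather than relying on instance-type documentation, would be to log memory figures at the start of a task. The following is only a sketch: `parse_meminfo` and `describe_memory` are hypothetical helpers, not part of wptrunner, and they assume a Linux worker exposing `/proc/meminfo`. Inside a Docker container those values generally describe the host rather than any cgroup limit, so they give an upper bound at best.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> KiB."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def describe_memory(text):
    """Summarise total and available memory in GiB (hypothetical helper)."""
    info = parse_meminfo(text)
    kib_per_gib = 1024.0 * 1024.0
    total = info.get("MemTotal", 0) / kib_per_gib
    available = info.get("MemAvailable", 0) / kib_per_gib
    return "MemTotal: %.2f GiB, MemAvailable: %.2f GiB" % (total, available)
```

On a worker this could be called as `describe_memory(open("/proc/meminfo").read())` before the browser is started, so that the figure ends up in the task log next to any later `Errno 12` failure.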

@mdittmer
Contributor

@gsnedders has this issue recurred or is it safe to close it?

@gsnedders
Member Author

@mdittmer I've seen it somewhat infrequently, but we don't really have any decent way to notice it at the moment, as it doesn't fail the jobs.
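As a rough illustration of how this could be made noticeable, a post-run step might scan the task log for the `Errno 12` message shown in the traceback above and turn it into a failing exit status. This is a sketch under assumptions: `find_oom_lines` and `exit_code_for_log` are hypothetical names, and nothing like this is claimed to exist in the Taskcluster setup.

```python
import re

# The exact message raised by os.fork() in the traceback quoted in this issue.
OOM_PATTERN = re.compile(r"OSError: \[Errno 12\] Cannot allocate memory")


def find_oom_lines(lines):
    """Return the 1-based numbers of log lines that report ENOMEM."""
    return [number for number, line in enumerate(lines, 1)
            if OOM_PATTERN.search(line)]


def exit_code_for_log(lines):
    """Return 1 if the log contains an out-of-memory failure, else 0."""
    return 1 if find_oom_lines(lines) else 0
```

A wrapper script could pass the lines of `live_backing.log` to `exit_code_for_log` and hand the result to `sys.exit`, so that an OOM run shows up as a failed task instead of a green one with hundreds of unexpected results.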

@jugglinmike
Contributor

To estimate how frequently this issue occurs, I downloaded the logs for the 10 most recent commits to master along with those from the commit that @gsnedders reported above:

$ for n in $(seq 0 9); do git rev-parse b040cb2a0726~$n; done
b040cb2a07264d35b9f47cd9deb2fb2889fca6d4
b9acbf4eaac436ee7386415e396ad6cade57bab8
68d63d516e6bc4b83934d71616b2aa82d8efe2af
f6de5f7d5d50b1ad30a1e17922e183868fd3e935
0200c63a74a540dfb870881d5f476280c367d1c1
00c1fd9ea4bc1e03f966919f1e209ab8cd81f57e
61e09bec3d21b4e9137dadafc20c04cb9be7ce3e
91491deb7dcfe4a0d0ece61f086345abad4d46a4
75a06f907589cab45d901e88b54babb182e2446f
23c54943daee3cd8520a6d86acc055f1120f96ad

$ for n in $(seq 0 9); do ./wpt tc-download --ref b040cb2a0726~$n --artifact-name live_backing.log --out-dir tc-logs/b040cb2a0726-$n; done

$ ./wpt tc-download --ref 726c83e4f5ad69a42576dfcb87532 --repo-name gsnedders/web-platform-tests --artifact-name live_backing.log --out-dir tc-logs/gsnedders

Searching for "Cannot allocate" turned up two results:

$ grep 'Cannot allocate' -rl tc-logs/
tc-logs/gsnedders/wpt-chrome-dev-testharness-5-N0HriXt3TH62lxQhwftRlg-live_backing.log
tc-logs/gsnedders/wpt-chrome-dev-testharness-9-c5swNT_qQ3-yMrLn8DmQPg-live_backing.log

The sample size is pretty small, but bear in mind that we run WPT in 15 "chunks" per browser. That means it occurred in two of @gsnedders' 15 tasks and 0 of WPT's 150. This supports @jgraham's earlier theory that this is related to the workerType and therefore does not affect the typical use cases of pull request validation and results uploading.
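The per-source tally above can also be computed directly from the downloaded logs. The sketch below (Python 3) assumes the `tc-logs/<name>/` layout produced by the commands in this comment; `oom_rate_by_dir` is a hypothetical helper, not part of the `wpt` tool.

```python
import os

NEEDLE = "Cannot allocate"  # same string as the grep above


def oom_rate_by_dir(root, suffix="live_backing.log"):
    """Map each directory directly under root to (logs with NEEDLE, total logs)."""
    rates = {}
    for name in sorted(os.listdir(root)):
        top = os.path.join(root, name)
        if not os.path.isdir(top):
            continue
        hits = total = 0
        for dirpath, _dirnames, filenames in os.walk(top):
            for filename in filenames:
                if not filename.endswith(suffix):
                    continue
                total += 1
                path = os.path.join(dirpath, filename)
                # Logs contain terminal escape bytes, so decode leniently.
                with open(path, errors="replace") as handle:
                    if any(NEEDLE in line for line in handle):
                        hits += 1
        rates[name] = (hits, total)
    return rates
```

Calling `oom_rate_by_dir("tc-logs")` returns one `(hits, total)` pair per download directory, which makes it easier to repeat the comparison between a fork and master on a larger sample.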

@gsnedders
Member Author

Annoying though, given we did want it to be possible to run on people's own forks.

@foolip foolip changed the title from "Tasks going OOM on TaskCluster" to "Tasks going OOM on Taskcluster" on Nov 23, 2018
@Hexcles
Member

Hexcles commented Feb 20, 2019

This should no longer happen, at least in this repo, after #14290.

Unfortunately, OOM is still possible on forks. We do have special (larger) memory guarantees on wpt-docker-worker (#13989 (comment)). Unless we drastically reduce the memory footprint of WPT (especially the manifest), it will continue to be possible to go OOM on the default worker, whose memory might be less than 4 GB.

@Hexcles Hexcles closed this as completed Feb 20, 2019
@gsnedders
Member Author

@Hexcles if I'm not mistaken, once #15308 lands, manifest memory consumption should be much improved, as each item doesn't keep the SourceFile object alive.
