Closed
I've seen the following test failure more frequently since #1668. It can be reproduced by building the Docker container and running the relevant test. I think the test is sometimes just really slow (rather than actually hanging), but something is still wrong, because it shouldn't be this slow. It's possible that actors are slower to start up now, or something similar.
+ python /home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py --docker-image=a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5 --num-nodes=5 --num-redis-shards=2 --num-gpus=0,0,5,6,50 --num-drivers=100 --test-script=/ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py
Starting head node with command:['docker', 'run', '-d', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--head', '--block', '--redis-port=6379', '--num-redis-shards=2', '--num-cpus=10', '--num-gpus=0', '--no-ui']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=0']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=5']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=6']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=50']
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
STDOUT:
Driver 0 started at 1521357315.87.
Driver 0 finished at 1521357317.7.
STDERR:
STDOUT:
Driver 1 started at 1521357317.58.
Driver 1 finished at 1521357354.28.
STDERR:
STDOUT:
Driver 2 started at 1521357317.93.
STDERR:
Connection to socket failed for pathname /tmp/scheduler30755824
Could not connect to socket /tmp/scheduler30755824
STDOUT:
Driver 3 started at 1521357317.93.
Driver 3 finished at 1521357346.32.
STDERR:
STDOUT:
Driver 4 started at 1521357315.83.
Driver 4 finished at 1521357353.2.
STDERR:
STDOUT:
Driver 5 started at 1521357316.54.
Driver 5 finished at 1521357346.15.
STDERR:
STDOUT:
Driver 6 started at 1521357318.17.
Driver 6 finished at 1521357355.23.
STDERR:
stop_node {'container_id': u'fdade16183bf881078218e0074d1fe0216ae2749ddf1be22da2969cf22991e69', 'is_head': True}
stop_node {'container_id': u'92f44a5a32b0ac2a6230a5edb07b543844c281402f17c232d8bd5892a2d72545', 'is_head': False}
stop_node {'container_id': u'120bc3ef17f0886eee928a523c82fe783b14e6b4aecb841dc0311a78c7a1e330', 'is_head': False}
stop_node {'container_id': u'4d91a8f9e19f8f2f314ef6c823592901db81451da71c117ffadb29d0d8322120', 'is_head': False}
stop_node {'container_id': u'08612276474af81a862fd8516321fb6c70a891f1a72012354e7eb03d25a8ee59', 'is_head': False}
Traceback (most recent call last):
File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 354, in <module>
driver_locations=driver_locations)
File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 293, in run_test
stdout_data, stderr_data = wait_for_output(p)
File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 23, in wait_for_output
stdout_data, stderr_data = proc.communicate()
File "/home/jenkins/anaconda2/lib/python2.7/subprocess.py", line 800, in communicate
return self._communicate(input)
File "/home/jenkins/anaconda2/lib/python2.7/subprocess.py", line 1417, in _communicate
stdout, stderr = self._communicate_with_poll(input)
File "/home/jenkins/anaconda2/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
ready = poller.poll()
File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 271, in handler
.format(timeout_seconds))
RuntimeError: This test timed out after 600 seconds.
Build step 'Execute shell' marked build as failure
Setting commit status on GitHub for https://github.com/ray-project/ray/commit/ba2e0c9577b9a5dcac2f68e93c37a414ff7244ea
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4380/
Test FAILed.
Finished: FAILURE
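To help tell "slow" apart from "hanging", the per-driver timestamps in the output above can be turned into durations. The following is only a minimal sketch, not part of the test suite: the names `EVENT_RE`, `summarize_drivers`, and `unfinished_drivers` are hypothetical, and it assumes the driver output keeps the `Driver N started at T.` / `Driver N finished at T.` format shown in this log. The sample text is copied from the STDOUT sections above.

```python
import re

# Assumed line format, taken from the log above:
#   "Driver 1 started at 1521357317.58."
EVENT_RE = re.compile(r"Driver (\d+) (started|finished) at (\d+(?:\.\d+)?)\.?\s*$")

# STDOUT lines copied from the failing build's output.
SAMPLE_LOG = """\
Driver 0 started at 1521357315.87.
Driver 0 finished at 1521357317.7.
Driver 1 started at 1521357317.58.
Driver 1 finished at 1521357354.28.
Driver 2 started at 1521357317.93.
Driver 3 started at 1521357317.93.
Driver 3 finished at 1521357346.32.
Driver 4 started at 1521357315.83.
Driver 4 finished at 1521357353.2.
Driver 5 started at 1521357316.54.
Driver 5 finished at 1521357346.15.
Driver 6 started at 1521357318.17.
Driver 6 finished at 1521357355.23.
"""


def summarize_drivers(log_text):
    """Return {driver_index: (start, finish)}; finish is None if never logged."""
    events = {}
    for line in log_text.splitlines():
        match = EVENT_RE.search(line.strip())
        if match is None:
            continue  # Skip STDERR lines, tracebacks, and other output.
        index = int(match.group(1))
        events.setdefault(index, {})[match.group(2)] = float(match.group(3))
    return {
        index: (times.get("started"), times.get("finished"))
        for index, times in events.items()
    }


def unfinished_drivers(summary):
    """Return the sorted indices of drivers with a start but no finish line."""
    return sorted(
        index
        for index, (start, finish) in summary.items()
        if start is not None and finish is None
    )


if __name__ == "__main__":
    summary = summarize_drivers(SAMPLE_LOG)
    for index in sorted(summary):
        start, finish = summary[index]
        if finish is None:
            print("Driver {}: no finish line".format(index))
        else:
            print("Driver {}: {:.2f} s".format(index, finish - start))
```

On the seven drivers whose output appears above, this flags driver 2 as the only one without a finish line (its STDERR shows the failed connection to `/tmp/scheduler30755824`), while driver 0 finished in under 2 seconds and the others took roughly 28 to 37 seconds.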