Jenkins test times out in many_drivers_test.py and remove_driver_test.py. #1740

@robertnishihara

I've seen the following test failure more frequently since #1668. It can be reproduced by building the Docker container and running the relevant test. I think the test is sometimes just very slow rather than actually hanging, but something is still wrong, because it shouldn't be this slow. It's possible that actors are now slower to start up.

+ python /home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py --docker-image=a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5 --num-nodes=5 --num-redis-shards=2 --num-gpus=0,0,5,6,50 --num-drivers=100 --test-script=/ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py
Starting head node with command:['docker', 'run', '-d', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--head', '--block', '--redis-port=6379', '--num-redis-shards=2', '--num-cpus=10', '--num-gpus=0', '--no-ui']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=0']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=5']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=6']
Starting worker node with command:['docker', 'run', '-d', '--shm-size=1G', '--shm-size=1G', 'a8af5012e4373c3f41bba48dae9e8c257a2d5ce84dfc7ab429b87cd8466354a5', 'ray', 'start', '--block', '--redis-address=172.17.0.2:6379', '--num-cpus=10', '--num-gpus=50']
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
Starting driver with command /ray/test/jenkins_tests/multi_node_tests/many_drivers_test.py.
STDOUT:
Driver 0 started at 1521357315.87.
Driver 0 finished at 1521357317.7.

STDERR:

STDOUT:
Driver 1 started at 1521357317.58.
Driver 1 finished at 1521357354.28.

STDERR:

STDOUT:
Driver 2 started at 1521357317.93.

STDERR:
Connection to socket failed for pathname /tmp/scheduler30755824
Could not connect to socket /tmp/scheduler30755824

STDOUT:
Driver 3 started at 1521357317.93.
Driver 3 finished at 1521357346.32.

STDERR:

STDOUT:
Driver 4 started at 1521357315.83.
Driver 4 finished at 1521357353.2.

STDERR:

STDOUT:
Driver 5 started at 1521357316.54.
Driver 5 finished at 1521357346.15.

STDERR:

STDOUT:
Driver 6 started at 1521357318.17.
Driver 6 finished at 1521357355.23.

STDERR:

stop_node {'container_id': u'fdade16183bf881078218e0074d1fe0216ae2749ddf1be22da2969cf22991e69', 'is_head': True}
stop_node {'container_id': u'92f44a5a32b0ac2a6230a5edb07b543844c281402f17c232d8bd5892a2d72545', 'is_head': False}
stop_node {'container_id': u'120bc3ef17f0886eee928a523c82fe783b14e6b4aecb841dc0311a78c7a1e330', 'is_head': False}
stop_node {'container_id': u'4d91a8f9e19f8f2f314ef6c823592901db81451da71c117ffadb29d0d8322120', 'is_head': False}
stop_node {'container_id': u'08612276474af81a862fd8516321fb6c70a891f1a72012354e7eb03d25a8ee59', 'is_head': False}
Traceback (most recent call last):
  File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 354, in <module>
    driver_locations=driver_locations)
  File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 293, in run_test
    stdout_data, stderr_data = wait_for_output(p)
  File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 23, in wait_for_output
    stdout_data, stderr_data = proc.communicate()
  File "/home/jenkins/anaconda2/lib/python2.7/subprocess.py", line 800, in communicate
    return self._communicate(input)
  File "/home/jenkins/anaconda2/lib/python2.7/subprocess.py", line 1417, in _communicate
    stdout, stderr = self._communicate_with_poll(input)
  File "/home/jenkins/anaconda2/lib/python2.7/subprocess.py", line 1471, in _communicate_with_poll
    ready = poller.poll()
  File "/home/jenkins/workspace/Ray-PRB/test/jenkins_tests/multi_node_docker_test.py", line 271, in handler
    .format(timeout_seconds))
RuntimeError: This test timed out after 600 seconds.
Build step 'Execute shell' marked build as failure
Setting commit status on GitHub for https://github.com/ray-project/ray/commit/ba2e0c9577b9a5dcac2f68e93c37a414ff7244ea
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4380/
Test FAILed.
Finished: FAILURE
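
One way to check the "slow versus hanging" question against a log like the one above is to pair each driver's "started" and "finished" timestamps and see which drivers never report a finish. The following is only a minimal sketch, not part of the test harness: the function name `driver_durations` and the `sample` text are hypothetical, and it assumes the driver lines keep the exact wording shown in the log.

```python
import re

# Matches the lines each driver prints in the log above, for example
# "Driver 1 started at 1521357317.58." (the trailing period ends the sentence).
LINE_RE = re.compile(r"Driver (\d+) (started|finished) at (\d+(?:\.\d+)?)\.")


def driver_durations(log_text):
    """Return {driver_index: seconds, or None if no finish line was seen}.

    Hypothetical helper for reading the Jenkins output; not Ray code.
    """
    started, finished = {}, {}
    for match in LINE_RE.finditer(log_text):
        index = int(match.group(1))
        stamp = float(match.group(3))
        if match.group(2) == "started":
            started[index] = stamp
        else:
            finished[index] = stamp
    return {
        index: (finished[index] - start if index in finished else None)
        for index, start in started.items()
    }


# A few lines copied from the log above.
sample = """\
Driver 0 started at 1521357315.87.
Driver 0 finished at 1521357317.7.
Driver 1 started at 1521357317.58.
Driver 1 finished at 1521357354.28.
Driver 2 started at 1521357317.93.
"""

durations = driver_durations(sample)
for index in sorted(durations):
    seconds = durations[index]
    if seconds is None:
        print("Driver {} never reported a finish".format(index))
    else:
        print("Driver {} took {:.2f} s".format(index, seconds))
```

Applied to the full log above, this pairing shows driver 0 finishing in about 1.8 seconds, drivers 1 and 3 through 6 taking roughly 28 to 37 seconds each, and driver 2 never printing a finish line (its STDERR shows the failed connection to `/tmp/scheduler30755824`).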
