Fix RLlib test failure caused by deadlock #4718

Merged on Apr 30, 2019 (1 commit)

guoyuhong
Contributor

What do these changes do?

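Add a lock in fetch_and_execute_function_to_run of import_thread.py.
The import thread and the main thread may both call pickle.loads, and unpickling a class or function imports the module that defines it. If both threads import the same modules at the same time, one of them can see a module that is only partially initialized and fail with an ImportError, as in the log from the failing test run below. Holding a lock around the pickle.loads call in the import thread prevents the two imports from overlapping.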

2019-04-29 06:12:41,440	ERROR worker.py:1672 -- Failed to unpickle actor class 'PolicyEvaluator' for actor ID 50e34046a1fe558cbeaf18b48516ab015b832bc6. Traceback:
Traceback (most recent call last):
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/function_manager.py", line 729, in _load_actor_class_from_gcs
    actor_class = pickle.loads(pickled_class)
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/__init__.py", line 11, in <module>
    from ray.rllib.evaluation.policy_graph import PolicyGraph
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/evaluation/__init__.py", line 2, in <module>
    from ray.rllib.evaluation.policy_evaluator import PolicyEvaluator
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/evaluation/policy_evaluator.py", line 21, in <module>
    from ray.rllib.evaluation.sampler import AsyncSampler, SyncSampler
ImportError: cannot import name 'AsyncSampler'
2019-04-29 06:12:41,440	ERROR worker.py:1672 -- Traceback (most recent call last):
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/import_thread.py", line 128, in fetch_and_execute_function_to_run
    function = pickle.loads(serialized_function)
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/evaluation/sampler.py", line 15, in <module>
    from ray.rllib.evaluation.tf_policy_graph import TFPolicyGraph
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/evaluation/tf_policy_graph.py", line 14, in <module>
    from ray.rllib.evaluation.policy_graph import PolicyGraph
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/__init__.py", line 11, in <module>
    from ray.rllib.evaluation.policy_graph import PolicyGraph
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/evaluation/__init__.py", line 2, in <module>
    from ray.rllib.evaluation.policy_evaluator import PolicyEvaluator
  File "/home/travis/.local/lib/python3.6/site-packages/ray-0.7.0.dev3-py3.6-linux-x86_64.egg/ray/rllib/evaluation/policy_evaluator.py", line 21, in <module>
    from ray.rllib.evaluation.sampler import AsyncSampler, SyncSampler
ImportError: cannot import name 'AsyncSampler'

Related issue number

https://travis-ci.com/ray-project/ray/jobs/196114212
#4499

Linter

  • I've run scripts/format.sh to lint the changes in this PR.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-Perf-Integration-PRB/632/

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/14018/

@robertnishihara robertnishihara merged commit 448a7bd into ray-project:master Apr 30, 2019
@guoyuhong guoyuhong deleted the fixRlibDeadlock branch May 5, 2019 02:45

3 participants