@robertnishihara
Collaborator

This might fix #1773. The same issue probably occurs for other algorithms in RLlib, but I haven't tried them.

This suggests we need much more cluster testing to uncover these kinds of issues, which may be timing-related.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/Ray-PRB/4477/

@pcmoritz pcmoritz merged commit 10dabce into ray-project:master Mar 23, 2018
@pcmoritz pcmoritz deleted the rllibfix branch March 23, 2018 12:54
royf added a commit to royf/ray that referenced this pull request Apr 22, 2018
* commit 'f69cbd35d4e86f2a3c2ace875aaf8166edb69f5d': (64 commits)
  Bump version to 0.4.0. (ray-project#1745)
  Fix monitor.py bottleneck by removing excess Redis queries. (ray-project#1786)
  Convert the ObjectTable implementation to a Log (ray-project#1779)
  Acquire worker lock when importing actor. (ray-project#1783)
  Introduce a log interface for the new GCS (ray-project#1771)
  [tune] Fix linting error (ray-project#1777)
  [tune] Added pbt with keras on cifar10 dataset example (ray-project#1729)
  Add a GCS table for the xray task flatbuffer (ray-project#1775)
  [tune] Change tune resource request syntax to be less confusing (ray-project#1764)
  Remove from X import Y convention in RLlib ES. (ray-project#1774)
  Check if the provider is external before getting the config. (ray-project#1743)
  Request and cancel notifications in the new GCS API (ray-project#1758)
  Fix resource bookkeeping for blocked actor methods. (ray-project#1766)
  Fix bug when connecting another driver in local case. (ray-project#1760)
  Define string prefixes for all tables in the new GCS API (ray-project#1755)
  [rllib] Update RLlib to work with new actor scheduling behavior (ray-project#1754)
  Redirect output of all processes by default. (ray-project#1752)
  Add API for getting total cluster resources. (ray-project#1736)
  Always send actor creation tasks to the global scheduler. (ray-project#1757)
  Print error when actor takes too long to start, and refactor error me… (ray-project#1747)
  ...

# Conflicts:
#	python/ray/rllib/__init__.py
#	python/ray/rllib/dqn/dqn.py
#	python/ray/rllib/dqn/dqn_evaluator.py
#	python/ray/rllib/dqn/dqn_replay_evaluator.py
#	python/ray/rllib/optimizers/__init__.py
#	python/ray/rllib/tuned_examples/pong-dqn.yaml

Development

Successfully merging this pull request may close these issues.

[rllib] Deadlock error when running ES.
