Marking a plasma manager as dead does not mark its local scheduler as dead. #569
Comments
No longer relevant.
The file `monitor-008015.err` on the head node looks like this.

The entry of `ray.global_state.client_table()` for this node is the following.

So the plasma manager has been marked as dead, but the local scheduler on the same node has not.
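
For anyone wanting to check for the same state, here is a rough sketch of how the mismatch could be detected. It assumes the client table is a dict mapping each node IP address to a list of client entries, each with a `ClientType` string and a `Deleted` flag. That layout is an assumption, not something confirmed in this issue, and the helper `find_inconsistent_nodes` and the sample data are hypothetical.

```python
def find_inconsistent_nodes(client_table):
    """Return node IPs whose plasma manager is marked dead while a
    local scheduler on the same node is still marked alive.

    Assumes (hypothetically) that client_table maps node IP -> list of
    dicts with "ClientType" and "Deleted" keys.
    """
    inconsistent = []
    for node_ip, clients in client_table.items():
        dead_manager = any(
            c.get("ClientType") == "plasma_manager" and c.get("Deleted")
            for c in clients
        )
        live_scheduler = any(
            c.get("ClientType") == "local_scheduler" and not c.get("Deleted")
            for c in clients
        )
        if dead_manager and live_scheduler:
            inconsistent.append(node_ip)
    return sorted(inconsistent)


# Made-up sample data for illustration only; not taken from a real cluster.
sample_table = {
    "10.0.0.1": [
        {"ClientType": "plasma_manager", "Deleted": False},
        {"ClientType": "local_scheduler", "Deleted": False},
    ],
    "10.0.0.2": [
        {"ClientType": "plasma_manager", "Deleted": True},
        {"ClientType": "local_scheduler", "Deleted": False},
    ],
}

print(find_inconsistent_nodes(sample_table))
```

On a live cluster, the argument would be the result of `ray.global_state.client_table()`, provided its layout matches the assumption above.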
When I run new workloads, it looks like tasks are scheduled on the node with the "dead" plasma manager. Note that when I run `ps aux | grep "plasma_manager "` on the relevant node, the manager seems to still be alive.
What is the intended behavior here? If Ray thinks that the manager is dead, shouldn't we stop assigning work to that node?