Add worker_key parameter to Adaptive #1992
This allows adaptive clusters to intelligently close down groups of workers based on some logical association. See dask/dask-jobqueue#63 for motivation
@jhamman any luck with this? My current plan is to wait on you to try things. No rush, I just wanted to make sure that you weren't waiting on me to merge this.
@mrocklin - So far so good. I'm testing this branch along with my mods in dask/dask-jobqueue#63 and things are working much better. I'll give this PR a quick review now.
This all seems to work much better so apart from a few questions, I'm happy to see these changes.
@@ -223,7 +227,8 @@ def workers_to_close(self, **kwargs):
     @gen.coroutine
     def _retire_workers(self, workers=None):
         if workers is None:
-            workers = self.workers_to_close()
+            workers = self.workers_to_close(key=self.worker_key,
+                                            minimum=self.minimum)
What is the logic for passing in minimum here? Should you also pass in maximum?
Minimum here corresponds to the minimum number of workers that we want to keep. The workers_to_close function is a slightly nicer place to handle this than outside. For example, if workers_to_close tells us that we need to close all of our workers and we then look at minimum and see that we need one, which worker should we keep? Also, which other workers in the same group should we keep? Suddenly we have to replicate the grouping logic.
Maximum doesn't make sense because we're only reducing the number of workers, not increasing them.
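To make the point about grouping concrete, here is a minimal sketch of a selection routine that honors both a grouping key and a minimum. It is not the scheduler's actual workers_to_close; the function name `choose_workers_to_close`, the `idle` set, and the rule that a group closes only when every member is idle are all assumptions for illustration.

```python
from collections import defaultdict


def choose_workers_to_close(workers, idle, key=None, minimum=0):
    """Hypothetical helper: pick whole groups of idle workers to close.

    workers: list of worker names
    idle:    set of worker names considered idle
    key:     callable mapping a worker name to a group id (assumed semantics)
    minimum: number of workers that must remain after closing
    """
    if key is None:
        key = lambda w: w  # no grouping: each worker is its own group

    # Collect workers by group, preserving first-seen order.
    groups = defaultdict(list)
    for w in workers:
        groups[key(w)].append(w)

    remaining = len(workers)
    to_close = []
    for members in groups.values():
        # Only close a group as a unit, and only if all of it is idle.
        if not all(m in idle for m in members):
            continue
        # Skip the group if closing it would drop us below the minimum.
        if remaining - len(members) < minimum:
            continue
        to_close.extend(members)
        remaining -= len(members)
    return to_close


workers = ["job1-a", "job1-b", "job2-a", "job2-b", "job3-a"]
by_job = lambda name: name.split("-")[0]
print(choose_workers_to_close(workers, set(workers), key=by_job, minimum=1))
```

Because the minimum is checked inside the same loop that walks the groups, the caller never has to decide afterwards which worker (and which of its group mates) to spare.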
                               future)
            return future
        else:
            return sync(self.loop, func, *args, **kwargs)
Is this related to the grouped worker issue? (I'm not following)
Things were slightly nicer once we switched to using self.sync here (this is a common pattern in the client), so I decided to clean this up as part of this PR.
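For readers unfamiliar with the pattern, a rough sketch of a self.sync style dispatcher follows. It uses asyncio rather than the Tornado loop seen in the diff, and the class name `SyncSketch` and the `asynchronous` flag are assumptions for illustration, not the code in this PR.

```python
import asyncio


class SyncSketch:
    """Hypothetical class showing one method serving sync and async callers."""

    def __init__(self, asynchronous=False):
        # Assumed flag: True when the caller is already inside an event loop.
        self.asynchronous = asynchronous

    def sync(self, func, *args, **kwargs):
        if self.asynchronous:
            # Hand back an awaitable; the caller awaits it in its own loop.
            return func(*args, **kwargs)
        # Otherwise block until the coroutine finishes and return its result.
        return asyncio.run(func(*args, **kwargs))

    async def _retire_workers(self, workers=None):
        await asyncio.sleep(0)  # stand-in for real asynchronous work
        return list(workers or [])

    def retire_workers(self, workers=None):
        # Public entry point: the same call works in both modes.
        return self.sync(self._retire_workers, workers=workers)


# Blocking use from ordinary code.
print(SyncSketch().retire_workers(["w1"]))


# Awaitable use from inside a coroutine.
async def main():
    return await SyncSketch(asynchronous=True).retire_workers(["w2"])


print(asyncio.run(main()))
```

Routing every public method through one dispatcher keeps the sync-versus-async branching in a single place.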
My thought is that we can wait on merging this until we have a strong reason to do so (either dask/dask-jobqueue#63 or pressure from somewhere else). I imagine it's quite possible that we'll iterate a bit more here before I'm done with dask/dask-jobqueue#63.
@mrocklin - save for my few questions above, this seems to be working and I'd like to see it merged in the next few days. Currently, this is blocking dask/dask-jobqueue#63.
I'm fine to merge this any time. I'll give you a chance to respond to recent commentary but I'll merge after you give the go-ahead.
Thanks @mrocklin - go ahead and merge whenever now.
This allows adaptive clusters to intelligently close down groups of
workers based on some logical association.
See dask/dask-jobqueue#63 for motivation
cc @jhamman