-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deprecate registering plugins by class #5699
Deprecate registering plugins by class #5699
Conversation
Can one of the admins verify this patch? |
003146d
to
9ec869d
Compare
distributed/client.py
Outdated
warnings.warn( | ||
"passing plugin classes or kwargs is deprecated; " | ||
"pass a constructed plugin instance instead", | ||
category=FutureWarning, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if I missed the discussion somewhere, is there a reason to deprecate this feature? It seems useful to me and many other plugin type packages allow passing a class as the plugin so it may be something users expect. Certainly an improved warning message or fixing the broken behavior is warranted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I have a similar question around why we're deprecating this. Removing this functionality is nice in that is decreases the complexity of possible inputs to register_worker_plugin
. That said, some users like passing in classes and the code associated with handling that case is relatively small. Additionally, all deprecations lead to some level of friction for users and, in my opinion, it doesn't seem worth it in this particular case. Raising an error if kwargs
isn't used should solve the original issue.
FWIW register_scheduler_plugin
has the same logic around accepting types. Whatever we do with register_worker_plugin
, we should make sure to do the same thing with register_scheduler_plugin
too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a similar PR here: https://github.com/dask/distributed/pull/5394/files#diff-bbcf2e505bf2f9dd0dc25de4582115ee4ed4a6e80997affc7b22122912cc6591R5470 I think it's best to copy that deprecation.
having register_worker_plugin
take a class and kwargs separately seems to imply that the class will be constructed somewhere else at another time - rather than immediately
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a similar PR here
That's different as it's for methods on the Scheduler
, which are much less user-facing than the Client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
register_scheduler_plugin
is interesting because it's different again, the plugin gets instantiated in the event loop thread:
distributed/distributed/client.py
Lines 4498 to 4500 in 9a66b71
async def _register_scheduler_plugin(self, plugin, name, **kwargs): | |
if isinstance(plugin, type): | |
plugin = plugin(**kwargs) |
whereas for the other plugin registration methods the plugin is instantiated in the current thread
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Being consistent with when the plugin instance is created would be nice. My guess is either convention is fine. I have a slight preference for creating the plugin inside the event loop thread so that users can be more explicit in await
ing the registration process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's different as it's for methods on the Scheduler, which are much less user-facing than the Client
Well, we're also raising a FutureWarning
over there so I assume we expect it to be used by someone?
I have a slight preference for creating the plugin inside the event loop thread so that users can be more explicit in awaiting the registration process.
I have a preference to do the opposite since we're talking about a loop on Client side here. I don't see a reason why anyone should interact with the client loop at all. Once on the scheduler/worker this is a different story but for the setup there we have appropriate hooks.
When Thomas and I discussed this earlier I was actually thrown off myself for quite some time because I thought this API would only instantiate the class on the remote but that's not the case.
I don't think having two ways to do the same thing is particularly helping with UX, especially not if there is a looming question of "is it executed on the loop or not?" and "is it instantiated on client side or not?"
The bottom line is that I like this change much more than the exception / argument parsing logic we would have in here if we needed to support both use cases. We'd also need much more explicit docs. It also aligns better with the scheduler
23f1c2a
to
192c668
Compare
9260f30
to
d0afa8d
Compare
d0afa8d
to
d15980c
Compare
905c5d5
to
bbd48f2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I am also +1 on having just one way to do things, even though it is a breaking API change. I don't think the current API adds any value (thing you can do with it that you can't do without it), nor even convenience.
Indeed, passing a NannyPlugin
class (instead of instance) right now will probably behave incorrectly:
distributed/distributed/client.py
Lines 4574 to 4575 in 27e467e
if nanny or nanny is None and isinstance(plugin, NannyPlugin): | |
method = self.scheduler.register_nanny_plugin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll walk back on my previous push back to the changes here. Looking into our git history it appears support for passing in a plugin class, instead of an instance, was a workaround for a cryptic error message. We can add more explicit docs / error messages if this becomes an issue in the future 👍
…er_scheduler_plugin fail if unused kwargs are passed Fixes dask#5698
bbd48f2
to
26c5f8c
Compare
Unit Test Results 18 files ± 0 18 suites ±0 8h 25m 25s ⏱️ - 1h 6m 21s For more details on these failures and errors, see this check. Results for commit 26c5f8c. ± Comparison against base commit 30ffa9c. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @graingert -- this is in
distributed.Client.register_worker_plugin
silently discards**kwargs
#5698pre-commit run --all-files