-
-
Notifications
You must be signed in to change notification settings - Fork 719
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't connect to cluster subprocesses at shutdown #6829
Don't connect to cluster subprocesses at shutdown #6829
Conversation
@@ -586,7 +590,7 @@ def cluster( | |||
nanny=False, | |||
worker_kwargs=None, | |||
active_rpc_timeout=10, | |||
disconnect_timeout=20, | |||
shutdown_timeout=20, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This argument name was unused across the codebase, so changing it seems fine.
async def close(): | ||
logger.debug("Closing out test cluster") | ||
alive_workers = [ | ||
w["address"] | ||
for w in workers_by_pid.values() | ||
if w["proc"].is_alive() | ||
] | ||
await disconnect_all( | ||
alive_workers, | ||
timeout=disconnect_timeout, | ||
rpc_kwargs=rpc_kwargs, | ||
) | ||
if scheduler.is_alive(): | ||
await disconnect( | ||
saddr, timeout=disconnect_timeout, rpc_kwargs=rpc_kwargs | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The change is that I just deleted this entire finally
section (and therefore removed the try
block). The stack.callback(_kill_join, scheduler, shutdown_timeout)
will accomplish an equivalent thing without RPCs.
Unit Test ResultsSee test report for an extended history of previous test failures. This is useful for diagnosing flaky tests. 15 files ±0 15 suites ±0 6h 53m 55s ⏱️ + 13m 1s For more details on these failures and errors, see this check. Results for commit 993f420. ± Comparison against base commit 4af2d0a. |
Failures look unrelated:
|
proc.join() | ||
def _kill_join(proc, timeout): | ||
proc.kill() | ||
proc.join(timeout) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rely on the pytest timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also maybe kill them all at the same time then join them all
Closes #6828. See that issue for explanation.
The one thing we lose with this over the RPC is a clean
close
call—kill
is more abrupt. If we'd like to maintain the cleanclose
, I can easily do that by sending aSIGINT
, waiting, thenSIGKILL
only if necessary.pre-commit run --all-files