Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add shutdown method to CommandRunner #16930

Merged
merged 5 commits into from
Sep 21, 2022

Conversation

tdyas
Copy link
Contributor

@tdyas tdyas commented Sep 20, 2022

The Docker command runner needs to cleanly remove cached containers when it shuts down. The removal API must be called in an async context, however, so the shutdown cannot be performed from drop handler since that is a non-async context.

Add a shutdown method to CommandRunner and implement it for all relevant command runners. For the Docker command runner, this means that the shutdown logic for ContainerCache will be invoked in an async context.

[ci skip-build-wheels]

@tdyas tdyas added the category:internal CI, fixes for not-yet-released features, etc. label Sep 20, 2022
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

src/rust/engine/process_execution/src/lib.rs Outdated Show resolved Hide resolved
src/rust/engine/process_execution/src/lib.rs Outdated Show resolved Hide resolved
src/python/pants/pantsd/service/scheduler_service.py Outdated Show resolved Hide resolved
Tom Dyas and others added 5 commits September 21, 2022 13:15
[ci skip-build-wheels]
Co-authored-by: Stu Hood <stuhood@gmail.com>
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
[ci skip-rust]

[ci skip-build-wheels]
[ci skip-build-wheels]
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@tdyas tdyas merged commit 5b68eb0 into pantsbuild:main Sep 21, 2022
@tdyas tdyas deleted the command_runner_shutdown branch September 21, 2022 19:41
@stuhood
Copy link
Member

stuhood commented Sep 21, 2022

@tdyas : For posterity, here is an example of the hang that was located via #16927: https://gist.github.com/stuhood/d05d4fcc2b2fd7e464fa397c362eb894

Looking at it again, the thing that jumps out at me is that the thing that is being Droped is actually a PySession... and then what I think is happening is that it is triggering Drop for the Session -> Core -> CommandRunner, etc.

So... it's possible that this is because Drop for our Python wrapper types doesn't release the GIL...? Or perhaps you see something more specific in the stack, where we're hanging trying to tear something down?

@stuhood
Copy link
Member

stuhood commented Sep 21, 2022

Or perhaps you see something more specific in the stack, where we're hanging trying to tear something down?

Oh. Is it possible that we're failing to connect to docker? It looks like we do that regardless of whether we have any container ids to cancel.

That could probably also skip tearing anything down if the DockerOnceCell had never been successfully initialized...

@tdyas
Copy link
Contributor Author

tdyas commented Sep 21, 2022

Oh. Is it possible that we're failing to connect to docker? It looks like we do that regardless of whether we have any container ids to cancel.

That could probably also skip tearing anything down if the DockerOnceCell had never been successfully initialized...

I'll put that fix in now as a separate PR since this one already landed.

@tdyas
Copy link
Contributor Author

tdyas commented Sep 21, 2022

#16951

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:internal CI, fixes for not-yet-released features, etc.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants