Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

container exits but output stream remains open #251

Open
tdyas opened this issue Sep 3, 2022 · 10 comments
Open

container exits but output stream remains open #251

tdyas opened this issue Sep 3, 2022 · 10 comments

Comments

@tdyas
Copy link
Contributor

tdyas commented Sep 3, 2022

I am trying to capture output from a running container using Docker::attach_container but it appears that the output stream does not close even though the container exited (and which was reported via the wait_container stream). The code at issue is in this PR pantsbuild/pants#16766.

That code monitors the output stream returned by attach_container and the stream from wait_container for the exit code with a tokio::select!. When the container exits, the code breaks out of the monitoring loop and then tries to see if any output remains on the output stream, but that stream never closes so the code hangs there.

Am I somehow not using the APIs correctly? (The machine is macOS 12.5.1 with Docker Desktop 4.11.1.)

@fussybeaver
Copy link
Owner

So, with the 0.13 version of Bollard, wait_container will only emit an Err if the Docker server responds with a non-200 status code - for example, if the container doesn't exist or some other network related error. In that version of Bollard, you could capture the status code emitted by the wait_container operation and handle it how you need to.

However, I've just merged a PR that should change that behaviour #250 - in the latest master, streams that return an error in the JSON will also return an error in Bollard.

Let me know in your testing if that works for you.

@tdyas
Copy link
Contributor Author

tdyas commented Sep 4, 2022

So, with the 0.13 version of Bollard, wait_container will only emit an Err if the Docker server responds with a non-200 status code - for example, if the container doesn't exist or some other network related error. In that version of Bollard, you could capture the status code emitted by the wait_container operation and handle it how you need to.

To clarify, the issue that I am seeing is not with the wait_container stream. That stream actually does report the container exit correctly. The issue is that the attach_container output stream does not close once the container has gone away. This means that, after the container exit is reported on the wait_container stream, the code that checks the attach_container output stream hangs because the attach_container output stream never closes.

The context is that the project in question is a build tool that needs to capture all output from running a build action in a Docker container. I observed a race condition in testing where my code would fail to capture some output. My thought was to check the attach_container output stream to make sure there was no "left-over" output remaining on that stream after the container exit was reported.

Regarding #250, I will try it out with the hope that maybe that it will change the behavior of the attach_container output stream.

@tdyas
Copy link
Contributor Author

tdyas commented Sep 4, 2022

I tried the latest master branch (at cfebd306a4). No change in behavior; the attach_container still remains open even after the container has exited (and as reported on the wait_container stream).

test.log

@algernon
Copy link

algernon commented Sep 4, 2022

I'm having similar issues - or so I think, at least. I may have a repro case too, one that isn't particularly big, either. I have to clean it up a little first, it's in a private repo at the moment, have to lift it out of there.

@fussybeaver
Copy link
Owner

Ok, thanks for explaining a bit. That does sound quite different. I would appreciate a minimal test case -- @antoinert do you have any idea why this might be happening?

@tdyas
Copy link
Contributor Author

tdyas commented Sep 6, 2022

Ok, thanks for explaining a bit. That does sound quite different. I would appreciate a minimal test case -- @antoinert do you have any idea why this might be happening?

#254 has a minimal reproduction.

@fussybeaver
Copy link
Owner

Thanks for the reduced example.

I took a quick look, and can confirm that the streams are hanging. Some playing around though confirmed there is a timing issue with the running container and the client that attaches the output. If you change the cmd to vec["sleep", "3"] or execute attach_container prior to start_container, the streams are closed properly. So, there's a workaround there.

I'll be taking a closer look at the hyper logs to see if there's something else I can see.

@tdyas
Copy link
Contributor Author

tdyas commented Sep 13, 2022

For my own use case, I ended up using executions (which return the output stream), and so this issue is not an issue for me any more. I can close unless someone else has a reason to keep it open.

In any event, it seems like the proper way to attach to a container would be to call attach_container first before calling start_container?

@fussybeaver
Copy link
Owner

Let's keep the ticket open.

I tried comparing the hyper / tokio logs between the two, but didn't see anything of notice. The server should have the responsibility of closing the stream, so it should be reproducible on the moby codebase - that's my next step regarding this ticket.

@fussybeaver
Copy link
Owner

I have done some more due diligence to this issue to try and understand what is going on, and replicated the example PR using the golang moby client SDK: https://github.com/moby/moby/compare/master...fussybeaver:moby:ND-attach-ordering-issue?expand=1

One thing struck me initially - removing the stream=true parameter means nothing is emitted. The documentation does suggest this though https://docs.docker.com/engine/api/v1.41/#tag/Container/operation/ContainerAttach

As demonstrated in this ticket (and in this closed moby ticket), the intended useage of the API is containerCreate, containerAttach, containerStart, containerWait. In moby, when containerAttach is triggered for a stopped container, the behaviour of the client SDK is to hang on the attach call. According to a similar bug ticket on the moby project, this is by design, in case you call attach before a container has started. I verified the hung client would continue by restarting a previously stopped container.

Similarly in Bollard, I assume that a stream will only close when a container is stopped. We should probably change the documentation to make this point clear when using this API. I'm open to any suggestions though on making this more ergonomic.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants