Graceful handling of sockets (or whatever) getting closed while in use #36

njsmith · 2017-01-24T06:04:05Z

Suppose we have one task happily doing its thing:

async def task1(sock):
    await sock.sendall(b"...")

and simultaneously, another task is a jerk:

async def task2(sock):
    sock.close()

It would be nice to handle this gracefully.

How graceful can we get? There is a limit here, which is that (a) it's Python so we can't actually stop people from closing things if they insist, and (b) the OS APIs we depend on don't necessarily handle this in a helpful way. Specifically I believe that for epoll and kqueue, if a file descriptor that they're watching gets closed, they just silently stop watching it, which in the situation above would mean task1 blocks forever or until cancelled. (Windows select -- or at least the select.select wrapper on Windows -- seems to return immediately with the given socket object marked as readable.)
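For anyone who wants to poke at the epoll behaviour described above, here is a minimal sketch using only the stdlib. The helper name `poll_after_close` is made up for illustration, and it only does anything on platforms that have `select.epoll` (i.e. Linux); elsewhere it returns None. It uses a short timeout instead of blocking forever.

```python
import select
import socket


def poll_after_close(timeout=0.1):
    """Register a socket with epoll, close it, then poll.

    Returns the list of events epoll reports, or None where
    select.epoll is unavailable (anything that is not Linux).
    """
    if not hasattr(select, "epoll"):
        return None
    a, b = socket.socketpair()
    ep = select.epoll()
    try:
        ep.register(a.fileno(), select.EPOLLIN)
        a.close()  # the "jerk" task closes the socket under the waiter
        # Nothing reports the close: the call just runs into its timeout.
        return ep.poll(timeout)
    finally:
        ep.close()
        a.close()  # closing an already-closed socket is a no-op
        b.close()
```

On Linux this should come back empty after the timeout, which is the "task1 blocks forever" situation in miniature.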

As an extra complication, there are really two cases here: the one where the object gets closed just before we hand it to the IO layer, and the one where it gets closed while in possession of the IO layer.

And for sockets, one more wrinkle: when a stdlib socket.socket object is closed, then its fileno() starts returning -1. This is actually kinda convenient, because at least we can't accidentally pass in a valid fd/handle that has since been assigned to a different object.
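A quick way to see the fileno() behaviour, as a stdlib-only sketch (the function name `fileno_before_and_after_close` is just for illustration):

```python
import socket


def fileno_before_and_after_close():
    """Return a socket's fileno() before and after close()."""
    a, b = socket.socketpair()
    try:
        before = a.fileno()  # a real descriptor/handle, so >= 0
        a.close()
        after = a.fileno()   # a closed socket.socket reports -1
        return before, after
    finally:
        a.close()
        b.close()
```

So anything that calls fileno() after the close gets an obviously invalid value rather than a number the OS may have recycled for some other object.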

Some things we could do:

- In our close methods, first check with the IOManager whether the object is in use, and if so cancel those uses first. (On Windows we can't necessarily cancel immediately, but I guess that's OK b/c on Windows it looks like closing the handle will essentially trigger a cancellation already; it's the other platforms where we have to emulate this.)
- In IOManager methods that take an object-with-fileno()-or-fd-or-handle, make sure to validate the fd/handle while still in the caller's context. I think on epoll/kqueue we're OK right now because the wait_* methods immediately register the fd, and on Windows the register_for_iocp method is similar. But for Windows select, the socket could be invalid and we won't notice until it gets selected on in the select thread. Or it could become invalid on its way to the select thread, or in between calls to select... right now I think this will just cause the select loop to blow up.
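The second idea (validate while still in the caller's context) could look roughly like the sketch below. `checked_fileno` is a hypothetical helper, not an existing trio function, and raising OSError with EBADF is an assumption about what the right error would be.

```python
import errno
import os
import socket


def checked_fileno(obj):
    """Resolve obj (an int or anything with .fileno()) to a descriptor.

    Hypothetical helper: raises in the caller's context if the
    descriptor is already invalid, before it is handed to the IO layer.
    """
    fd = obj if isinstance(obj, int) else obj.fileno()
    if fd < 0:
        # A closed socket.socket reports -1, so it is caught here.
        raise OSError(errno.EBADF, os.strerror(errno.EBADF))
    return fd
```

This only covers the "closed just before we hand it to the IO layer" case. A socket that is closed after the check but before the select thread picks it up still needs separate handling.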


njsmith · 2017-02-12T06:57:31Z

I fixed some of these issues while working on other things -- not sure if there's any more to do here or not.

Definitely still need tests added.

njsmith · 2017-05-09T00:53:47Z

On further thought, behaving better on Unixes is probably doable; something like having a trio.hazmat.closing_fd function that notifies the I/O loop that it should clear out anything using that fd, and then calling it from SocketType.close and similar. It can't be perfect (no way to stop someone doing os.close or whatever), but then it's your fault for not calling the friendly API we provided, isn't it.
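To make the shape of that idea concrete, here is a toy sketch with no real event loop behind it. `ToyIOManager`, `ToySocket` and `ClosedWhileWaiting` are all hypothetical names for illustration; only the `closing_fd` name comes from the comment above, and nothing here is trio's actual implementation.

```python
import socket


class ClosedWhileWaiting(Exception):
    """Hypothetical error delivered to a waiter whose fd was closed."""


class ToyIOManager:
    """Stand-in for the I/O loop: just tracks who is waiting on which fd."""

    def __init__(self):
        self._waiters = {}  # fd -> list of wake-up callbacks

    def wait_readable(self, fd, on_wake):
        # A real loop would also register fd with epoll/kqueue here.
        self._waiters.setdefault(fd, []).append(on_wake)

    def closing_fd(self, fd):
        """Wake everything waiting on fd with an error; return the count."""
        woken = self._waiters.pop(fd, [])
        for on_wake in woken:
            on_wake(ClosedWhileWaiting(f"fd {fd} was closed while in use"))
        return len(woken)


class ToySocket:
    """Wrapper whose close() tells the I/O manager before closing."""

    def __init__(self, sock, io_manager):
        self._sock = sock
        self._io = io_manager

    def close(self):
        fd = self._sock.fileno()
        if fd >= 0:  # -1 means it was already closed
            self._io.closing_fd(fd)
        self._sock.close()
```

The ordering is the point: the waiters are cleared out while the fd is still valid, so the I/O loop is never left holding a descriptor that has been closed or reused. Someone calling os.close directly bypasses all of this, as noted above.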

njsmith · 2017-05-11T05:02:42Z

If/when this is implemented, we should add checks to the generic stream tests that make sure that if you call a close method on a stream that's blocked inside send_all / receive_some / send_eof, then it terminates.

njsmith · 2018-01-21T10:10:00Z

Oh look, apparently there is some obscure private API in socket.socket that allows some tricks here: https://bugs.python.org/issue32038

However, I'm not sure: maybe it only allows keeping the socket object open until a wait_readable/wait_writable finishes, when what we probably want is for close to immediately interrupt such routines.

Still todo:

- full test coverage
- updating the stream layer to match
- is InterruptedByCloseError the best name? Should it inherit OSError?
- Which layers should use which exception?

Fixes python-triogh-36, python-triogh-459

njsmith added the polish label Jan 24, 2017

njsmith mentioned this issue Oct 26, 2017

Closing a stream doesn't affect readers #341

Closed

njsmith mentioned this issue Jan 14, 2018

Portable way of waiting for sockets when all you have is a socket descriptor, not a socket object #400

Closed

njsmith mentioned this issue Jan 21, 2018

Introduce a task object for per-task join, cancel, and result retrieval #410

Closed

njsmith mentioned this issue Feb 25, 2018

stream.aclose() not working properly? #459

Closed

njsmith mentioned this issue Feb 26, 2018

When a socket/fd is closed, wake up outstanding waiters #460

Merged

njsmith added a commit to njsmith/trio that referenced this issue Jul 16, 2018

When a socket/fd is closed, wake up outstanding waiters

9816bdf

Fixes python-triogh-36, python-triogh-459

njsmith added a commit to njsmith/trio that referenced this issue Jul 16, 2018

When a socket/fd is closed, wake up outstanding waiters

0c3a8f8

Fixes python-triogh-36, python-triogh-459

njsmith added a commit to njsmith/trio that referenced this issue Jul 16, 2018

When a socket/fd is closed, wake up outstanding waiters

7734dbd

Fixes python-triogh-36, python-triogh-459

oremanj closed this as completed in #460 Jul 17, 2018
