Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graceful handling of sockets (or whatever) getting closed while in use #36

Closed
njsmith opened this issue Jan 24, 2017 · 4 comments
Closed
Labels

Comments

@njsmith
Copy link
Member

njsmith commented Jan 24, 2017

Suppose we have one task happily doing its thing:

async def task1(sock):
    await sock.sendall(b"...")

and simultaneously, another task is a jerk:

async def task2(sock):
    sock.close()

It would be nice to handle this gracefully.

How graceful can we get? There is a limit here, which is that (a) it's Python so we can't actually stop people from closing things if they insist, and (b) the OS APIs we depend on don't necessarily handle this in a helpful way. Specifically I believe that for epoll and kqueue, if a file descriptor that's they're watching gets closed, they just silently stop watching it, which in the situation above would mean task1 blocks forever or until cancelled. (Windows select -- or at least the select.select wrapper on Windows -- seems to return immediately with the given socket object marked as readable.)

As an extra complication, there are really two cases here: the one where the object gets closed just before we hand it to the IO layer, and the one where it gets closed while in possession of the IO layer.

And for sockets, one more wrinkle: when a stdlib socket.socket object is closed, then its fileno() starts returning -1. This is actually kinda convenient, because at least we can't accidentally pass in a valid fd/handle that has since been assigned to a different object.

Some things we could do:

  • In our close methods, first check with the IOManager whether the object is in use, and if so cancel those uses first. (On Windows we can't necessarily cancel immediately, but I guess that's OK b/c on Windows it looks like closing the handle will essentially trigger a cancellation already; it's the other platforms where we have to emulate this.)

  • In IOManager methods that take an object-with-fileno()-or-fd-or-handle, make sure to validate the fd/handle while still in the caller's context. I think on epoll/kqueue we're OK right now because the wait_* methods immediately register the fd, and on Windows the register_for_iocp method is similar. But for Windows select, the socket could be invalid and we won't notice until it gets selected on in the select thread. Or it could become invalid on its way to the select thread, or in between calls to select... right now I think this will just cause the select loop to blow up.

@njsmith njsmith added the polish label Jan 24, 2017
@njsmith
Copy link
Member Author

njsmith commented Feb 12, 2017

I fixed some of these issues while working on other things -- not sure if there's any more to do here or not.

Definitely still need tests added.

@njsmith
Copy link
Member Author

njsmith commented May 9, 2017

On further thought, behaving better on Unixes is probably doable; something like, having a trio.hazmat.closing_fd function that notifies the I/O loop that it should clear out anything using that fd, and then call it from SocketType.close and similar. It can't be perfect (no way to stop someone doing os.close or whatever), but then it's your fault for not calling the friendly API we provided, isn't it.

@njsmith
Copy link
Member Author

njsmith commented May 11, 2017

If/when this is implemented, we should add checks to the generic stream tests that make sure that if you call a close method on a stream that's blocked inside send_all / receive_some / send_eof, then it terminates.

@njsmith
Copy link
Member Author

njsmith commented Jan 21, 2018

Oh look, apparently there is some obscure private API in socket.socket that allows some tricks here: https://bugs.python.org/issue32038

However, I'm not sure: maybe it only allows to keep the socket object open until a wait_readable/wait_writable finishes, when what we probably want is for close to immediately interrupt such routines.

njsmith added a commit to njsmith/trio that referenced this issue Feb 26, 2018
Still todo:

- full test coverage
- updating the stream layer to match
- is InterruptedByCloseError the best name? Should it inherit OSError?
- Which layers should use which exception?

Fixes python-triogh-36, python-triogh-459
njsmith added a commit to njsmith/trio that referenced this issue Feb 26, 2018
Still todo:

- full test coverage
- updating the stream layer to match
- is InterruptedByCloseError the best name? Should it inherit OSError?
- Which layers should use which exception?

Fixes python-triogh-36, python-triogh-459
njsmith added a commit to njsmith/trio that referenced this issue Jul 8, 2018
Still todo:

- full test coverage
- updating the stream layer to match
- is InterruptedByCloseError the best name? Should it inherit OSError?
- Which layers should use which exception?

Fixes python-triogh-36, python-triogh-459
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant