Socket listeners do not get close notifications #14
If one task is waiting for a socket to become ready for reading or writing and another task closes it, the former task is left waiting forever. Trio has a native mechanism to handle this gracefully, and for the other two backends we could whip up a solution fairly easily.
Curio has proven to be a bit more problematic in this regard. We use its native I/O waiting traps, but there seems to be no way to abort this operation short of cancelling the entire task.
Curio only provides an async wrapper layer on top of native sockets. As such, the solution to this problem is likely going to be the same as it would be for normal synchronous code involving threads and sockets. After a bit of searching around, the suggested solution to this issue is to use `shutdown()` rather than `close()`.
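For illustration (my own sketch, not from the thread): in plain threaded code, `shutdown()` wakes a `recv()` blocked in another thread by delivering EOF, which is the behavior being suggested here.

```python
import socket
import threading

# Illustrative only: shutdown() from one thread wakes a recv() blocked
# in another thread by delivering EOF, where close() alone would not.
a, b = socket.socketpair()

def reader():
    data = b.recv(1024)            # blocks: nothing has been sent
    print("recv returned:", data)  # b'' -> reads as a clean EOF

t = threading.Thread(target=reader)
t.start()

b.shutdown(socket.SHUT_RDWR)       # unblocks the pending recv
t.join()
a.close()
b.close()
```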
Ooh, cool. Thanks, I'll try that!
@dabeaz did you try this with only TCP sockets? I'm not having much luck with UDP sockets:
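A stand-in for the lost snippet (my reconstruction, not the original), assuming the failure is `shutdown()` raising on an unconnected UDP socket:

```python
import socket

# Illustrative guess at the failure mode: shutdown() requires a
# connected socket, and a bare UDP socket is not connected.
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp.bind(("127.0.0.1", 0))
try:
    udp.shutdown(socket.SHUT_RDWR)
except OSError as exc:
    print(exc)   # ENOTCONN: "Transport endpoint is not connected"
finally:
    udp.close()
```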
The `shutdown()` call only works on connected sockets, which UDP sockets generally are not.
Stepping back for a moment, is this issue addressing an actual issue in application code that exists in the real world or is it addressing a theoretical issue in someone's head? The reason I ask is that virtually every example of network programming I've ever seen follows a rough pattern of this:
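A sketch of the sort of loop being described (mine, the original snippet was lost): a typical Curio-style handler where a single task owns the socket and reads until EOF.

```python
# One task owns the socket and loops on it until the peer closes.
async def echo_handler(client, addr):
    while True:
        data = await client.recv(10000)
        if not data:
            break                  # peer closed: clean EOF
        await client.sendall(data)
    print("Connection closed")
```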
Having an outside task come in from afar and close the socket out from under the task using it seems like a pretty hostile act. If the goal is to get a task to stop reading a socket, the proper solution (to me) seems to be to cancel it, not to pull the rug from under it by closing the socket from somewhere else. Honestly, my inclination for Curio is to address this with a note in the documentation to the effect of "don't do that."
For now, this is a theoretical issue, but I did encounter it while writing a trivial UDP test where the listener loop launches a new task that closes the socket (and thus finishes the async generator). More than that, there is also notify_fd_close() in trio. If not for that, I would've considered this to be pure overengineering on my part. For the record, I have a working fix for this, with tests added.
I've had this in practice before in a websocket library, where a websocket connection was waiting for events off a socket but then had the socket closed under it, leaving that WS connection dead forever. What kind of API can I expose to fix that? I can't cancel the reader, since the reading task isn't under my control. I can send a close frame and hope the server closes the connection, but that's no guarantee. One option is to internally spawn a task for every single read operation, then use a queue and cancel as appropriate, but that's barely even async/await at that point; it's more callback-y. The other option is for curio to actually fix this in the core.
Who/what was responsible for closing the underlying socket in this case? Did you do that? Was it in third-party code? Was it in Curio? I'm interested in more details about the exact situation. (Note: I would think that using sock.shutdown() instead of sock.close() would probably handle this case more gracefully.)
It was me, in curio.
Here's the Trio PR where we added it, if you want to get context: python-trio/trio#460. @Fuyukai is one of three different people who got bitten by this and reported issues... it's definitely a footgun. The kernel already knows who's blocked on any given fd at any given moment, so it's easy for it to error those out if the fd is about to go away. And even if you think it's a mistake for one task to close an fd that another task is waiting on, when it does happen it's not very convenient to have the waiting task hang forever and then trigger weird errors later, after the fd number is reassigned to a new file.
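For context, a minimal sketch using today's Trio API, where notify_fd_close() has since been renamed to trio.lowlevel.notify_closing():

```python
import trio

async def main():
    left, right = trio.socket.socketpair()
    async with trio.open_nursery() as nursery:
        async def reader():
            try:
                await left.recv(1024)  # blocks: nothing is ever sent
            except trio.ClosedResourceError:
                print("reader woke up cleanly")

        nursery.start_soon(reader)
        await trio.sleep(0.1)               # let the reader block
        trio.lowlevel.notify_closing(left)  # wake anyone waiting on `left`
        left.close()
        right.close()

trio.run(main)
```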
@dabeaz On UDP: If you're in a position where a task can call `close()` on a socket that another task is blocked on, the same hazard applies: the blocked task hangs forever.
The problem with "just cancel it" is that every single wrapper library then needs to either perform weird internal task-spawning gymnastics to steamroll over curio's inability to close safely, or make the user track the tasks instead. Either way, someone has to build an entire supervision layer tracking the tasks that need to be cancelled, until a whole task-supervision library is reinvented on top of curio. And this applies every single time a socket is interacted with and potentially closed (i.e. every single time any library implements a protocol over a socket). Or, curio could add a mechanism to safely close the socket, and then libraries and users wouldn't have to dodge this footgun in every single piece of code that touches a socket that may ever be closed at some layer of the abstraction.
I'm sorry, but I'm not getting this. The whole point of having cancellation in a library like Curio is so that you don't have to build some kind of hacked-up, half-broken cancellation scheme based on closing file descriptors.

I'm also not getting where it's now suddenly a good idea to go around haphazardly closing resources that are being used by different tasks/threads. Is there now a "Lucy" design pattern based on the football handling in the Peanuts cartoon? I don't mean to be overly dense here, but closing resources and handling tasks in this way seems like a really bad idea.
There are clearly two distinct schools of thought at work here. Me? I'm just munching popcorn until consensus is reached. That said, I think we can all agree that leaving a task waiting forever on a socket that has been closed is bad. Another approach would be to detect that another task is waiting on the socket and raise an exception in the task that attempts to close it. The same thing would probably be good to do if a task tries to read or write the socket while another task is waiting on it.
Curio already raises an error when a task tries to read/write on a socket while another task is waiting to perform the same operation on that socket. Extending that mechanism to cover `close()` seems like a natural next step.
If you're writing a library that implements a protocol on top of a socket, how are you meant to cancel the thing reading from your protocol? Say, given this abc:

```python
class SomeProto(object):
    def __init__(self, socket):
        ...

    async def read_event(self) -> SomeEvent:
        ...

    async def send_event(self, evt: SomeEvent):
        ...

    async def close(self):
        ...  # does cleanup and then closes the socket
```

Somebody else's code calls `read_event()` and may be blocked inside it when `close()` is invoked; there's no task handle here for the library to cancel.
I guess a more general design question is whether or not the implementation of a protocol should be combined together with aspects of task management. For example, is closing a protocol effectively the same as cancelling a task using it?
You might not know which task is using the protocol, and you might want the reading (or writing) procedure to raise a "real" exception instead of a cancellation. For instance, a common pattern for multiplexed protocols is that the client grabs a lock, writes some data, adds itself to the response-handling queue, and then waits for the response to arrive. If there is none, it wants to get an exception. If the write locks up, a keepalive handler will close the socket and then send an error to all queue entries. I do not want to be forced to create yet another queue and a separate "writer" task just because closing the socket results in a hung-up writer. That's an annoying rewrite when converting existing threaded code. See the sketch below.
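A sketch of the multiplexed-client pattern described above, using curio primitives; the class name, `encode()` helper, and wire format are illustrative only, not from the thread:

```python
import curio

def encode(req_id: int, payload: bytes) -> bytes:
    # Hypothetical wire format, purely for illustration
    return req_id.to_bytes(4, "big") + payload

class MultiplexedClient:
    def __init__(self, sock):
        self._sock = sock
        self._lock = curio.Lock()
        self._pending = {}          # request id -> single-slot reply queue

    async def request(self, req_id: int, payload: bytes) -> bytes:
        async with self._lock:
            waiter = curio.Queue(maxsize=1)
            self._pending[req_id] = waiter
            await self._sock.sendall(encode(req_id, payload))
        reply = await waiter.get()  # parked until a reply (or error) lands
        if isinstance(reply, Exception):
            raise reply             # e.g. the keepalive closed the socket
        return reply

    async def fail_all(self, exc: Exception):
        # Called by the keepalive watchdog after it closes the socket:
        # every parked requester gets a real exception, not a hang.
        for waiter in self._pending.values():
            await waiter.put(exc)
        self._pending.clear()
```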
But would using `shutdown()` work for this case?
There are request/response protocols that don't use TCP. And we're not talking about arbitrary file descriptor integers here; we're talking about sockets, or some other Python-level data structure. You try to use the thing after it's been closed, you get an error, presumably because the act of closing sets the file descriptor to -1, sets a closed flag, or some such. It's not as if people habitually called `close()` on sockets that other threads were still using in the synchronous world, either.
I get the not-everything-is-TCP bit. However, I'm still not buying the argument that letting arbitrary tasks walk around and call `close()` on resources that other tasks are actively using is good design.

I'm not trying to be overly difficult here, but you have these nice async libraries with cancellation. But instead of using that, we're doing this? That makes no sense.
In order to leverage this particular nice async library with cancellation, you have to set your wrapper up like this:

```python
async def read_event(self) -> SomeEvent:
    task = await curio.spawn(self._real_read_event)
    self.tasks.append(task)
    return await task.join()

async def _real_read_event(self):
    # read and feed state machine, etc.
    return some_event

async def close(self):
    for task in self.tasks:
        await task.cancel()
```

Or you have to mark `close` as "doesn't actually work" and make the user track all their tasks manually instead, which sort of defeats half of the point of using this nice async library.
Well, that's actually not far from what I might have in mind. Proper task management. Maybe it's formulated slightly differently, but basically that's the gist of it.
Task cancellation. Not a bunch of code worrying about closed sockets.
Sorry for being a bit blunt here, but this is exactly the sort of housekeeping dance-around which we (meaning, people who have been using Trio) no longer want to put up with.

So instead of the cognitive load of adding ten lines of hopefully-correct but feels-like-it's-not-really-composable and definitely-not-well-testable boilerplate to every piece of code that accesses a socket, with Trio I can simply close it and be sure that all users get what they deserve, i.e. an exception stating that the socket is now closed. Instead of, say, a cancellation that can mean anything.

I don't want my library to force any task management on me; it's difficult enough to structure tasks in a Real Application even without such constraints. Instead, I want my library to do basic socket management for me. That's the reason it has proper close semantics: in Trio, coordination happens when I close the thing. I might add some exception handlers for the resulting closed-socket errors, but that's it.

Thus, we ask you to please solve this problem in curio, instead of forcing every user to solve it for themselves. Thank you.
More accurately, I would like curio to at least give me the primitives I need to implement such semantics in anyio. I'm not asking curio to change its philosophy. For the record, this is how I currently handle socket close notifications on curio:

```python
class Socket(BaseSocket):
    _reader_tasks = {}
    _writer_tasks = {}

    async def _wait_readable(self):
        task = await curio.current_task()
        self._reader_tasks[self._raw_socket] = task
        try:
            await curio.traps._read_wait(self._raw_socket)
        except curio.TaskCancelled:
            if self._raw_socket.fileno() == -1:
                raise ClosedResourceError from None
            else:
                raise
        finally:
            del self._reader_tasks[self._raw_socket]

    async def _notify_close(self) -> None:
        task = Socket._reader_tasks.get(self._raw_socket)
        if task:
            await task.cancel(blocking=False)
        task = Socket._writer_tasks.get(self._raw_socket)
        if task:
            await task.cancel(blocking=False)
```

The `_notify_close()` method is invoked from the code path that closes the socket, so any task blocked in `_wait_readable()` is cancelled and surfaces `ClosedResourceError` instead of hanging.
From the Curio FAQ: "Curio remains committed to not doing much of anything the best it can." When I wrote that, I was mainly thinking about HTTP. However, I think it also applies to bolting some kind of muddled mess of code concerning close/cancel semantics onto the side of sockets. I don't really want to do it. Sorry.

That said, it's important to emphasize that Curio is really more of a low-level library for building your own framework or your own I/O primitives. There's nothing particularly sacred about the Curio socket implementation. In fact, the whole point of Curio "not doing anything" was for it to stay out of the way of you doing things. Build a different socket class if it makes sense. Or use inheritance. I look at the code above and see something that can reasonably be built on top of what Curio already provides.

I've got some ideas on how I might make it easier to get some useful information out of the Curio kernel that can help with the above implementation, but I'm not inclined to expand Curio sockets with a bunch of new behavior beyond that.
A somewhat random aside: a tricky thread cancellation problem came up in a class I was teaching this week. Perhaps because I was thinking of this thread, in trying to solve it I suggested that maybe you could just close the associated file and hope that the thread would error out of whatever it was waiting for. Well, that definitely worked, except that the "error out" was actually a segmentation fault of the entire interpreter. So I think I still stand by my original thoughts on this issue. For what it's worth, the thread did, in fact, die.
@dabeaz haha wut. I've never managed to get closing a file to affect a waiting thread at all. What on Earth was this thread doing?