-
Notifications
You must be signed in to change notification settings - Fork 178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ensure we don't loop trying to write to a channel thats not connected (fix 100% CPU) #419
Conversation
src/waitress/channel.py
Outdated
|
||
def handle_write(self): | ||
# Precondition: there's data in the out buffer to be sent, or | ||
# there's a pending will_close request | ||
|
||
if not self.connected: | ||
if not self.connected and not self.close_when_flushed: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we should also prevent a return if total_outbufs_len > 0 as this could also be a case where you want to write to get an error and close the channel
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@digitalresistor thoughts?
@@ -67,6 +67,9 @@ def __init__(self, server, sock, addr, adj, map=None): | |||
self.outbuf_lock = threading.Condition() | |||
|
|||
wasyncore.dispatcher.__init__(self, sock, map=map) | |||
if not self.connected: | |||
# Sometimes can be closed quickly and getpeername fails. | |||
self.handle_close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@digitalresistor @d-mauer I'm still not sure on this fix. I think I read somewhere how windows can sometimes fail on getpeername?
The other fix will still prevent the looping bug by letting it write and error out. This one will close it before it wastes the app time if indeed the connection really is closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If getpeername() fails on Windows then it would get self.connected set to False anyway, this would cause the bug. So trying to keep going after getpeername() failed is not sustainable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connect = False doesn't cause the bug alone. It also needs the request to be malformed as that prevents both reading and writing and maintainance from cleaning it up. So in someways the real bug is in handle_write
if not self.connected:
return
The test i put in shows that in the most likely scenario that makes this occur it's trying to close the channel but is prevented from doing so by the above line.
The more I think about it @d-maurer is correct that this should be changed to self.closed or something that explicitly prevents a close from happening twice. Thats the safest minimal change.
Closing this since #435 was merged. Thank you! |
fixes #418
This fixes a race condition where a quick forced socket close will make getpeername to fail and connected == False. If the request then has an error close_after_flush is set and this will result in a 100% CPU loop.
This wiil ensure the channel is still closed in this case.
TODO: