Socket not closed when using TLS #8615
Comments
What do you mean by 'properly'? |
@indutny the problem shows up as an fd leak: after running the sample on RHEL 6 or Ubuntu 14, the system is starved of fds, and lsof shows many fds with "can't identify protocol". |
After some more investigation, I'd like to share my thoughts on the source of this issue. I wasn't able to reproduce the issue without using web sockets. While I found it surprising at first, it seems that 1e066e4 may expose an issue with how ws (the WebSocket implementation used by Socket.io) handles cleaning up sockets. ws cleans up resources used by a WebSocket connection in cleanupWebsocketResources. However, this function removes all event handlers for the underlying socket, including the handler for the The problem is that the Removing the call to removeAllListeners in ws fixes the issue, but it probably has other undesirable effects. How does 1e066e4 break this dubious behavior? Let's consider the case when the SSL shutdown is done by the client first, and the server hasn't done an SSL shutdown. The server will call Later, when we actually read EOF, another change in 1e066e4a4a88f97af865d965f104b5fe8136797f prevents At that point, and as mentioned above, we rely on the Calling @migounette Could you please let me know if you came to similar conclusions? Also, if the above explanation makes sense to you, could you please try one of the above-mentioned workarounds in your code and let us know if that fixes your issues? @indutny I would be very happy if you could give us your thoughts on this too :) Thank you! |
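To illustrate the mechanism described in this comment, here is a minimal sketch using a plain `EventEmitter` as a stand-in for a socket. The names `fakeSocket`, `internalCleanup` and `libraryHandler` are hypothetical and the `'end'` event is only an example; this is not the actual ws or Node.js TLS code. It only shows that calling `removeAllListeners()` with no argument drops every handler on the emitter, including handlers that the caller did not register.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a socket: a plain EventEmitter, not a real net/tls socket.
const fakeSocket = new EventEmitter();

// Records which handlers actually ran.
const calls = [];

// Hypothetical "internal" handler, standing in for one that core code
// would attach to perform its own cleanup.
fakeSocket.on('end', function internalCleanup() {
  calls.push('internal');
});

// Hypothetical handler attached by a user-land library.
fakeSocket.on('end', function libraryHandler() {
  calls.push('library');
});

// Blanket removal: both handlers are dropped, including the one
// the library never added.
fakeSocket.removeAllListeners();

// emit() returns false when the event has no listeners left.
const hadListeners = fakeSocket.emit('end');
const remaining = fakeSocket.listenerCount('end');

console.log({ hadListeners, remaining, calls });
```

In this sketch the internal handler never runs once the listeners are removed, which is the kind of situation where cleanup that depends on such a handler would be skipped.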
@migounette is the ws module using the legacy TLS API? Anyway, I'm going to check it all out soon. |
What do you mean by legacy TLS API?
Yann On Tue, Oct 28, 2014 at 4:58 PM, Fedor Indutny notifications@github.com
|
@migounette There was some confusion about the commit I pointed to (1e066e4). It looked like it was a change that impacted only the TLS legacy API (the first chunk is in I think we all agree now on what the root of the issue is, now the question is what do we fix and how do we fix it? It's unfortunate that ws removes all listeners, including internal ones like However, other modules do and will call Ideally, calling
I don't know if we want to investigate 1) or 2) in the near future. We could also probably fix ws in the short term, that probably wouldn't hurt, but should not be considered as a robust long-term solution. @tjfontaine @indutny Does that sound like a reasonable summary of the situation? |
I agree, any use of removeAllListeners may lead to a false positive bug Tomorrow, I will look deeply for an elegant solution and respect the legacy Thanks for the report On Tue, Oct 28, 2014 at 11:00 PM, Julien Gilli notifications@github.com
|
@misterdjules I agree that we should consider internalising some events in the future, but for now I'm absolutely sure that |
@indutny I agree based on the documentation: "Removes all listeners, or those of the specified event. It's not a good idea to remove listeners that were added elsewhere in the code, especially when it's on an emitter that you didn't create (e.g. sockets or file streams)" So a quick fix is to patch ws. @misterdjules do you want me to propose the fix to the ws project? |
Proposal for fixing "ws": "0.4.31" used by socket.io 1.0.x as dependency This fix shows no more fd leak on our test bed and after a long run. function removeAllListeners(instance) { |
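The function in the proposal above is cut off, so it is not reconstructed here. As a rough illustration of the general idea discussed in this thread (removing only the handlers a library added itself), here is a minimal sketch. The helper `createListenerTracker` and the names `socketLike` and `seen` are hypothetical and are not taken from ws; a plain `EventEmitter` stands in for the socket.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: remembers the listeners it attaches so that
// only those can be removed later, leaving other handlers in place.
function createListenerTracker(emitter) {
  const tracked = [];
  return {
    on(event, handler) {
      emitter.on(event, handler);
      tracked.push({ event, handler });
    },
    removeTracked() {
      for (const { event, handler } of tracked) {
        emitter.removeListener(event, handler);
      }
      tracked.length = 0;
    },
  };
}

// Stand-in for a socket, with one handler the library does not own.
const socketLike = new EventEmitter();
const seen = [];
socketLike.on('end', () => seen.push('internal'));

// Handlers registered through the tracker are the library's own.
const tracker = createListenerTracker(socketLike);
tracker.on('end', () => seen.push('library'));
tracker.on('data', () => seen.push('data'));

// Cleanup removes only the tracked handlers.
tracker.removeTracked();

socketLike.emit('end');
socketLike.emit('data');

console.log(seen, socketLike.listenerCount('end'));
```

The trade-off in this sketch is that the library has to route every registration through the tracker; in exchange, handlers attached by other code survive the cleanup.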
Man, I wonder if |
@migounette @tjfontaine Closing this issue as my recommendation is that we fix the original problem in ws. I also submitted a PR to ws that only removes custom event handlers. |
Currently, it seems that the socket is sometimes not properly closed when a client connects to a server over TLS, gets some data from the server, and closes the connection. The client hangs while it waits for the connection to be closed, and eventually exits after approximately 20 seconds.
In order to reproduce this issue, clone the repository that contains the code to reproduce the issue and follow the instructions in the README file.
The current code uses socket.io. Of course, socket.io is a big external dependency and I'm working on a reproduction that doesn't involve socket.io.
The issue cannot be reproduced with Node.js 0.10. Thanks to @migounette, we've been able to identify that the issue appears in node 0.11.10. After some more investigation, it seems that 1e066e4 is the change responsible for the issue.