-
Notifications
You must be signed in to change notification settings - Fork 422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
coap_io.c: Fix file descriptor leaks when "connections" fail #131
Conversation
Travis failure in t_wellknown4 appears to be because it cannot connect to in6addr_loopback in t_wkc_tests_create(). Same issue seen in "Pull Request #128 Support for ECDSA use via tinydtls" |
/* client-side ICMP destination unreachable, ignore it */ | ||
coap_log(LOG_WARNING, "coap_network_read: unreachable\n"); | ||
return -2; | ||
/* client-side ICMP destination unreachable, ignore it */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A UDP socket should be closed when the client session ends, because despite the error it is still valid. The -2 return value already guarantees the session state is reset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the target server is not available, but trying to set up a DTLS session, and a -2 is returned from coap_network_read, the logic in coap_read_session() calls coap_session_reset() but never calls coap_socket_close() leaving the socket file descriptor still open.
If an application keeps on trying to call coap_new_client_session_psk() - not just the once as coap-client does, and you look at the file descriptors, you will see them increasing - as :
ls -l /proc/19691/fd
total 0
lrwx------ 1 root root 64 Feb 12 19:38 0 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 12 19:38 1 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 12 19:39 10 -> socket:[11118524] <==
lrwx------ 1 root root 64 Feb 12 19:39 11 -> socket:[11118774]
lrwx------ 1 root root 64 Feb 12 19:40 12 -> socket:[11119012]
lrwx------ 1 root root 64 Feb 12 19:41 13 -> socket:[11119208]
lrwx------ 1 root root 64 Feb 12 19:38 2 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 12 19:38 3 -> socket:[11118042]
.. Additional other fds.
netstat -anp | grep client
udp 0 0 192.168.0.189:37381 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:50195 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:35091 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:40860 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:56481 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:42532 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:33191 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:54443 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:44846 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:45871 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:48725 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:50531 192.168.191.2:5684 ESTABLISHED 19691/./client
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By design the number of open sockets will increase if you create multiple client sessions to the same server, but this isn’t a leak. Presumably your would release the previous session before you create a new one if your application logic calls for only one socket. Unlike server sessions, client sessions always have to be actively managed by the client, regardless of the state of the session, since only the client knows if it needs to keep the session opened or if it is done with it.
In this particular case (datagram with icmp not reachable) there is no reason to close the socket. You can send a new message right away on the same socket and it may or may not succeed, the error is not sticky.
/* client-side ICMP destination unreachable, ignore it */ | ||
coap_log(LOG_WARNING, "coap_network_read: unreachable\n"); | ||
return -2; | ||
/* client-side ICMP destination unreachable, ignore it */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hear you re ICMP Not reachable. The application requires one socket per target server, and as DTLS is never established, the existing socket cannot continue to be used. We have to close the existing session and start a new coap_new_client_session_psk() to start the DTLS negotiations again. I will remove the coap_socket_close() from this change and see how it can be done from within the application.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, as the return of -2 from coap_network_read() triggers a call to coap_session_reset() which not only resets any pending traffic, but also frees off the TLS session in coap_dtls_free_session(), so the session is unusable for any further TLS traffic -not sure if this is the intention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it is the intention. A new DTLS session will automatically be established when a new message is queued for sending. At least this is what it is supposed to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Our code is now correctly freeing off the session before starting the next one and no fd leeks found so far. It was my intention to test out the auto re-establishment today, but I did not have time.
If a TCP connection connection fails, or a an ICMP Unreachable is received for a UDP packet, then the file descriptor is not closed off. Eventually all fails when no more file descriptors are allowed.
Reverting out this part of the change. This should be managed and done at the application layer.
6c9afdc
to
eb84476
Compare
As COAP_NACK_RST is used for 1. ICMP Port Unreachable message is received 2. CoAP reply RST message. there needs to be a way to differentiate between these conditions. If there is detected to be an ICMP response issue to sending a UDP packet, report this as COAP_NACK_ICMP_ISSUE instead of COAP_NACK_RST. If the reported issue to coap_session_disconnected() is COAP_NACK_ICMP_ISSUE, then make sure that the packet (plain or DTLS) gets re-transmitted. As hinted in obgm#131 discussion, retry of the packet should be attempted, (as per https://tools.ietf.org/html/rfc7252#section-4.2 ICMP Unreachable) this fix will cause the plain packet to get retransmitted up to the maximum retry count if type CON. Likewise, DTLS will get restarted up to the maximum retry count if type CON. examples/client.c: Include COAP_NACK_ICMP_ISSUE in the nack_handler. include/coap2/coap_io.h: Add in COAP_NACK_ICMP_ISSUE to coap_nack_reason_t. src/coap_session.c: Update coap_session_disconnected() so that plain/DTLS can restart. src/net.c: Fix coap_wait_ack() to exponentially back off the re-transmit times. Update coap_read_session() to report COAP_NACK_ICMP_ISSUE instead of COAP_NACK_RST.
As COAP_NACK_RST is used for 1. ICMP Port Unreachable message is received 2. CoAP reply RST message. there needs to be a way to differentiate between these conditions. If there is detected to be an ICMP response issue to sending a UDP packet, report this as COAP_NACK_ICMP_ISSUE instead of COAP_NACK_RST. If the reported issue to coap_session_disconnected() is COAP_NACK_ICMP_ISSUE, then make sure that the packet (plain or DTLS) gets re-transmitted. As hinted in obgm#131 discussion, retry of the packet should be attempted, (as per https://tools.ietf.org/html/rfc7252#section-4.2 ICMP Unreachable) this fix will cause the plain packet to get retransmitted up to the maximum retry count if type CON. Likewise, DTLS will get restarted up to the maximum retry count if type CON. examples/client.c: Include COAP_NACK_ICMP_ISSUE in the nack_handler. include/coap2/coap_io.h: Add in COAP_NACK_ICMP_ISSUE to coap_nack_reason_t. src/coap_session.c: Update coap_session_disconnected() so that plain/DTLS can restart. src/net.c: Fix coap_wait_ack() to exponentially back off the re-transmit times. Update coap_read_session() to report COAP_NACK_ICMP_ISSUE instead of COAP_NACK_RST. man/coap_handler.txt.in: Update documentation
If a TCP connection connection fails, or a an ICMP Unreachable is received
for a UDP packet, then the file descriptor is not closed off. Eventually
all fails when no more file descriptors are allowed.