Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coap_io.c: Fix file descriptor leaks when "connections" fail #131

Merged
merged 2 commits into from
Feb 20, 2018

Conversation

mrdeep1
Copy link
Collaborator

@mrdeep1 mrdeep1 commented Feb 9, 2018

If a TCP connection connection fails, or a an ICMP Unreachable is received
for a UDP packet, then the file descriptor is not closed off. Eventually
all fails when no more file descriptors are allowed.

@mrdeep1
Copy link
Collaborator Author

mrdeep1 commented Feb 10, 2018

Travis failure in t_wellknown4 appears to be because it cannot connect to in6addr_loopback in t_wkc_tests_create().

Same issue seen in "Pull Request #128 Support for ECDSA use via tinydtls"

/* client-side ICMP destination unreachable, ignore it */
coap_log(LOG_WARNING, "coap_network_read: unreachable\n");
return -2;
/* client-side ICMP destination unreachable, ignore it */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A UDP socket should be closed when the client session ends, because despite the error it is still valid. The -2 return value already guarantees the session state is reset.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the target server is not available, but trying to set up a DTLS session, and a -2 is returned from coap_network_read, the logic in coap_read_session() calls coap_session_reset() but never calls coap_socket_close() leaving the socket file descriptor still open.

If an application keeps on trying to call coap_new_client_session_psk() - not just the once as coap-client does, and you look at the file descriptors, you will see them increasing - as :

ls -l /proc/19691/fd

total 0
lrwx------ 1 root root 64 Feb 12 19:38 0 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 12 19:38 1 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 12 19:39 10 -> socket:[11118524] <==
lrwx------ 1 root root 64 Feb 12 19:39 11 -> socket:[11118774]
lrwx------ 1 root root 64 Feb 12 19:40 12 -> socket:[11119012]
lrwx------ 1 root root 64 Feb 12 19:41 13 -> socket:[11119208]
lrwx------ 1 root root 64 Feb 12 19:38 2 -> /dev/pts/0
lrwx------ 1 root root 64 Feb 12 19:38 3 -> socket:[11118042]
.. Additional other fds.

netstat -anp | grep client

udp 0 0 192.168.0.189:37381 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:50195 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:35091 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:40860 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:56481 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:42532 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:33191 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:54443 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:44846 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:45871 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:48725 192.168.191.2:5684 ESTABLISHED 19691/./client
udp 0 0 192.168.0.189:50531 192.168.191.2:5684 ESTABLISHED 19691/./client

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By design the number of open sockets will increase if you create multiple client sessions to the same server, but this isn’t a leak. Presumably your would release the previous session before you create a new one if your application logic calls for only one socket. Unlike server sessions, client sessions always have to be actively managed by the client, regardless of the state of the session, since only the client knows if it needs to keep the session opened or if it is done with it.
In this particular case (datagram with icmp not reachable) there is no reason to close the socket. You can send a new message right away on the same socket and it may or may not succeed, the error is not sticky.

/* client-side ICMP destination unreachable, ignore it */
coap_log(LOG_WARNING, "coap_network_read: unreachable\n");
return -2;
/* client-side ICMP destination unreachable, ignore it */
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hear you re ICMP Not reachable. The application requires one socket per target server, and as DTLS is never established, the existing socket cannot continue to be used. We have to close the existing session and start a new coap_new_client_session_psk() to start the DTLS negotiations again. I will remove the coap_socket_close() from this change and see how it can be done from within the application.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, as the return of -2 from coap_network_read() triggers a call to coap_session_reset() which not only resets any pending traffic, but also frees off the TLS session in coap_dtls_free_session(), so the session is unusable for any further TLS traffic -not sure if this is the intention.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is the intention. A new DTLS session will automatically be established when a new message is queued for sending. At least this is what it is supposed to do.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our code is now correctly freeing off the session before starting the next one and no fd leeks found so far. It was my intention to test out the auto re-establishment today, but I did not have time.

mrdeep1 and others added 2 commits February 15, 2018 09:51
If a TCP connection connection fails, or a an ICMP Unreachable is received
for a UDP packet, then the file descriptor is not closed off.  Eventually
all fails when no more file descriptors are allowed.
Reverting out this part of the change.  This should be managed and done at
the application layer.
@obgm obgm merged commit 08f7f5c into obgm:develop Feb 20, 2018
@mrdeep1 mrdeep1 deleted the fd-close-leak branch March 7, 2018 16:44
mrdeep1 added a commit to mrdeep1/libcoap that referenced this pull request Jun 5, 2019
As COAP_NACK_RST is used for
1. ICMP Port Unreachable message is received
2. CoAP reply RST message.
there needs to be a way to differentiate between these conditions.

If there is detected to be an ICMP response issue to sending a UDP
packet, report this as COAP_NACK_ICMP_ISSUE instead of COAP_NACK_RST.

If the reported issue to coap_session_disconnected() is COAP_NACK_ICMP_ISSUE,
then make sure that the packet (plain or DTLS) gets re-transmitted.

As hinted in obgm#131 discussion, retry of the packet should be attempted,
(as per https://tools.ietf.org/html/rfc7252#section-4.2 ICMP Unreachable)
this fix will cause the plain packet to get retransmitted up to the maximum
retry count if type CON.

Likewise, DTLS will get restarted up to the maximum retry count if type CON.

examples/client.c:

Include COAP_NACK_ICMP_ISSUE in the nack_handler.

include/coap2/coap_io.h:

Add in COAP_NACK_ICMP_ISSUE to coap_nack_reason_t.

src/coap_session.c:

Update coap_session_disconnected() so that plain/DTLS can restart.

src/net.c:

Fix coap_wait_ack() to exponentially back off the re-transmit times.

Update coap_read_session() to report COAP_NACK_ICMP_ISSUE instead of
COAP_NACK_RST.
mrdeep1 added a commit to mrdeep1/libcoap that referenced this pull request Jul 19, 2019
As COAP_NACK_RST is used for
1. ICMP Port Unreachable message is received
2. CoAP reply RST message.
there needs to be a way to differentiate between these conditions.

If there is detected to be an ICMP response issue to sending a UDP
packet, report this as COAP_NACK_ICMP_ISSUE instead of COAP_NACK_RST.

If the reported issue to coap_session_disconnected() is COAP_NACK_ICMP_ISSUE,
then make sure that the packet (plain or DTLS) gets re-transmitted.

As hinted in obgm#131 discussion, retry of the packet should be attempted,
(as per https://tools.ietf.org/html/rfc7252#section-4.2 ICMP Unreachable)
this fix will cause the plain packet to get retransmitted up to the maximum
retry count if type CON.

Likewise, DTLS will get restarted up to the maximum retry count if type CON.

examples/client.c:

Include COAP_NACK_ICMP_ISSUE in the nack_handler.

include/coap2/coap_io.h:

Add in COAP_NACK_ICMP_ISSUE to coap_nack_reason_t.

src/coap_session.c:

Update coap_session_disconnected() so that plain/DTLS can restart.

src/net.c:

Fix coap_wait_ack() to exponentially back off the re-transmit times.

Update coap_read_session() to report COAP_NACK_ICMP_ISSUE instead of
COAP_NACK_RST.

man/coap_handler.txt.in:

Update documentation
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants