
net: tcp: When closing the connection send FIN immediately #19747

Merged: 1 commit merged into zephyrproject-rtos:master from the issue-19678-tcp-slow-close branch on Oct 17, 2019

Conversation

@jukkar (Member) commented on Oct 10, 2019

The earlier code was queuing the FIN that is sent when the connection
is closed. This caused a long delay (200 ms) before the peer at the
other end noticed that the connection was actually closed.

Fixes #19678

Signed-off-by: Jukka Rissanen jukka.rissanen@linux.intel.com

@jukkar (Member, Author) commented on Oct 10, 2019

I will debug this a bit more, apache bench reports peer resets.

@jukkar added the DNM (Do Not Merge) label on Oct 10, 2019
@pfalcon (Contributor) commented on Oct 10, 2019

This caused a long delay (200 ms) before the peer at the
other end noticed that the connection was actually closed.

If you debugged and found that cause, kudos. In my list, that 200+ ms delay appeared after @tbursztyka's last refactor (7f2cb02 and friends) (note: not pointing fingers, just providing information/context). I should have reported it, though I can't find a reference (e.g., it's not in #13943, which is another issue that happened soon after the refactor).

But with that in mind, this:

net: tcp: When closing the connection send FIN immediately. The earlier code was queuing the FIN that is sent when the connection is closed.

is not correct. A FIN is just as much a TCP segment as any other (e.g. a data segment). Actually, all TCP segments are the same: they should be queued, they should be retransmitted if a response doesn't arrive in time, and they should be sent in the proper order. This starts with the SYN segment: it should be sent via the standard TCP means, in particular retransmitted if there is no reply. We handle it in an ad hoc way, and because of that we don't retransmit it, which is a known issue (#9944).

The same applies to FIN: it should be put in the TCP segment queue, which may contain TCP data segments that are not fully sent yet. The FIN should be sent only after those have gone out. But they may still need to be retransmitted, so there should be no hurry to send the FIN. Finally, the FIN itself might be retransmitted a few times.
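To make that ordering concrete, here is a minimal sketch in C (hypothetical types and names, not Zephyr's internals) of a send queue in which SYN, data and FIN are all ordinary entries, transmitted in order and retransmitted until acknowledged:

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical segment descriptor; the field names are illustrative only. */
struct seg {
	struct seg *next;
	bool is_fin;        /* true for the FIN segment */
	bool acked;         /* set once the peer has acknowledged it */
	unsigned int sends; /* how many times it has been (re)transmitted */
};

/* Pick the next segment to (re)transmit: the first unacknowledged one in
 * queue order.  Because the FIN is appended behind any pending data
 * segments, it is only reached once those have been acknowledged, and it
 * is retransmitted like any other segment if its own ACK never arrives.
 * (This is a stop-and-wait simplification; real TCP keeps a window of
 * in-flight segments, but the ordering argument is the same.)
 */
static struct seg *next_to_send(struct seg *queue)
{
	for (struct seg *s = queue; s != NULL; s = s->next) {
		if (!s->acked) {
			return s;
		}
	}
	return NULL; /* everything acknowledged; nothing left to send */
}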

Anyway, writing that based on the description here in the PR, let me look at the code.

@pfalcon (Contributor) left a comment

So yeah, based on the previous comment (#19747 (comment)), this is not a correct change and I have to give -1.

A FIN really should be queued, but it shouldn't linger there without a reason. Let's figure out what causes that 200 ms delay in sending it out (or whatever the actual underlying issue may be).

The earlier code was always queuing the FIN that is sent when the
connection is closed. This caused a long delay (200 ms) before the peer at
the other end noticed that the connection was actually closed.
Now, if there is nothing in the queue, the FIN is sent immediately.
If there is some data in the queue, it is flushed when a valid
ACK has been received.

Fixes zephyrproject-rtos#19678

Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com>
@jukkar force-pushed the issue-19678-tcp-slow-close branch from dd5d282 to 6c4a515 on October 10, 2019 15:33
@jukkar (Member, Author) commented on Oct 10, 2019

If you debugged and found that cause, kudos. In my list, that 200+ ms delay appeared after @tbursztyka's last refactor (7f2cb02 and friends) (note: not pointing fingers, just providing information/context). I should have reported it, though I can't find a reference (e.g., it's not in #13943, which is another issue that happened soon after the refactor).

It is difficult to know the culprit commit. The last commit that changed things around the active close was your commit 338dc8a, a long time ago.

@jukkar removed the DNM (Do Not Merge) label on Oct 10, 2019
@@ -972,6 +972,11 @@ int net_tcp_send_pkt(struct net_pkt *pkt)
 	return net_send_data(pkt);
 }

+static void flush_queue(struct net_context *context)
+{
+	(void)net_tcp_send_data(context, NULL, NULL);
A Contributor commented inline that the comment on net_tcp_send_data() says:

        /* For now, just send all queued data synchronously.  Need to
         * add window handling and retry/ACK logic.
         */

So, it still would be a workaround, not a solution, and could lead to silent data loss.

@pfalcon (Contributor) commented on Oct 11, 2019

Tested this with the standard test method I have used for years (qemu_x86 + SLIP, dumb_http_server, ab -n1000):

Percentage of the requests served within a certain time (ms)
  50%     22
  66%     25
  75%     29
  80%     33
  90%   1026
  95%   1035
  98%   1047
  99%   1051
 100%   1069 (longest request)

That IMHO only deteriorates the situation: instead of a stable ~200 ms we get high delay dispersion of unknown origin. So, I'll keep my -1 and extend my call to understand where the 200 ms delay comes from (while it shouldn't be there at all), instead of throwing in a random workaround.

(Mostly a crosspost from #19678 (comment))

@pfalcon requested a review from dleach02 on October 11, 2019 22:36
@pfalcon (Contributor) commented on Oct 11, 2019

As was found in #19678 (comment), this effectively tries to undo the effect of 45df09c. And arguably, it does so in a roundabout way, which still raises questions about how correct it is (i.e. what the corner cases and unobvious implications of doing it this way are).

I'd suggest seriously considering reinstating the previous code as another option (while of course addressing @dleach02's concern as described in the above commit message). Or, alternatively, explain in more detail why the solution here is better. Note that such an explanation would need to account for the actual test data in #19678 (comment), which shows that with this patch there is a noticeable, high-dispersion distribution of connection handling delays of unknown nature, whereas simply reverting 45df09c would lead to a nice flat delay distribution with low dispersion.

@jukkar (Member, Author) commented on Oct 14, 2019

@pfalcon thanks for the thorough investigation of the issue. After looking at the code, I think we should not revert commit 45df09c, as the original code was doing some suspicious things. For example, it did not do the 6lo technology check that net_tcp_send_data() does when calling net_tcp_send_pkt().

Another suspicious thing the current code does is in net_tcp_send_data():

zephyr/subsys/net/ip/tcp.c, lines 1017 to 1024 at bb9c453:

	ret = net_tcp_send_pkt(pkt);
	if (ret < 0 && !is_6lo_technology(pkt)) {
		NET_DBG("[%p] pkt %p not sent (%d)",
			context->tcp, pkt, ret);
		net_pkt_unref(pkt);
	}

	net_pkt_set_queued(pkt, true);

The function does not return if the packet was not sent, but continues to call net_pkt_set_queued(pkt, true), which does not look right.
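For illustration, here is a sketch of the tighter control flow being hinted at here (this is not what the merged patch does; it only makes the "should return" point concrete):

	ret = net_tcp_send_pkt(pkt);
	if (ret < 0 && !is_6lo_technology(pkt)) {
		NET_DBG("[%p] pkt %p not sent (%d)",
			context->tcp, pkt, ret);
		net_pkt_unref(pkt);
		return ret; /* bail out instead of falling through to set_queued() */
	}

	net_pkt_set_queued(pkt, true);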

@jukkar (Member, Author) commented on Oct 14, 2019

I think what this PR proposes is more or less the correct thing to do (see the sketch below):

  • If there are no other TCP segments queued to be sent, the final FIN is sent immediately
  • If there are some data segments to be sent, the FIN is queued and sent after them
  • After receiving a valid ACK, we flush the queue in order to send any pending FIN
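A minimal sketch of that decision logic follows (the helper names are hypothetical, not the actual Zephyr functions):

#include <stdbool.h>

/* Hypothetical helpers standing in for the real TCP internals. */
struct conn;
bool send_queue_is_empty(struct conn *c);
int  send_fin_now(struct conn *c);
int  queue_fin(struct conn *c);
void flush_send_queue(struct conn *c);

/* Application closes the connection. */
int close_connection(struct conn *c)
{
	if (send_queue_is_empty(c)) {
		return send_fin_now(c); /* nothing pending: FIN goes out immediately */
	}
	return queue_fin(c);            /* data pending: FIN waits behind it */
}

/* Called from the receive path when a valid ACK arrives. */
void on_valid_ack(struct conn *c)
{
	flush_send_queue(c);            /* sends remaining data and any pending FIN */
}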

@pfalcon (Contributor) left a comment

@pfalcon wrote:

50% 22
90% 1026

That IMHO only deteriorates the situation: instead of a stable ~200 ms we get high delay dispersion of unknown origin.

@jukkar wrote in #19678 (comment):

Sample: dumb_http_server
Git: master + fix #19747
Results: The requests take about 1000 ms. Because we are no longer waiting 200 ms, ab gets its results quicker and can send a new request earlier. The dumb_http_server is a single-threaded server which waits for an incoming connection, serves it and then waits for another connection. At the moment we do not have any queuing for received SYN packets, so if the application has not called accept(), the incoming SYN is dropped. This is what is happening here, and I verified it from the debug logs. The sample app is a bit too slow to accept another connection attempt, and this forces Linux to re-send a new SYN packet after a timeout of around 1 s. This does not always happen; sometimes dumb_http_server manages to accept the connection in time, and then the ab-reported time is <1 s.
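For context, the rough shape of a single-threaded server such as dumb_http_server is shown below (simplified for illustration, not the actual sample code). While one request is being served there is no accept() pending, so with no SYN queuing a SYN arriving in that window is dropped and the Linux client retransmits it roughly a second later:

#include <sys/socket.h>
#include <unistd.h>

void serve_forever(int listen_fd)
{
	for (;;) {
		/* Blocks until a connection arrives. */
		int client = accept(listen_fd, NULL, NULL);

		if (client < 0) {
			continue;
		}

		/* ...read the request and send the response here...
		 * During this time no accept() is outstanding, so a new
		 * incoming SYN has nowhere to go and gets dropped.
		 */

		close(client);
	}
}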

To verify this, I took this patch and applied a revert of 1a6f4a6 (which required a trivial fixup).

After that, testing with mps2_an385 + overlay-smsc911x.conf (again, to avoid QEMU+SLIP, which are known to have issues) leads to:

$ time ab -n1000 http://192.0.2.1:8080/
...
Percentage of the requests served within a certain time (ms)
  50%     12
  66%     13
  75%     14
  80%     14
  90%     15
  95%     16
  98%     17
  99%     18
 100%     23 (longest request)

@jukkar wrote:

I think what this PR proposes is more or less the correct thing to do:

With all the explanations provided, and my testing above which agrees with those explanations, I can only agree with the above: this is definitely an improvement over what we had before, and it doesn't itself seem to introduce more issues (but it exposes issues we acquired independently some time ago).

And, for the first time in a long while, this is a case where there was actual thorough testing (with multiple targets, etc.) and the results of that testing were documented and discussed/explained. I very much appreciate that.

@jukkar merged commit 97b6588 into zephyrproject-rtos:master on Oct 17, 2019
@jukkar deleted the issue-19678-tcp-slow-close branch on October 17, 2019 14:02
Successfully merging this pull request may close these issues.

Noticeable delay between processing multiple client connection requests (200ms+)