TCP reverse mode locks up most times #111

Closed
bmah888 opened this issue Feb 28, 2014 · 6 comments

@bmah888
Contributor

bmah888 commented Feb 28, 2014

From intra2net on November 19, 2013 07:52:43

What steps will reproduce the problem?
1. recompile the 3.0 code as published on the website
2. on the server run: iperf3 --server
3. on the client run: iperf3 -c -V -R

What is the expected output? What do you see instead?
Expected: 10 seconds of transfer, followed by detailed statistics

Most times (about 90%) I get:

iperf version 3.0-RC5 (07 November 2013)
Linux intratest132.net.lan 3.4.51-1.i2n.i686.PAE #1 SMP Fri Jun 28 13:49:25 UTC 2013 i686 i686 i386 GNU/Linux
Time: Tue, 19 Nov 2013 23:31:55 GMT
Connecting to host 172.16.1.133, port 5201
Reverse mode, remote host 172.16.1.133 is sending
Cookie: intratest132.net.lan.1384903915.1454
TCP MSS: 1448 (default)
[ 4] local 172.16.1.132 port 35112 connected to 172.16.1.133 port 5201
Starting Test: protocol: TCP, 1 streams, 131072 byte blocks, omitting 0 seconds, 10 second test
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 110 MBytes 924 Mbits/sec
[ 4] 1.00-2.00 sec 110 MBytes 926 Mbits/sec
[ 4] 2.00-3.00 sec 111 MBytes 934 Mbits/sec
[ 4] 3.00-4.00 sec 111 MBytes 931 Mbits/sec
[ 4] 4.00-5.00 sec 110 MBytes 926 Mbits/sec
[ 4] 5.00-6.00 sec 110 MBytes 927 Mbits/sec
[ 4] 6.00-7.00 sec 110 MBytes 921 Mbits/sec
[ 4] 7.00-8.00 sec 110 MBytes 928 Mbits/sec
[ 4] 8.00-9.00 sec 110 MBytes 924 Mbits/sec
(now iperf is not continuing, I have to abort with ctrl+c)

What version of the product are you using? On what operating system?
recompiled from the published 3.0.tar.gz
32bit linux without ipv6 on both machines.

Please provide any additional information below.
I'll attach a strace of the client.

Attachment: iperf-client-strace.txt.gz

Original issue: http://code.google.com/p/iperf/issues/detail?id=111

@bmah888
Contributor Author

bmah888 commented Feb 28, 2014

From bltierney@es.net on November 26, 2013 08:07:31

I was not able to reproduce this. Does anyone else see this behavior?

@bmah888
Contributor Author

bmah888 commented Feb 28, 2014

From susant.sahani on December 04, 2013 06:13:31

Reproduced it quite frequently on RHEL 6.x.

Try running the steps 5 or 6 times and it reproduces.

Server strace:

write(5, "\275E\207\265\326\36P\202)An\225x\243\356W\304f\362\370\336\342\206~\365\260\3\210\366}\325\263"..., 131072) = 131072
write(5, "\275E\207\265\326\36P\202)An\225x\243\356W\304f\362\370\336\342\206~\365\260\3\210\366}\325\263"..., 131072) = 131072
write(5, "\275E\207\265\326\36P\202)An\225x\243\356W\304f\362\370\336\342\206~\365\260\3\210\366}\325\263"..., 131072) = 131072
write(5, "\275E\207\265\326\36P\202)An\225x\243\356W\304f\362\370\336\342\206~\365\260\3\210\366}\325\263"..., 131072) = 131072
write(5, "\275E\207\265\326\36P\202)An\225x\243\356W\304f\362\370\336\342\206~\365\260\3\210\366}\325\263"..., 131072) = 131072
write(5, "\275E\207\265\326\36P\202)An\225x\243\356W\304f\362\370\336\342\206~\365\260\3\210\366}\325\263"..., 131072 <====================Blocked

Client strace:

gettimeofday({1386164922, 754122}, NULL) = 0
select(5, [3 4], [], NULL, {0, 0})      = 1 (in [4], left {0, 0})
gettimeofday({1386164922, 754171}, NULL) = 0
select(5, [3 4], [], NULL, {0, 0})      = 1 (in [4], left {0, 0})
gettimeofday({1386164922, 754221}, NULL) = 0
select(5, [3 4], [], NULL, {0, 0})      = 1 (in [4], left {0, 0})
gettimeofday({1386164922, 754269}, NULL) = 0 
select(5, [3 4], [], NULL, {0, 0})      = 1 (in [4], left {0, 0}) <================== Select getting TMO

iperf_run_client:
After receiving all the test data:
1. The client sends TEST_DONE to the server.
2. The server socket is blocked on write. It is never able to receive the TEST_DONE from the control channel.
3. The client tries to read data and gets a TMO, while the server stays blocked on write.

It's a deadlock.
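
The shape of the deadlock described in the steps above is a sender parked in a blocking write() on the data socket while the message it needs is waiting on the control socket. A minimal sketch of one way to avoid that shape follows. It is not iperf's code: `pump_once` and the `PUMP_*` names are hypothetical, and `MSG_DONTWAIT` is assumed to be available (it is on Linux and the BSDs).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>

enum pump_result { PUMP_ERROR = -1, PUMP_IDLE = 0, PUMP_CTRL = 1, PUMP_SENT = 2 };

/* Hypothetical helper, not iperf's code: one iteration of a sender loop
 * that never parks in a blocking write while a control message waits. */
enum pump_result pump_once(int ctrl_fd, int data_fd,
                           const void *buf, size_t len, long timeout_sec)
{
    fd_set rfds, wfds;
    struct timeval tv;
    int maxfd = ctrl_fd > data_fd ? ctrl_fd : data_fd;

    assert(ctrl_fd >= 0 && ctrl_fd < FD_SETSIZE);
    assert(data_fd >= 0 && data_fd < FD_SETSIZE);

    FD_ZERO(&rfds);
    FD_ZERO(&wfds);
    FD_SET(ctrl_fd, &rfds);   /* wake up if the peer sent a state change */
    FD_SET(data_fd, &wfds);   /* wake up if the data socket has buffer room */
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    if (select(maxfd + 1, &rfds, &wfds, NULL, &tv) < 0)
        return PUMP_ERROR;

    /* Control traffic wins: the caller should read it before sending more. */
    if (FD_ISSET(ctrl_fd, &rfds))
        return PUMP_CTRL;

    if (FD_ISSET(data_fd, &wfds)) {
        /* MSG_DONTWAIT keeps this call from blocking even if the buffer
         * fills between select() and send(); a short count is possible. */
        if (send(data_fd, buf, len, MSG_DONTWAIT) < 0) {
            if (errno == EAGAIN || errno == EWOULDBLOCK)
                return PUMP_IDLE;
            return PUMP_ERROR;
        }
        return PUMP_SENT;
    }
    return PUMP_IDLE;         /* timed out with nothing to do */
}
```

A caller would loop on `pump_once` and, on `PUMP_CTRL`, read the state byte from the control socket and stop sending if it signals the end of the test.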

There should be some kind of TMO on the write and receive.

Wrote a patch which fixes this by setting a TMO on the socket using:

SO_RCVTIMEO
SO_SNDTIMEO
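
For readers who have not used these two options, here is a minimal sketch of setting them with setsockopt(). It is not the attached patch: the helper name `set_socket_timeouts` and the choice of one shared value for both directions are assumptions made for illustration.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper (not from the attached patch): apply the same
 * timeout to both directions of a socket. A value of 0 disables the
 * timeout. Returns 0 on success, -1 on failure with errno set by
 * setsockopt(). */
int set_socket_timeouts(int fd, long seconds)
{
    struct timeval tv;

    assert(seconds >= 0);
    tv.tv_sec = seconds;
    tv.tv_usec = 0;

    /* Bounds how long a blocking read/recv may wait for data. */
    if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        return -1;
    /* Bounds how long a blocking write/send may wait for buffer space. */
    if (setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv)) < 0)
        return -1;
    return 0;
}
```

Once a timeout is set, a read or write that expires without transferring anything returns -1 with errno set to EAGAIN or EWOULDBLOCK, so the caller has to check for that and decide whether to retry or look at the control channel.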

Attachment: reverse-mode-lockup.patch

@bmah888
Contributor Author

bmah888 commented Feb 28, 2014

From jef.poskanzer on December 09, 2013 18:03:58

I can't reproduce this either.

I don't see a TEST_DONE state anywhere in the source. There's TEST_END and IPERF_DONE. See http://code.google.com/p/iperf/wiki/IperfProtocolStates for details on the protocol and states.

@bmah888
Contributor Author

bmah888 commented Feb 28, 2014

From jef.poskanzer on December 09, 2013 18:32:54

However, one idea to look at is whether this happens on lower-speed links. Perhaps there, the pipe doesn't have time to empty before the receiver closes its read socket. It's certainly possible this could only happen in reverse mode.
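
To illustrate the idea of letting the pipe empty before the receiver closes its socket, here is a minimal sketch of a receiver that reads and discards until the sender closes its side. This is only an illustration of the hypothesis above, not something taken from iperf; `drain_until_eof` is a hypothetical name.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read and discard data until the peer closes its
 * sending side (recv() returns 0). Returns the number of bytes
 * discarded, or -1 on a receive error. */
long drain_until_eof(int fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    assert(fd >= 0);
    for (;;) {
        n = recv(fd, buf, sizeof(buf), 0);
        if (n == 0)
            return total;          /* orderly shutdown by the peer */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal, retry */
            return -1;
        }
        total += n;
    }
}
```

Note that this blocks for as long as the peer keeps the connection open, so in practice it would need a receive timeout or a select() around it.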

@bmah888
Contributor Author

bmah888 commented Feb 28, 2014

From susant.sahani on December 09, 2013 23:26:03

Yes, it's TEST_END. I made a typo there.

#define TEST_END 4

This is no longer reproducible; I guess it got fixed by commit e4d782b488ed.

Log message:

Fixed bug where -R mode selected on a closed file.

Also added a debugging routine to dump an fd_set.

Affected files:
Modify /src/iperf_server_api.c
Modify /src/iperf_util.c
Modify /src/iperf_util.h
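
As an illustration of what a debugging routine for an fd_set can look like, here is a minimal sketch that formats the set the way strace prints it (for example `[3 4]`). It is not the routine from the commit; `format_fd_set` and its signature are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/select.h>

/* Hypothetical debugging helper (not the routine from the commit):
 * writes the descriptors below nfds that are set in *set into out,
 * formatted like "[3 4]". Returns the number of descriptors written;
 * stops early if out is too small. */
int format_fd_set(const fd_set *set, int nfds, char *out, size_t outlen)
{
    size_t used = 0, avail;
    int fd, n, count = 0;

    assert(out != NULL && outlen >= 3);       /* room for "[]" and NUL */
    assert(nfds >= 0 && nfds <= FD_SETSIZE);

    out[used++] = '[';
    for (fd = 0; fd < nfds; fd++) {
        if (!FD_ISSET(fd, set))
            continue;
        /* Keep one byte in reserve for the closing bracket. */
        avail = outlen - used - 1;
        n = snprintf(out + used, avail, "%s%d", count ? " " : "", fd);
        if (n < 0 || (size_t)n >= avail)
            break;                            /* would not fit, stop here */
        used += (size_t)n;
        count++;
    }
    out[used++] = ']';
    out[used] = '\0';
    return count;
}
```

Printing the read and write sets just before each select() call is one way a descriptor that has already been closed but is still in the set would show up.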

@bmah888
Contributor Author

bmah888 commented Feb 28, 2014

From jef.poskanzer on December 10, 2013 06:00:13

Ok! Then let's close this one.

Status: Fixed
