performance issues #55
From jef.poskanzer on March 12, 2013 13:48:36
The new --zerocopy option improves performance a lot. We still want improvements in the non-zerocopy case. |
From bltierney@es.net on July 23, 2013 10:34:55
Performance is now better, but pushing to the next release to see if we can make it better still in the future.
Labels: -Milestone-3.0-Release Milestone-3.1a1 |
From jef.poskanzer on December 09, 2013 16:35:21
Performance is now better still, but there's always room for more. How about we keep this issue open indefinitely and use it to record ideas for further improvement? For example, in the most recent round of speedups, I think I might have brought back some gettimeofday() syscalls that I had previously eliminated. Re-removing those might help. Anyway, suggest we change the milestone from 3.1a1 to future. |
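One hedged way to check whether those gettimeofday() calls crept back in (the server address here is illustrative; note that on modern Linux gettimeofday() is usually served from the vDSO without a kernel trap, so strace may only ever show clock_gettime()):

# Tally time-related syscalls during a 10-second run (requires a reachable iperf3 server):
strace -c -f -e trace=gettimeofday,clock_gettime iperf3 -c 10.0.0.2 -t 10 > /dev/null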
I just did a few tests between a couple of the 10G hosts on ESnet's 100G testbed, using (roughly) the tip of the iperf3 master codeline. These are "typical" results (in that I did several runs on each of these with results agreeing to within a few percent, but did not attempt to compute confidence intervals, etc.):

TCP:

It looks like, unless I'm reading things completely wrong, iperf3 can saturate a 10G link with either UDP or TCP. The "known issues" section of the README implies that UDP was (at least at one time) unable to get above 5Gbps. The original bug report didn't have any details as to what the observed performance was, so I am unable to tell whether the performance is just as "bad" as it was originally or if it's gotten better somehow. @bltierney, any thoughts on this? |
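For anyone reproducing this kind of check, a representative pair of invocations (a hedged sketch; the address, duration, and UDP pacing rate are illustrative, not the exact parameters used in the runs above):

# TCP test against a 10G-attached server:
iperf3 -c 10.0.0.2 -t 30
# UDP test, pacing the sender at 10 Gbit/s:
iperf3 -c 10.0.0.2 -u -b 10G -t 30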
This is great news. What hosts/NICs were you using for this? I'd like to do some more 40G testing before closing this issue. And BTW, I was seeing some very strange results last night with and without …
Brian Tierney, http://www.es.net/tierney Energy Sciences Network (ESnet), Berkeley National Lab |
If memory serves me right, Brian Tierney wrote:

nersc-diskpt-1 (client) to nersc-diskpt-2 (server). I am not sure what … (I realize the above will not make any sense to anyone not within ESnet.)

I guess we can dig into all these at the same time. |
Sigh. The above comment was written by @bmah888, not @bltierney. I have no idea why GitHub got confused on this, other than that I replied via email to the GitHub notification I received on the comment above that. |
Results of more measurements done this week: We have multiple recorded runs of iperf3 doing 10Gbps on cxgb4 (Chelsio) cards with zero loss. However, there is another mode where we see consistent packet loss of about 20%. We believe this is related to an issue with interrupts and the CPU core being used for iperf3. @bltierney was able to get consistent results on the diskpt units on the ESnet 100G testbed by tuning CPU affinity with -A 9,9 (see the sketch below). While real and significant, there's not a whole lot we can do about this issue at the level of iperf3, since it does not have any visibility into what core it should be running on for best performance. The solution for now will be to document this behavior. |
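For reference, that tuning looks like the following (a hedged sketch; the server address and core numbers are illustrative, and the right core depends on which NUMA node the NIC is attached to):

# Pin the client to core 9 and ask the server to pin itself to core 9:
iperf3 -c 10.0.0.2 -A 9,9
# Roughly equivalent pinning of the client side only, via taskset:
taskset -c 9 iperf3 -c 10.0.0.2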
… experiments. Fixes Issue #55 (at least to the extent that it's not really an iperf3 issue).
Documented what we know about this issue, committed documentation changes to the master and 3.0-STABLE branches. Closing as fixed, at least for now. |
@bmah888 since you do have this handy ticket here, gonna slap these details here.

iperf3

Client:
root@kube01:~# iperf3 -c 10.100.4.105 --zerocopy
Connecting to host 10.100.4.105, port 5201
[ 5] local 10.100.4.100 port 46158 connected to 10.100.4.105 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 4.19 GBytes 36.0 Gbits/sec 328 1.36 MBytes
[ 5] 1.00-2.00 sec 4.05 GBytes 34.8 Gbits/sec 239 1.15 MBytes
[ 5] 2.00-3.00 sec 4.60 GBytes 39.5 Gbits/sec 163 1.20 MBytes
[ 5] 3.00-4.00 sec 4.82 GBytes 41.4 Gbits/sec 448 1.26 MBytes
[ 5] 4.00-5.00 sec 3.82 GBytes 32.9 Gbits/sec 187 1.12 MBytes
[ 5] 5.00-6.00 sec 3.44 GBytes 29.6 Gbits/sec 113 1.26 MBytes
[ 5] 6.00-7.00 sec 4.60 GBytes 39.5 Gbits/sec 466 1.01 MBytes
[ 5] 7.00-8.00 sec 4.47 GBytes 38.4 Gbits/sec 410 1.25 MBytes
[ 5] 8.00-9.00 sec 4.66 GBytes 40.1 Gbits/sec 446 1.20 MBytes
[ 5] 9.00-10.00 sec 4.09 GBytes 35.1 Gbits/sec 348 1.24 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 42.8 GBytes 36.7 Gbits/sec 3148 sender
[ 5] 0.00-10.00 sec 42.8 GBytes 36.7 Gbits/sec receiver

Server:

Using -P 2 yields negligible differences:
root@kube01:~# iperf3 -c 10.100.4.105 --zerocopy -P 2
Connecting to host 10.100.4.105, port 5201
[ 5] local 10.100.4.100 port 36242 connected to 10.100.4.105 port 5201
[ 7] local 10.100.4.100 port 36244 connected to 10.100.4.105 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 2.38 GBytes 20.4 Gbits/sec 0 1.40 MBytes
[ 7] 0.00-1.00 sec 2.39 GBytes 20.5 Gbits/sec 0 1021 KBytes
[SUM] 0.00-1.00 sec 4.77 GBytes 40.9 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 1.00-2.00 sec 2.44 GBytes 20.9 Gbits/sec 0 1.40 MBytes
[ 7] 1.00-2.00 sec 2.43 GBytes 20.9 Gbits/sec 0 1021 KBytes
[SUM] 1.00-2.00 sec 4.87 GBytes 41.8 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 2.00-3.00 sec 2.39 GBytes 20.5 Gbits/sec 0 1.40 MBytes
[ 7] 2.00-3.00 sec 2.39 GBytes 20.5 Gbits/sec 0 1021 KBytes
[SUM] 2.00-3.00 sec 4.78 GBytes 41.0 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 3.00-4.00 sec 2.42 GBytes 20.8 Gbits/sec 0 1.40 MBytes
[ 7] 3.00-4.00 sec 2.42 GBytes 20.8 Gbits/sec 0 1021 KBytes
[SUM] 3.00-4.00 sec 4.84 GBytes 41.5 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 4.00-5.00 sec 2.41 GBytes 20.7 Gbits/sec 0 1.40 MBytes
[ 7] 4.00-5.00 sec 2.41 GBytes 20.7 Gbits/sec 0 1.11 MBytes
[SUM] 4.00-5.00 sec 4.82 GBytes 41.4 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 5.00-6.00 sec 2.43 GBytes 20.8 Gbits/sec 0 1.40 MBytes
[ 7] 5.00-6.00 sec 2.43 GBytes 20.9 Gbits/sec 0 1.11 MBytes
[SUM] 5.00-6.00 sec 4.85 GBytes 41.7 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 6.00-7.00 sec 2.45 GBytes 21.1 Gbits/sec 0 1.40 MBytes
[ 7] 6.00-7.00 sec 2.45 GBytes 21.0 Gbits/sec 12 1.07 MBytes
[SUM] 6.00-7.00 sec 4.90 GBytes 42.1 Gbits/sec 12
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 7.00-8.00 sec 2.44 GBytes 21.0 Gbits/sec 0 1.40 MBytes
[ 7] 7.00-8.00 sec 2.44 GBytes 21.0 Gbits/sec 0 1.07 MBytes
[SUM] 7.00-8.00 sec 4.88 GBytes 41.9 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 8.00-9.00 sec 2.43 GBytes 20.9 Gbits/sec 0 1.40 MBytes
[ 7] 8.00-9.00 sec 2.43 GBytes 20.9 Gbits/sec 0 1.07 MBytes
[SUM] 8.00-9.00 sec 4.86 GBytes 41.8 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ 5] 9.00-10.00 sec 2.42 GBytes 20.8 Gbits/sec 0 1.40 MBytes
[ 7] 9.00-10.00 sec 2.42 GBytes 20.8 Gbits/sec 0 1.24 MBytes
[SUM] 9.00-10.00 sec 4.84 GBytes 41.5 Gbits/sec 0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 24.2 GBytes 20.8 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 24.2 GBytes 20.8 Gbits/sec receiver
[ 7] 0.00-10.00 sec 24.2 GBytes 20.8 Gbits/sec 12 sender
[ 7] 0.00-10.00 sec 24.2 GBytes 20.8 Gbits/sec receiver
[SUM] 0.00-10.00 sec 48.4 GBytes 41.6 Gbits/sec 12 sender
[SUM] 0.00-10.00 sec 48.4 GBytes 41.6 Gbits/sec receiver

iperf

Using iperf without parallel threads yields... roughly the same as iperf3:
root@kube01:~# iperf -c 10.100.4.105
------------------------------------------------------------
Client connecting to 10.100.4.105, TCP port 5001
TCP window size: 16.0 KByte (default)
------------------------------------------------------------
[ 1] local 10.100.4.100 port 36088 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/116)
[ ID] Interval Transfer Bandwidth
[ 1] 0.0000-10.0110 sec 32.5 GBytes 27.8 Gbits/sec

BUT, giving it a few extra threads makes a WORLD of difference:
root@kube01:~# iperf -c 10.100.4.105 -P 6
------------------------------------------------------------
Client connecting to 10.100.4.105, TCP port 5001
TCP window size: 85.0 KByte (default)
------------------------------------------------------------
[ 1] local 10.100.4.100 port 45476 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/98)
[ 6] local 10.100.4.100 port 45544 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/101)
[ 4] local 10.100.4.100 port 45514 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/214)
[ 2] local 10.100.4.100 port 45506 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/141)
[ 5] local 10.100.4.100 port 45528 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/162)
[ 3] local 10.100.4.100 port 45492 connected with 10.100.4.105 port 5001 (icwnd/mss/irtt=14/1448/119)
[ ID] Interval Transfer Bandwidth
[ 3] 0.0000-10.0030 sec 14.0 GBytes 12.0 Gbits/sec
[ 1] 0.0000-10.0029 sec 7.19 GBytes 6.18 Gbits/sec
[ 5] 0.0000-10.0027 sec 27.1 GBytes 23.3 Gbits/sec
[ 6] 0.0000-10.0030 sec 13.6 GBytes 11.7 Gbits/sec
[ 2] 0.0000-10.0029 sec 13.3 GBytes 11.5 Gbits/sec
[ 4] 0.0000-10.0028 sec 13.5 GBytes 11.6 Gbits/sec
[SUM] 0.0000-10.0007 sec 88.7 GBytes 76.2 Gbits/sec

Any ideas on how to properly benchmark 40/100/200/400 GbE with iperf3? |
@XtremeOwnageDotCom, what is the iperf3 version you are using (iperf3 -v)? |
That would explain it. The version included with my distro's package manager is a hair old... 3.12. Appears Debian is a bit behind: https://packages.debian.org/stable/net/iperf3

apt-get install autoconf libtool gcc make
mkdir iperf
cd iperf
wget https://github.com/esnet/iperf/releases/download/3.17.1/iperf-3.17.1.tar.gz
tar -xf iperf-3.17.1.tar.gz
cd iperf-3.17.1
./bootstrap.sh; ./configure; make; make install
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
/usr/local/bin/iperf3 -v
iperf 3.17.1 (cJSON 1.7.15)

After building on two hosts and running...

Client:
/usr/local/bin/iperf3 -c 10.100.4.102 --zerocopy -P 6
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 9.80 GBytes 8.41 Gbits/sec 852 sender
[ 5] 0.00-10.00 sec 9.79 GBytes 8.41 Gbits/sec receiver
[ 7] 0.00-10.00 sec 8.36 GBytes 7.18 Gbits/sec 652 sender
[ 7] 0.00-10.00 sec 8.35 GBytes 7.17 Gbits/sec receiver
[ 9] 0.00-10.00 sec 7.12 GBytes 6.12 Gbits/sec 621 sender
[ 9] 0.00-10.00 sec 7.12 GBytes 6.11 Gbits/sec receiver
[ 11] 0.00-10.00 sec 8.51 GBytes 7.31 Gbits/sec 1360 sender
[ 11] 0.00-10.00 sec 8.50 GBytes 7.30 Gbits/sec receiver
[ 13] 0.00-10.00 sec 6.89 GBytes 5.92 Gbits/sec 285 sender
[ 13] 0.00-10.00 sec 6.88 GBytes 5.91 Gbits/sec receiver
[ 15] 0.00-10.00 sec 8.95 GBytes 7.69 Gbits/sec 219 sender
[ 15] 0.00-10.00 sec 8.95 GBytes 7.68 Gbits/sec receiver
[SUM] 0.00-10.00 sec 49.6 GBytes 42.6 Gbits/sec 3989 sender
[SUM] 0.00-10.00 sec 49.6 GBytes 42.6 Gbits/sec receiver

Still coming up a bit short. Running against a server with much faster single-threaded performance gives much better results, though.

root@kube01:~/iperf/iperf-3.17.1# /usr/local/bin/iperf3 -c 10.100.4.105 --zerocopy -P 6
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 16.5 GBytes 14.2 Gbits/sec 220 sender
[ 5] 0.00-10.00 sec 16.5 GBytes 14.2 Gbits/sec receiver
[ 7] 0.00-10.00 sec 10.8 GBytes 9.24 Gbits/sec 682 sender
[ 7] 0.00-10.00 sec 10.8 GBytes 9.23 Gbits/sec receiver
[ 9] 0.00-10.00 sec 9.63 GBytes 8.27 Gbits/sec 970 sender
[ 9] 0.00-10.00 sec 9.62 GBytes 8.26 Gbits/sec receiver
[ 11] 0.00-10.00 sec 16.3 GBytes 14.0 Gbits/sec 32 sender
[ 11] 0.00-10.00 sec 16.3 GBytes 14.0 Gbits/sec receiver
[ 13] 0.00-10.00 sec 16.6 GBytes 14.2 Gbits/sec 230 sender
[ 13] 0.00-10.00 sec 16.6 GBytes 14.2 Gbits/sec receiver
[ 15] 0.00-10.00 sec 13.6 GBytes 11.7 Gbits/sec 238 sender
[ 15] 0.00-10.00 sec 13.6 GBytes 11.7 Gbits/sec receiver
[SUM] 0.00-10.00 sec 83.4 GBytes 71.6 Gbits/sec 2372 sender
[SUM] 0.00-10.00 sec 83.3 GBytes 71.6 Gbits/sec receiver

I'll ignore the slower host for now, as I am in the middle of network configuration changes, switch updates, etc., but overall, really happy the -P option was added back in. |
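For anyone landing here with the same 40G+ question, a hedged recipe distilled from the thread (the address, stream count, and cores are illustrative; iperf3 3.16 or later is assumed, since that is where multi-threaded -P streams arrived):

# Multiple threads, zero-copy sends, explicit CPU pinning:
iperf3 -c 10.100.4.105 --zerocopy -P 6 -A 9,9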
From bltierney@es.net on December 13, 2012 09:58:43
The reported single-flow throughput of iperf3 is considerably lower than that of nuttcp and netperf on 10G and 40G hosts. This is particularly true for UDP.
Original issue: http://code.google.com/p/iperf/issues/detail?id=55