
Bug - Mixed speed broadband only measures the Maximum upload not the download #1374

Open
DavidACraig1975 opened this issue Aug 16, 2022 · 15 comments


@DavidACraig1975

Context: testing from WFH workstations (1 Gbps down / 50 Mbps up) to a business network connection (1 Gbps up/down).

  • Version of iperf3: 3.1.3

  • Hardware: mixed

  • Operating system (and distribution, if any): Windows

Bug Report

  • Expected Behavior - would expect the results to show a download bandwidth of 1000 Mbps and an upload of 50 Mbps

  • Actual Behavior - both up and down report the maximum upload speed of 50 Mbps (works fine between local network devices)

  • Steps to Reproduce - Run on consumer broadband with different upload and download speeds

  • Possible Solution - I am guessing it is sending traffic up and pulling the same traffic down, causing the bottleneck at the upload speed. Maybe it would be better to initiate a download from the server and initiate an upload to the server from the client.

Enhancement Request

  • Current behavior

  • Desired behavior - clearly state the upload speed and the download speed; on a normal LAN, a mismatch between the two could also indicate a wiring issue.

  • Implementation notes
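One way to get the unambiguous per-direction report requested above is to post-process iperf3's JSON output (the -J flag): run once without -R for upload and once with -R for download, and summarize each run. A minimal Python sketch, assuming the standard `end.sum_sent` / `end.sum_received` summary objects iperf3 emits for a TCP test; the sample dict below is synthetic, standing in for real output:

```python
def summarize(result: dict) -> dict:
    """Report sender/receiver rates in Mbps from parsed `iperf3 -J` output.

    Assumes the standard `end.sum_sent` / `end.sum_received` summary
    objects of a TCP test (parse the command's stdout with json.loads).
    For an up/down report, run one test without -R (upload) and one
    with -R (download), and summarize each.
    """
    end = result["end"]
    return {
        "sent_mbps": end["sum_sent"]["bits_per_second"] / 1e6,
        "received_mbps": end["sum_received"]["bits_per_second"] / 1e6,
    }

# Synthetic stand-in for real `iperf3 -c <server> -J` output on a 50 Mbps uplink:
sample = {"end": {"sum_sent": {"bits_per_second": 50_000_000.0},
                  "sum_received": {"bits_per_second": 49_000_000.0}}}
print(summarize(sample))  # {'sent_mbps': 50.0, 'received_mbps': 49.0}
```

A noticeable gap between the sent and received figures of the same run points at loss on the path, not at the link's rate in the other direction.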

@davidBar-On
Contributor

@DavidACraig1975, since you didn't include the command line you used to run iperf3, it is not clear how you did the test. In general, the default mode of iperf3 is to send data only from the client to the server. If the client is sending over the uplink, then the rate will be limited by the uplink rate. The server data in this case is the amount of data received, not sent.

If you want the server to send data to the client, use the -R (reverse) option. If you want the test to run in both directions, use the --bidir option.

By the way, note that you are using a very old version of iperf3; it is recommended to upgrade to a newer version.
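The direction semantics above map directly onto command lines. A small illustrative helper (the function is my own; only the flags come from the explanation above):

```python
def iperf3_cmd(server: str, direction: str = "upload") -> list[str]:
    """Build an iperf3 client command line for the given test direction.

    upload   -> default mode: the client sends to the server
    download -> -R (reverse): the server sends to the client
    both     -> --bidir: both directions tested at once
    """
    cmd = ["iperf3", "-c", server]
    if direction == "download":
        cmd.append("-R")
    elif direction == "both":
        cmd.append("--bidir")
    elif direction != "upload":
        raise ValueError(f"unknown direction: {direction}")
    return cmd

print(iperf3_cmd("192.0.2.1", "download"))  # ['iperf3', '-c', '192.0.2.1', '-R']
```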

@AnatoliChe

The same problem. Server:

iperf3 -s

and a client on the same host:

iperf3 -c 127.0.0.1 --bidir

Connecting to host 127.0.0.1, port 5201
[ 5] local 127.0.0.1 port 37206 connected to 127.0.0.1 port 5201
[ 7] local 127.0.0.1 port 37208 connected to 127.0.0.1 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 3.54 GBytes 30.4 Gbits/sec 0 1.69 MBytes
[ 7][RX-C] 0.00-1.00 sec 363 MBytes 3.04 Gbits/sec
[ 5][TX-C] 1.00-2.00 sec 3.83 GBytes 32.9 Gbits/sec 0 2.00 MBytes
[ 7][RX-C] 1.00-2.00 sec 393 MBytes 3.29 Gbits/sec
[ 5][TX-C] 2.00-3.00 sec 3.72 GBytes 31.9 Gbits/sec 0 2.37 MBytes
[ 7][RX-C] 2.00-3.00 sec 380 MBytes 3.19 Gbits/sec
[ 5][TX-C] 3.00-4.00 sec 3.82 GBytes 32.8 Gbits/sec 0 2.37 MBytes
[ 7][RX-C] 3.00-4.00 sec 392 MBytes 3.28 Gbits/sec
[ 5][TX-C] 4.00-5.00 sec 3.46 GBytes 29.7 Gbits/sec 0 2.37 MBytes
[ 7][RX-C] 4.00-5.00 sec 354 MBytes 2.97 Gbits/sec
[ 5][TX-C] 5.00-6.00 sec 3.77 GBytes 32.4 Gbits/sec 0 2.37 MBytes
[ 7][RX-C] 5.00-6.00 sec 386 MBytes 3.24 Gbits/sec
[ 5][TX-C] 6.00-7.00 sec 3.51 GBytes 30.1 Gbits/sec 1 3.56 MBytes
[ 7][RX-C] 6.00-7.00 sec 370 MBytes 3.10 Gbits/sec
[ 5][TX-C] 7.00-8.00 sec 3.80 GBytes 32.6 Gbits/sec 0 3.56 MBytes
[ 7][RX-C] 7.00-8.00 sec 389 MBytes 3.26 Gbits/sec
[ 5][TX-C] 8.00-9.00 sec 3.83 GBytes 32.9 Gbits/sec 0 3.56 MBytes
[ 7][RX-C] 8.00-9.00 sec 392 MBytes 3.29 Gbits/sec
[ 5][TX-C] 9.00-10.00 sec 3.79 GBytes 32.5 Gbits/sec 0 3.56 MBytes
[ 7][RX-C] 9.00-10.00 sec 388 MBytes 3.25 Gbits/sec


[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 37.1 GBytes 31.8 Gbits/sec 1 sender
[ 5][TX-C] 0.00-10.00 sec 37.1 GBytes 31.8 Gbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 3.72 GBytes 3.20 Gbits/sec 0 sender
[ 7][RX-C] 0.00-10.00 sec 3.72 GBytes 3.19 Gbits/sec receiver

iperf Done.

P.S. It's possible to work around this with the -b option.

@davidBar-On
Contributor

@AnatoliChe, it is very strange that the RX rate is exactly 10% of the TX rate. Which iperf3 version do you use (iperf3 -v)? What is the operating system?
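The lock-step is visible in every interval of the report above, not just the totals; a quick check of the posted figures:

```python
# Per-interval bitrates (Gbit/s) copied from the --bidir report above.
tx = [30.4, 32.9, 31.9, 32.8, 29.7, 32.4, 30.1, 32.6, 32.9, 32.5]
rx = [3.04, 3.29, 3.19, 3.28, 2.97, 3.24, 3.10, 3.26, 3.29, 3.25]

ratios = [r / t for t, r in zip(tx, rx)]
print(all(abs(r - 0.10) < 0.005 for r in ratios))  # True: RX is ~10% of TX in every interval
```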

@AnatoliChe

iperf3 -v
iperf 3.9 (cJSON 1.7.13)
Linux host 5.10.0-14-amd64 #1 SMP Debian 5.10.113-1 (2022-04-29) x86_64
Optional features available: CPU affinity setting, IPv6 flow label, SCTP, TCP congestion algorithm setting, sendfile / zerocopy, socket pacing, authentication

cat /etc/debian_version
11.3

@davidBar-On
Contributor

I tried the same test on my machine, using both iperf3 versions 3.7 and 3.12. On both I get about 30 Gbps for both TX and RX. I also briefly searched the code changes since 3.9, and I don't see any change related to the bandwidth calculation or display.

Can you try running the test once without --bidir and once with --bidir replaced by -R (reverse mode)? This is to see whether the issue is related to the direction in which the test data is sent, or whether there is some kind of interaction between the data sent in the two directions.

@AnatoliChe

AnatoliChe commented Oct 15, 2022

Sure!
By the way, the same can be seen with iperf3 3.12.

Tested on a new server: Dell R350, Intel(R) Xeon(R) E-2388G CPU @ 3.20GHz

free
              total        used        free      shared  buff/cache   available
Mem:      131798136      872144   130394944        5856      531048   129840312

A fresh Debian install:
cat /etc/debian_version
11.3

iperf3 -v
iperf 3.9 (cJSON 1.7.13)
Linux ursa 5.10.0-14-amd64 #1 SMP Debian 5.10.113-1 (2022-04-29) x86_64
Optional features available: CPU affinity setting, IPv6 flow label, SCTP, TCP congestion algorithm setting, sendfile / zerocopy, socket pacing, authentication
iperf3 -s

iperf3 -c 127.0.0.1
Connecting to host 127.0.0.1, port 5201
[ 5] local 127.0.0.1 port 40506 connected to 127.0.0.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 10.4 GBytes 89.4 Gbits/sec 0 1.25 MBytes
[ 5] 1.00-2.00 sec 10.5 GBytes 90.0 Gbits/sec 0 1.25 MBytes
[ 5] 2.00-3.00 sec 10.5 GBytes 90.2 Gbits/sec 0 1.25 MBytes
[ 5] 3.00-4.00 sec 10.5 GBytes 89.9 Gbits/sec 0 1.25 MBytes
[ 5] 4.00-5.00 sec 10.5 GBytes 90.1 Gbits/sec 0 1.25 MBytes
[ 5] 5.00-6.00 sec 10.5 GBytes 89.8 Gbits/sec 0 1.25 MBytes
[ 5] 6.00-7.00 sec 10.3 GBytes 88.7 Gbits/sec 0 1.25 MBytes
[ 5] 7.00-8.00 sec 10.2 GBytes 87.5 Gbits/sec 0 1.25 MBytes
[ 5] 8.00-9.00 sec 10.3 GBytes 88.3 Gbits/sec 0 1.25 MBytes
[ 5] 9.00-10.00 sec 10.3 GBytes 88.4 Gbits/sec 0 1.25 MBytes


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 104 GBytes 89.2 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 104 GBytes 89.2 Gbits/sec receiver

iperf Done.

iperf3 -c 127.0.0.1 -R
Connecting to host 127.0.0.1, port 5201
Reverse mode, remote host 127.0.0.1 is sending
[ 5] local 127.0.0.1 port 40510 connected to 127.0.0.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-1.00 sec 10.5 GBytes 90.1 Gbits/sec
[ 5] 1.00-2.00 sec 10.6 GBytes 91.1 Gbits/sec
[ 5] 2.00-3.00 sec 10.6 GBytes 90.8 Gbits/sec
[ 5] 3.00-4.00 sec 10.6 GBytes 90.8 Gbits/sec
[ 5] 4.00-5.00 sec 10.6 GBytes 90.7 Gbits/sec
[ 5] 5.00-6.00 sec 10.6 GBytes 90.8 Gbits/sec
[ 5] 6.00-7.00 sec 10.6 GBytes 90.9 Gbits/sec
[ 5] 7.00-8.00 sec 10.6 GBytes 90.8 Gbits/sec
[ 5] 8.00-9.00 sec 10.6 GBytes 90.9 Gbits/sec
[ 5] 9.00-10.00 sec 10.6 GBytes 90.9 Gbits/sec


[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 106 GBytes 90.8 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 106 GBytes 90.8 Gbits/sec receiver

iperf Done.

iperf3 -c 127.0.0.1 --bidir
Connecting to host 127.0.0.1, port 5201
[ 5] local 127.0.0.1 port 40500 connected to 127.0.0.1 port 5201
[ 7] local 127.0.0.1 port 40502 connected to 127.0.0.1 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 9.98 GBytes 85.8 Gbits/sec 0 1.44 MBytes
[ 7][RX-C] 0.00-1.00 sec 1022 MBytes 8.58 Gbits/sec
[ 5][TX-C] 1.00-2.00 sec 10.1 GBytes 86.4 Gbits/sec 0 1.44 MBytes
[ 7][RX-C] 1.00-2.00 sec 1.01 GBytes 8.64 Gbits/sec
[ 5][TX-C] 2.00-3.00 sec 10.1 GBytes 86.6 Gbits/sec 0 1.44 MBytes
[ 7][RX-C] 2.00-3.00 sec 1.01 GBytes 8.66 Gbits/sec
[ 5][TX-C] 3.00-4.00 sec 10.1 GBytes 86.5 Gbits/sec 0 1.44 MBytes
[ 7][RX-C] 3.00-4.00 sec 1.01 GBytes 8.65 Gbits/sec
[ 5][TX-C] 4.00-5.00 sec 10.1 GBytes 86.6 Gbits/sec 0 1.44 MBytes
[ 7][RX-C] 4.00-5.00 sec 1.01 GBytes 8.66 Gbits/sec
[ 5][TX-C] 5.00-6.00 sec 9.96 GBytes 85.6 Gbits/sec 0 1.44 MBytes
[ 7][RX-C] 5.00-6.00 sec 1020 MBytes 8.56 Gbits/sec
[ 5][TX-C] 6.00-7.00 sec 9.93 GBytes 85.3 Gbits/sec 0 2.19 MBytes
[ 7][RX-C] 6.00-7.00 sec 1017 MBytes 8.53 Gbits/sec
[ 5][TX-C] 7.00-8.00 sec 9.92 GBytes 85.2 Gbits/sec 0 2.19 MBytes
[ 7][RX-C] 7.00-8.00 sec 1016 MBytes 8.52 Gbits/sec
[ 5][TX-C] 8.00-9.00 sec 9.92 GBytes 85.2 Gbits/sec 0 2.19 MBytes
[ 7][RX-C] 8.00-9.00 sec 1016 MBytes 8.52 Gbits/sec
[ 5][TX-C] 9.00-10.00 sec 10.0 GBytes 86.1 Gbits/sec 0 2.19 MBytes
[ 7][RX-C] 9.00-10.00 sec 1.00 GBytes 8.61 Gbits/sec


[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 100 GBytes 85.9 Gbits/sec 0 sender
[ 5][TX-C] 0.00-10.00 sec 100 GBytes 85.9 Gbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 10.0 GBytes 8.60 Gbits/sec 0 sender
[ 7][RX-C] 0.00-10.00 sec 10.0 GBytes 8.59 Gbits/sec receiver

iperf Done.
git clone https://github.com/esnet/iperf
./configure
make -j8
cd src
./iperf3 -v
iperf 3.12 (cJSON 1.7.15)
Linux ursa 5.10.0-14-amd64 #1 SMP Debian 5.10.113-1 (2022-04-29) x86_64
Optional features available: CPU affinity setting, IPv6 flow label, TCP congestion algorithm setting, sendfile / zerocopy, socket pacing, authentication, bind to device, support IPv4 don't fragment
ldd ./iperf3
not a dynamic executable
./iperf3 -s
iperf/src# ./iperf3 -c 127.0.0.1 --bidir
Connecting to host 127.0.0.1, port 5201
[ 5] local 127.0.0.1 port 40516 connected to 127.0.0.1 port 5201
[ 7] local 127.0.0.1 port 40518 connected to 127.0.0.1 port 5201
[ ID][Role] Interval Transfer Bitrate Retr Cwnd
[ 5][TX-C] 0.00-1.00 sec 9.85 GBytes 84.6 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 0.00-1.00 sec 1009 MBytes 8.46 Gbits/sec
[ 5][TX-C] 1.00-2.00 sec 10.0 GBytes 86.0 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 1.00-2.00 sec 1.00 GBytes 8.60 Gbits/sec
[ 5][TX-C] 2.00-3.00 sec 10.0 GBytes 86.3 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 2.00-3.00 sec 1.00 GBytes 8.63 Gbits/sec
[ 5][TX-C] 3.00-4.00 sec 10.0 GBytes 86.2 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 3.00-4.00 sec 1.00 GBytes 8.62 Gbits/sec
[ 5][TX-C] 4.00-5.00 sec 10.0 GBytes 86.3 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 4.00-5.00 sec 1.00 GBytes 8.63 Gbits/sec
[ 5][TX-C] 5.00-6.00 sec 10.0 GBytes 86.2 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 5.00-6.00 sec 1.00 GBytes 8.62 Gbits/sec
[ 5][TX-C] 6.00-7.00 sec 10.0 GBytes 86.3 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 6.00-7.00 sec 1.00 GBytes 8.63 Gbits/sec
[ 5][TX-C] 7.00-8.00 sec 10.1 GBytes 86.3 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 7.00-8.00 sec 1.01 GBytes 8.63 Gbits/sec
[ 5][TX-C] 8.00-9.00 sec 10.0 GBytes 86.3 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 8.00-9.00 sec 1.00 GBytes 8.63 Gbits/sec
[ 5][TX-C] 9.00-10.00 sec 10.0 GBytes 86.3 Gbits/sec 0 1.37 MBytes
[ 7][RX-C] 9.00-10.00 sec 1.00 GBytes 8.63 Gbits/sec


[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 100 GBytes 86.1 Gbits/sec 0 sender
[ 5][TX-C] 0.00-10.00 sec 100 GBytes 86.1 Gbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 10.0 GBytes 8.62 Gbits/sec 0 sender
[ 7][RX-C] 0.00-10.00 sec 10.0 GBytes 8.61 Gbits/sec receiver

iperf Done.


top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12220 root 20 0 7252 3440 2996 R 100.0 0.0 0:28.43 iperf3
12189 root 20 0 7252 3608 3008 S 79.7 0.0 0:30.85 iperf3

@davidBar-On
Contributor

Thanks for the detailed input.
Since the throughput in both directions is about the same when --bidir is not used, I suspect that the issue is related to TCP buffering. E.g., a theoretical scenario: for some reason the TX stream is prioritized over the RX stream. Since the high-throughput TX fills the system TCP buffers, the system keeps only 10% of the buffers for lower-priority streams, which is why the RX throughput is 10% of the TX throughput. (Note that on my machine the throughput is almost evenly divided between the RX and TX streams.)

I am not familiar enough with Debian Linux settings to suggest a direct evaluation of this hypothesis. However, the following tests may help:

  1. Run the test with -P2 (and without -R or --bidir). The purpose is to see whether the two streams in the same direction get the same priority, or whether the throughput of one of them is only 10% of the other. The test may be repeated using -R -P2 to see whether it makes a difference if the streams are TX or RX.
  2. Run two servers, using a different port for each (e.g. the second may use -p 5202). Run two tests in parallel, each against a different server: one with -R and one without it. The purpose is to see whether the issue is related to having both TX and RX streams between the same server/client pair (if the throughput is the same for both tests), or whether this is probably a system issue (if the throughput of one test is 10% of the other).
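Test 2 above can be sketched as follows (a hedged outline: HOST is a placeholder address, and the commands only run where iperf3 is actually installed):

```python
import subprocess

HOST = "192.0.2.1"  # placeholder server address
RUN = False         # set to True on hosts where iperf3 is installed

# Two servers on different ports (the second uses -p 5202, as suggested),
# then one forward and one reverse client run in parallel against them.
server_cmds = [["iperf3", "-s", "-p", "5201"],
               ["iperf3", "-s", "-p", "5202"]]
client_cmds = [["iperf3", "-c", HOST, "-p", "5201"],
               ["iperf3", "-c", HOST, "-p", "5202", "-R"]]

if RUN:
    servers = [subprocess.Popen(c) for c in server_cmds]
    try:
        for client in [subprocess.Popen(c) for c in client_cmds]:
            client.wait()
    finally:
        for s in servers:
            s.terminate()
```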

@AnatoliChe

With -P2 I get random results.
First:
/tmp/iperf/src# ./iperf3 -c 127.0.0.1 -P2 --bidir
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 25.3 GBytes 21.7 Gbits/sec 0 sender
[ 5][TX-C] 0.00-10.00 sec 25.2 GBytes 21.7 Gbits/sec receiver
[ 7][TX-C] 0.00-10.00 sec 25.3 GBytes 21.7 Gbits/sec 0 sender
[ 7][TX-C] 0.00-10.00 sec 25.2 GBytes 21.7 Gbits/sec receiver
[SUM][TX-C] 0.00-10.00 sec 50.5 GBytes 43.4 Gbits/sec 0 sender
[SUM][TX-C] 0.00-10.00 sec 50.5 GBytes 43.4 Gbits/sec receiver
[ 9][RX-C] 0.00-10.00 sec 34.0 GBytes 29.2 Gbits/sec 0 sender
[ 9][RX-C] 0.00-10.00 sec 34.0 GBytes 29.2 Gbits/sec receiver
[ 11][RX-C] 0.00-10.00 sec 34.0 GBytes 29.2 Gbits/sec 0 sender
[ 11][RX-C] 0.00-10.00 sec 34.0 GBytes 29.2 Gbits/sec receiver
[SUM][RX-C] 0.00-10.00 sec 68.1 GBytes 58.5 Gbits/sec 0 sender
[SUM][RX-C] 0.00-10.00 sec 68.0 GBytes 58.4 Gbits/sec receiver


Second:
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 49.5 GBytes 42.5 Gbits/sec 0 sender
[ 5][TX-C] 0.00-10.00 sec 49.5 GBytes 42.5 Gbits/sec receiver
[ 7][TX-C] 0.00-10.00 sec 49.5 GBytes 42.5 Gbits/sec 0 sender
[ 7][TX-C] 0.00-10.00 sec 49.5 GBytes 42.5 Gbits/sec receiver
[SUM][TX-C] 0.00-10.00 sec 99.1 GBytes 85.1 Gbits/sec 0 sender
[SUM][TX-C] 0.00-10.00 sec 99.1 GBytes 85.1 Gbits/sec receiver
[ 9][RX-C] 0.00-10.00 sec 4.97 GBytes 4.27 Gbits/sec 0 sender
[ 9][RX-C] 0.00-10.00 sec 4.95 GBytes 4.25 Gbits/sec receiver
[ 11][RX-C] 0.00-10.00 sec 4.97 GBytes 4.27 Gbits/sec 0 sender
[ 11][RX-C] 0.00-10.00 sec 4.95 GBytes 4.25 Gbits/sec receiver
[SUM][RX-C] 0.00-10.00 sec 9.94 GBytes 8.54 Gbits/sec 0 sender
[SUM][RX-C] 0.00-10.00 sec 9.91 GBytes 8.51 Gbits/sec receiver


I believe it's limited by the CPU, because iperf3 uses only one core?

When I start two servers on different ports and two clients simultaneously I get:
[ 5] 0.00-10.00 sec 92.6 GBytes 79.6 Gbits/sec 0 sender
[ 5] 0.00-10.00 sec 92.6 GBytes 79.6 Gbits/sec receiver
for both client/server pairs.

And top shows:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14277 root 20 0 7144 3464 3016 R 100.0 0.0 0:22.59 iperf3
14278 root 20 0 7144 3320 2868 R 100.0 0.0 0:21.87 iperf3
14210 root 20 0 7124 3348 2996 R 79.0 0.0 0:28.59 iperf3
14224 root 20 0 7124 3280 2928 R 78.0 0.0 0:27.01 iperf3

So I'm pretty sure it's a problem with the architecture of iperf3, which uses only one core. In bidir mode it's a problem...

@davidBar-On
Contributor

So I'm pretty sure it's a problem with the architecture of iperf3, which uses only one core

I agree this is an issue, and in this case the single-CPU performance seems to limit the total throughput, as you suggest. The only option available is to run the server and the client on different CPUs, using the --affinity option.

However, I still don't understand why one stream's throughput is only 10% of the other's. Even if only one CPU is used, the throughput is usually divided evenly between the streams. Could it depend on the specific CPUs allocated to the server and the client? If so, using --affinity may help - at least to verify whether this is the source of the issue.

Also, if you have any suggestion or guess about the reason for the 10% throughput, that would be very helpful for understanding iperf3's limitations and suggested usage.

@AnatoliChe

If you take a look at htop, you can notice the server takes 100% of a CPU, the client only 80%.
So the client's RX side can do more work and generate more packets.
So -A won't help; the scheduler works perfectly.
-b can help.

iperf3 -s -A0-4

Server listening on 5201 (test #1)

Accepted connection from 127.0.0.1, port 61082
[ 5] local 127.0.0.1 port 5201 connected to 127.0.0.1 port 61084
[ 8] local 127.0.0.1 port 5201 connected to 127.0.0.1 port 61086

[ ID][Role] Interval Transfer Bitrate Retr
[ 5][RX-S] 0.00-120.00 sec 130 GBytes 9.29 Gbits/sec receiver
[ 8][TX-S] 0.00-120.00 sec 1.15 TBytes 84.5 Gbits/sec 0 sender

/tmp/iperf/src# ./iperf3 -c 127.0.0.1 --bidir -A5-8 -t 120
Connecting to host 127.0.0.1, port 5201
[ 5] local 127.0.0.1 port 61084 connected to 127.0.0.1 port 5201
[ 7] local 127.0.0.1 port 61086 connected to 127.0.0.1 port 5201
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-120.00 sec 130 GBytes 9.29 Gbits/sec 0 sender
[ 5][TX-C] 0.00-120.00 sec 130 GBytes 9.29 Gbits/sec receiver
[ 7][RX-C] 0.00-120.00 sec 1.15 TBytes 84.5 Gbits/sec 0 sender
[ 7][RX-C] 0.00-120.00 sec 1.15 TBytes 84.5 Gbits/sec receiver

iperf Done.

htop

0[|||||||||||||||||||||||||||||||||||||||||||100.0%] 4[ 0.0%] 8[ 0.0%] 12[ 0.0%]
1[| 0.5%] 5[|||||||||||||||||||||||||||||||||||||||| 78.9%] 9[ 0.0%] 13[ 0.0%]
2[ 0.0%] 6[ 0.0%] 10[ 0.0%] 14[ 0.0%]
3[ 0.0%] 7[ 0.0%] 11[ 0.0%] 15[ 0.0%]
Mem[||| 885M/126G] Tasks: 55, 110 thr; 2 running
Swp[ 0K/238G] Load average: 0.88 0.25 0.09
Uptime: 3 days, 08:18:32

PID USER PRI NI VIRT RES SHR S CPU%▽MEM% TIME+ Command
19082 root 20 0 7252 3364 2920 R 100. 0.0 0:26.79 /tmp/iperf/src/.libs/iperf3 -s -A0-4
19096 root 20 0 7252 3440 2992 S 83.0 0.0 0:22.43 /tmp/iperf/src/.libs/iperf3 -c 127.0.0.1 --bidir -A5-8 -t 120

@davidBar-On
Contributor

davidBar-On commented Oct 16, 2022

if you take a look at htop, you can notice server takes 100% of CPU, client only 80%.

I now understand why you suggest that the single-CPU allocation may be the issue, and I agree that it seems to be a better explanation than my suggestion about buffer usage.

There is one thing about the htop output that I don't understand. Per the iperf3 code and help (and I also tried it), the -A 0-4 you used is wrong. The -A option format is either -A n (run on CPU n) or, on the client side, -A n,m (run the client on CPU n and the server on CPU m). Setting -A 0-4 for the server and -A 5-8 for the client therefore means the server should run on CPU 0 and the client on CPU 5 (the "-4" and "-8" in the option values are ignored). However, the htop output seems to show that the client runs on CPU 1 and not 5 (is this because there are only 4 CPUs?).

It would help if you could run the test again - once with a client option like -A 1,2 (client on CPU 1, server on CPU 2), and once with -A 1,1 (both on the same CPU).

Note that on my machine, with -A 1,2, top shows:

%Cpu1  :  2.7 us, 89.3 sy,  0.0 ni,  0.7 id,  0.0 wa,  0.0 hi,  7.4 si,  0.0 st
%Cpu2  :  3.3 us, 87.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  9.7 si,  0.0 st

And -A 1,1 shows:

%Cpu1  :  2.3 us, 92.7 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  5.0 si,  0.0 st
%Cpu2  :  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
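The -A semantics described above can be illustrated with a toy parser (an illustrative sketch of the claimed behavior, not iperf3's actual code):

```python
import re

def parse_affinity(value: str) -> tuple:
    """Parse an iperf3 -A value the way the comment above describes it:
    'n' pins the local process to CPU n; 'n,m' (client side only) also
    asks the server to run on CPU m. Anything after the leading integer
    other than a ',m' suffix (e.g. the '-4' in '0-4') is effectively
    ignored, per the observation above."""
    m = re.match(r"(\d+)(?:,(\d+))?", value)
    if not m:
        raise ValueError(f"bad -A value: {value}")
    return int(m.group(1)), int(m.group(2)) if m.group(2) else None

print(parse_affinity("0-4"))  # (0, None): the '-4' is dropped
print(parse_affinity("5-8"))  # (5, None): the '-8' is dropped
print(parse_affinity("1,2"))  # (1, 2): client on CPU 1, server on CPU 2
```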

@AnatoliChe

cpu 0
0[|||||||||||||||||||||||||||||||||||||||||||100.0%]
cpu 5
5[|||||||||||||||||||||||||||||||||||||||| 78.9%]
in the previous message

@AnatoliChe

./iperf3 -s -A 1,2
iperf3: parameter error - some option you are trying to set is client only

iperf3 -s -A 1
./iperf3 -c 127.0.0.1 --bidir -A2 -t 120
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-120.00 sec 117 GBytes 8.40 Gbits/sec 0 sender
[ 5][TX-C] 0.00-120.00 sec 117 GBytes 8.40 Gbits/sec receiver
[ 7][RX-C] 0.00-120.00 sec 1.15 TBytes 84.0 Gbits/sec 0 sender
[ 7][RX-C] 0.00-120.00 sec 1.15 TBytes 84.0 Gbits/sec receiver

iperf Done.

0[|                                            0.7%]    4[                                             0.0%]     8[                                             0.0%]   12[                                             0.0%]
1[|||||||||||||||||||||||||||||||||||||||||||100.0%]    5[                                             0.0%]     9[                                             0.0%]   13[                                             0.0%]
2[|||||||||||||||||||||||||||||||||||||||     77.1%]    6[                                             0.0%]    10[                                             0.0%]   14[                                             0.0%]
3[                                             0.0%]    7[                                             0.0%]    11[                                             0.0%]   15[                                             0.0%]

Mem[||| 883M/126G] Tasks: 55, 110 thr; 3 running
Swp[ 0K/238G] Load average: 1.61 1.02 0.44
Uptime: 4 days, 00:50:01

PID USER      PRI  NI  VIRT   RES   SHR S CPU%▽MEM%   TIME+  Command

22640 root 20 0 7252 3680 2996 R 99.3 0.0 3:42.44 /tmp/iperf/src/.libs/iperf3 -s -A 1
22762 root 20 0 7252 3432 2992 R 80.0 0.0 0:30.86 /tmp/iperf/src/.libs/iperf3 -c 127.0.0.1 --bidir -A2 -t 120

Server and client on the same CPU:
iperf3 -s -A 1
iperf3 -c 127.0.0.1 --bidir -A1
[ ID][Role] Interval Transfer Bitrate Retr
[ 5][TX-C] 0.00-10.00 sec 49.5 GBytes 42.5 Gbits/sec 0 sender
[ 5][TX-C] 0.00-10.01 sec 49.5 GBytes 42.5 Gbits/sec receiver
[ 7][RX-C] 0.00-10.00 sec 32.6 GBytes 28.0 Gbits/sec 0 sender
[ 7][RX-C] 0.00-10.01 sec 32.6 GBytes 28.0 Gbits/sec receiver

iperf Done.

@davidBar-On
Contributor

cpu 5
5[|||||||||||||||||||||||||||||||||||||||| 78.9%]

Oops! My mistake. I read the line number as the CPU number ....

Thanks for sending the additional results. They show that the 10% throughput issue occurs when the server and client run on different CPUs. However, although using the same CPU seems to be better, "improving" to exactly two thirds (66.7%) is strange. Again, on my machine the performance is about the same in both directions. I also don't see anything in the iperf3 code that could lead to such a throughput distribution.

I suspect that the throughput difference (the exact 10% and two thirds) is related to system settings, although I don't know which ones. Both the priority and nice values are the same for both processes. Reading the Debian SCHED(7) man page, I don't see any scheduling policy setting that could lead to this behavior. I assume that -b helps just because it reduces the CPU (and buffer?) usage.

Currently I don't have further suggestions for how to evaluate this issue, other than examining the system settings.

@kralo

kralo commented May 8, 2023

I had the same issue on a fiber "marketed 100/50" Mbps FTTH Line (provisioned around 113/56).

Two things worked for me: setting --fq-rate to the higher limit together with -N, i.e.

u@debian-vpn:~$ iperf3 -c speedtest.wtnet.de -p 5203 --bidir --fq-rate 100M -N
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-31.68  sec   185 MBytes  49.0 Mbits/sec   27             sender
[  5][TX-C]   0.00-31.68  sec  0.00 Bytes  0.00 bits/sec                  receiver
[  7][RX-C]   0.00-31.68  sec  0.00 Bytes  0.00 bits/sec                  sender
[  7][RX-C]   0.00-31.68  sec   377 MBytes  99.8 Mbits/sec                  receiver

or selecting a different congestion control algorithm

u@debian-vpn:~$ iperf3 -c speedtest.wtnet.de -p 5203 -t 10 --bidir -b 103M -C cubic -N
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec  59.5 MBytes  49.9 Mbits/sec    3             sender
[  5][TX-C]   0.00-10.03  sec  57.2 MBytes  47.9 Mbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec   123 MBytes   103 Mbits/sec    0             sender
[  7][RX-C]   0.00-10.03  sec   122 MBytes   102 Mbits/sec                  receiver
