Performance degradation on Fedora CoreOS after upgrade from F32 to F33 #755

Closed
wkruse opened this issue Feb 26, 2021 · 4 comments

Comments

@wkruse

wkruse commented Feb 26, 2021

Describe the bug
We are running load and performance tests with our Java application in a test Kubernetes cluster regularly. After upgrading from 32.20201104.3.0 to 33.20201201.3.0 (and up to 33.20210201.3.0) we discovered a severe performance degradation in application response times. As we were upgrading Kubernetes (https://typhoon.psdn.io) along the way, we weren't sure if it was related to the Kubernetes version. We re-provisioned 32.20201104.3.0 with Kubernetes 1.20.2 and the performance degradation went away. We also tested 33.20210117.3.2 with Kubernetes 1.20.2 and different network fabrics (Calico, Flannel) without any positive effect on the performance degradation. So, it looks like something in F33 breaks it for us. It also looks like the CPU load increased in F33.
We would be happy to run further tests to narrow down the issue, but we are out of ideas on how to proceed. Any hints or ideas would be highly appreciated.

System details

  • Bare Metal (Dell PowerEdge R630, Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz, 64 GB RAM, 2x Intel(R) 2P X520/2P I350 rNDC 10 Gbit NICs)
  • Fedora CoreOS version 33.20210201.3.0, mitigations OFF
  • Kubernetes 1.19.3-1.20.4 (https://typhoon.psdn.io)

Ignition config
https://gist.github.com/wkruse/107e3be2ffb2ead7c26a955fe8f0b0e8

Additional info

Some iperf3 test results

  • K8s 1.19.3 Calico (Version 3.16.4) FCOS 32.20201004.3.0 (mitigations off)
=========================================================
 Benchmark Results
=========================================================
 Name            : knb-300588
 Date            : 2021-02-16 14:01:35 UTC
 Generator       : knb
 Version         : 1.5.0
 Server          : node11.xxx
 Client          : node04.xxx
 UDP Socket size : auto
=========================================================
  Discovered CPU         : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
  Discovered Kernel      : 5.8.12-200.fc32.x86_64
  Discovered k8s version : v1.19.3
  Discovered MTU         : 1480
  Idle :
    bandwidth = 0 Mbit/s
    client cpu = total 0.49% (user 0.30%, nice 0.00%, system 0.18%, iowait 0.01%, steal 0.00%)
    server cpu = total 0.34% (user 0.22%, nice 0.00%, system 0.12%, iowait 0.00%, steal 0.00%)
    client ram = 1182 MB
    server ram = 1082 MB
  Pod to pod :
    TCP :
      bandwidth = 5881 Mbit/s
      client cpu = total 3.17% (user 0.30%, nice 0.00%, system 2.86%, iowait 0.01%, steal 0.00%)
      server cpu = total 5.06% (user 0.20%, nice 0.00%, system 4.86%, iowait 0.00%, steal 0.00%)
      client ram = 1209 MB
      server ram = 1102 MB
    UDP :
      bandwidth = 2076 Mbit/s
      client cpu = total 3.72% (user 0.32%, nice 0.00%, system 3.40%, iowait 0.00%, steal 0.00%)
      server cpu = total 2.64% (user 0.35%, nice 0.00%, system 2.29%, iowait 0.00%, steal 0.00%)
      client ram = 1205 MB
      server ram = 1091 MB
  Pod to Service :
    TCP :
      bandwidth = 6215 Mbit/s
      client cpu = total 3.11% (user 0.30%, nice 0.00%, system 2.81%, iowait 0.00%, steal 0.00%)
      server cpu = total 4.61% (user 0.20%, nice 0.00%, system 4.41%, iowait 0.00%, steal 0.00%)
      client ram = 1217 MB
      server ram = 1101 MB
    UDP :
      bandwidth = 1618 Mbit/s
      client cpu = total 3.42% (user 0.33%, nice 0.00%, system 3.09%, iowait 0.00%, steal 0.00%)
      server cpu = total 1.91% (user 0.30%, nice 0.00%, system 1.61%, iowait 0.00%, steal 0.00%)
      client ram = 1214 MB
      server ram = 1099 MB
=========================================================
  • K8s 1.20.2 Calico (Version 3.17.2) FCOS 33.20210201.3.0 (mitigations off)
=========================================================
 Benchmark Results
=========================================================
 Name            : knb-418856
 Date            : 2021-02-23 07:31:34 UTC
 Generator       : knb
 Version         : 1.5.0
 Server          : node11.xxx
 Client          : node04.xxx
 UDP Socket size : auto
=========================================================
  Discovered CPU         : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz
  Discovered Kernel      : 5.10.12-200.fc33.x86_64
  Discovered k8s version : v1.20.2
  Discovered MTU         : 1480
  Idle :
    bandwidth = 0 Mbit/s
    client cpu = total 1.21% (user 0.77%, nice 0.00%, system 0.43%, iowait 0.01%, steal 0.00%)
    server cpu = total 0.82% (user 0.51%, nice 0.00%, system 0.31%, iowait 0.00%, steal 0.00%)
    client ram = 118707 MB
    server ram = 21095 MB
  Pod to pod :
    TCP :
      bandwidth = 6427 Mbit/s
      client cpu = total 5.15% (user 0.84%, nice 0.00%, system 4.30%, iowait 0.01%, steal 0.00%)
      server cpu = total 5.06% (user 0.58%, nice 0.00%, system 4.48%, iowait 0.00%, steal 0.00%)
      client ram = 118713 MB
      server ram = 21095 MB
    UDP :
      bandwidth = 2082 Mbit/s
      client cpu = total 4.04% (user 0.74%, nice 0.00%, system 3.30%, iowait 0.00%, steal 0.00%)
      server cpu = total 3.48% (user 0.75%, nice 0.00%, system 2.73%, iowait 0.00%, steal 0.00%)
      client ram = 118709 MB
      server ram = 21094 MB
  Pod to Service :
    TCP :
      bandwidth = 6370 Mbit/s
      client cpu = total 4.94% (user 0.64%, nice 0.00%, system 4.29%, iowait 0.01%, steal 0.00%)
      server cpu = total 4.78% (user 0.48%, nice 0.00%, system 4.30%, iowait 0.00%, steal 0.00%)
      client ram = 118715 MB
      server ram = 21092 MB
    UDP :
      bandwidth = 1630 Mbit/s
      client cpu = total 4.22% (user 0.77%, nice 0.00%, system 3.44%, iowait 0.01%, steal 0.00%)
      server cpu = total 3.18% (user 0.66%, nice 0.00%, system 2.52%, iowait 0.00%, steal 0.00%)
      client ram = 118712 MB
      server ram = 21092 MB
=========================================================

We also tested with different MTU sizes (1440, 1460, 1480, 1500) without any positive effect.

@wkruse changed the title from "Performance degradation on Fedora CoreOS after upgrade from F32 auf F33" to "Performance degradation on Fedora CoreOS after upgrade from F32 to F33" on Feb 26, 2021
@lucab
Contributor

lucab commented Mar 3, 2021

Thanks for the report. Does the network degradation affect host-to-host communications too, or is it only visible on the overlay networks?

From a quick look at the report, I have a few doubts:

  • the symptom is increased HTTP response times, right? This may translate into unexpected latency added by some component in the stack. I don't think iperf is a good tool to measure that, as it mostly benchmarks bandwidth.
  • the iperf report doesn't look too worrisome to me. There is some fluctuation in the CPU and bandwidth numbers, but it may fall within normal variance across experiments. (There is a huge difference in the RAM numbers, though that may not be relevant.)
  • I see a stacked config with teaming + virtual-IP. Make sure they aren't somehow flapping under load, as that may result in jitter at the application level.
  • I see a proxy setup. Make sure those connections aren't being wrongly proxied (e.g. no_proxy vs NO_PROXY env variables).
  • I see your sysctls are maxing out the tx/rx buffers, which usually trades latency for bandwidth. Double-check whether you really benefit from that, and whether tcp_low_latency or Nagle's algorithm (TCP_NODELAY) plays a role here (see the sketch after this list for reading the effective values).
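
As a rough way to check, the currently effective values can be read straight from /proc/sys (the same values `sysctl <key>` reports); a minimal sketch, assuming the keys worth looking at are the usual buffer-size tunables plus tcp_low_latency:

```python
#!/usr/bin/env python3
# Print the currently effective values of a few network sysctls by reading
# them from /proc/sys (equivalent to running `sysctl <key>` for each one).
from pathlib import Path

# Assumed list of keys; adjust to match whatever the override file actually sets.
KEYS = [
    "net.core.rmem_max",
    "net.core.wmem_max",
    "net.ipv4.tcp_rmem",
    "net.ipv4.tcp_wmem",
    "net.ipv4.tcp_low_latency",
]

for key in KEYS:
    path = Path("/proc/sys") / key.replace(".", "/")
    try:
        value = path.read_text().strip()
    except OSError as err:
        value = f"<unreadable: {err}>"
    print(f"{key} = {value}")
```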

Making a random guess, DNS is the usual suspect when there is a sudden noticeable increase in latency. The common scenario is that, without aggressive connection reuse, something in the stack tries to perform a DNS resolution on each new connection. It may not be noticeable in most cases, but it may suddenly spike due to changes in how the stack handles caching / search-domains / retries / negative responses / etc.
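
One rough way to check that is to time repeated resolutions of a service name directly from a node or from inside a pod; a minimal sketch, where the hostname is only a placeholder:

```python
#!/usr/bin/env python3
# Time repeated DNS resolutions of the same name. A first lookup that is much
# slower than the warm ones, or high variance overall, hints at resolver,
# search-domain or negative-caching overhead on each new connection.
import socket
import time

HOSTNAME = "my-service.default.svc.cluster.local"  # placeholder, point at a real service

timings = []
for i in range(10):
    start = time.perf_counter()
    socket.getaddrinfo(HOSTNAME, 80, proto=socket.IPPROTO_TCP)
    elapsed_ms = (time.perf_counter() - start) * 1000
    timings.append(elapsed_ms)
    print(f"lookup {i}: {elapsed_ms:.2f} ms")

print(f"min {min(timings):.2f} ms, max {max(timings):.2f} ms")
```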

@wkruse
Author

wkruse commented Mar 5, 2021

Thank you for the quick response. We didn't test host-to-host, as according to iperf3 there was no network bandwidth degradation.

  • the symptoms are increased response times and increased CPU load; it is difficult to pin this on either the network or the CPU, as each affects the other, and our system is sensitive to any additional latency or CPU cycles
  • we repeated our load test on F33 without teaming (static IP configuration, 1x NIC) without any improvement (the virtual IPs are only used to expose cluster services and are not used during our load test; we checked the logs and didn't see any flapping)
  • our proxy is only used for bootstrapping the cluster, but we removed the proxy settings completely and repeated our load test on F33 without any improvement
  • we removed /etc/sysctl.d/90-override.conf with the sysctl network tunings completely and repeated our load test on F33 without any improvement (we had tuned for bandwidth a while ago and just kept the settings around)

Regarding DNS, we are looking into it. Our applications are actually geared towards aggressive connection reuse via connection pooling. We will try to collect some metrics to prove it. We also collected flame graphs on F32 and F33 using perf tools to detect deviations between the OS versions, but we didn't find significant differences. For us it looks like everything runs a bit slower in F33.

@wkruse
Author

wkruse commented Mar 12, 2021

We enabled DNS metrics and compared them between F32 and F33 without finding any significant differences, in particular in the number of packets and the response times.

To illustrate the gap between F32 and F33, have a look at the CPU load (yellow is F33, blue is F32) of two workers processing the same type of jobs

[image: CPU load of the two workers, F33 (yellow) vs F32 (blue)]

and the throughput (6 workers in total: 5 running F32 and 1 running F33, each pair processing the same type of job)

[image: throughput of the six workers, F32 vs F33]

But maybe we are onto something (which could also explain the performance drop we saw when moving from CoreOS to Fedora CoreOS, #542 - not the network part, but the overall performance part).

During our investigation of low-level OS settings and metrics (scheduling, executed time slices per CPU, system and user load, etc.) we discovered that /sys/devices/system/cpu/intel_pstate/min_perf_pct was set to 35. After changing it to 100 we got better results in both the F32 and F33 load tests (while still retaining the performance gap between F32 and F33). The scheduling also changed between F32 and F33 (due to the Linux kernel upgrade); could that be an issue?
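
For reference, this is roughly how the value can be inspected and raised (a minimal sketch; writing requires root, the change is not persistent across reboots, and the path only exists when the intel_pstate driver is active):

```python
#!/usr/bin/env python3
# Inspect (and optionally raise) intel_pstate's minimum performance limit,
# i.e. read/write /sys/devices/system/cpu/intel_pstate/min_perf_pct.
import sys
from pathlib import Path

MIN_PERF = Path("/sys/devices/system/cpu/intel_pstate/min_perf_pct")

current = int(MIN_PERF.read_text())
print(f"min_perf_pct = {current}")

if "--set-100" in sys.argv:
    MIN_PERF.write_text("100")  # needs root; not persistent across reboots
    print(f"min_perf_pct = {MIN_PERF.read_text().strip()} (updated)")
```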

Could it be that the CoreOS defaults were optimized for maximum server performance, while Fedora CoreOS is geared more towards desktop workloads and energy efficiency? Do you have any recommendations for optimizing for maximum server performance and minimal latency?

@wkruse
Author

wkruse commented Mar 26, 2021

Re-running our load and performance tests several times with 33.20210301.3.1 and Typhoon Kubernetes v1.20.5, we got the same results as on F32. It looks like the scheduling had changed, and in the latest version it behaves as it did in F32.

@wkruse closed this as completed Mar 26, 2021