Add tcp window clamp example #172
Conversation
Well behaving download from the linked report:

```
ivan@vm:~$ curl -o /dev/null http://localhost:4444
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3616M    0 3616M    0     0  33.7M      0 --:--:--  0:01:47 --:--:-- 44.7M
```

When it gets clamped:

```
ivan@vm:~$ curl -o /dev/null http://localhost:4444
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 4378M    0 4378M    0     0  30.3M      0 --:--:--  0:02:24 --:--:-- 9832k
```

The corresponding counter:

```
# HELP ebpf_exporter_tcp_window_clamps_total Number of times that TCP window was clamped to a low value
# TYPE ebpf_exporter_tcp_window_clamps_total counter
ebpf_exporter_tcp_window_clamps_total 19668
```
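For context, here is a minimal sketch of what a `kprobe`/`kretprobe` pair counting these clamps could look like. It is not necessarily the code in this PR: the 64 KiB threshold, the `pid_tgid`-keyed scratch map and the counter map layout are illustrative assumptions; only the probed function and the program names (`tcp_try_rmem_schedule` / `tcp_try_rmem_schedule_ret`) are taken from the metrics quoted later in this thread. The wiring of the counter map to the exported Prometheus metric lives in the exporter's YAML config and is omitted here.

```c
// SPDX-License-Identifier: GPL-2.0
// Illustrative sketch only: threshold, map layout and key choice are
// assumptions, not necessarily what this PR actually does.
#include <vmlinux.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_core_read.h>

// Treat anything below 64 KiB as "clamped to a low value" (assumed threshold).
#define MIN_CLAMP (64 * 1024)

// Scratch map to carry the socket pointer from entry to return,
// keyed by pid_tgid (assumption; the PR may key this differently).
struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 1024);
	__type(key, u64);
	__type(value, struct sock *);
} socks SEC(".maps");

// Counter that the exporter config would surface as
// ebpf_exporter_tcp_window_clamps_total.
struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, u32);
	__type(value, u64);
} tcp_window_clamps_total SEC(".maps");

SEC("kprobe/tcp_try_rmem_schedule")
int BPF_KPROBE(tcp_try_rmem_schedule, struct sock *sk)
{
	u64 id = bpf_get_current_pid_tgid();

	// Remember which socket this invocation is about.
	bpf_map_update_elem(&socks, &id, &sk, BPF_ANY);
	return 0;
}

SEC("kretprobe/tcp_try_rmem_schedule")
int BPF_KRETPROBE(tcp_try_rmem_schedule_ret)
{
	u64 id = bpf_get_current_pid_tgid();
	u32 zero = 0;
	struct sock **skp = bpf_map_lookup_elem(&socks, &id);
	struct tcp_sock *tp;
	u64 *count;

	if (!skp)
		return 0;

	tp = (struct tcp_sock *)*skp;
	bpf_map_delete_elem(&socks, &id);

	// rcv_ssthresh bounds the advertised receive window and is what gets
	// shrunk under rmem pressure; a low value here counts as a clamp.
	if (BPF_CORE_READ(tp, rcv_ssthresh) >= MIN_CLAMP)
		return 0;

	count = bpf_map_lookup_elem(&tcp_window_clamps_total, &zero);
	if (count)
		__sync_fetch_and_add(count, 1);

	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```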
Force-pushed from a66d87b to 9f44f30.
```c
	return 0;
}

SEC("kretprobe/tcp_try_rmem_schedule")
```
I ran into a performance issue when using `kretprobe` to probe `ipt_do_table()` on kernel 5.4.0, so I'd suggest running an LnP test before enabling this in production.
What do you mean by LNP test?
Load and performance testing, i.e. a stress test.
That's a given. I did a quick test with a kernel source download in a VM:

```
ivan@vm:~$ curl -o /dev/null https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.0.11.tar.xz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  127M  100  127M    0     0  44.5M      0  0:00:02  0:00:02 --:--:-- 44.5M
```
I see ~4k packets going via iptables for this:
```
[3973:134454161] -A INPUT -i eth0 -p tcp -m tcp --sport 443 -j ACCEPT
```
There are 1286 bpf program runs and ~8ms of CPU time spent:
```
# HELP ebpf_exporter_ebpf_program_info Info about ebpf programs
# TYPE ebpf_exporter_ebpf_program_info gauge
ebpf_exporter_ebpf_program_info{config="tcp-window-clamps",id="552",program="tcp_try_rmem_schedule",tag="f14b021593f58e05"} 1
ebpf_exporter_ebpf_program_info{config="tcp-window-clamps",id="553",program="tcp_try_rmem_schedule_ret",tag="d88afa963de02adb"} 1
# HELP ebpf_exporter_ebpf_program_run_count_total How many times has the program been executed
# TYPE ebpf_exporter_ebpf_program_run_count_total counter
ebpf_exporter_ebpf_program_run_count_total{id="552"} 1286
ebpf_exporter_ebpf_program_run_count_total{id="553"} 1286
# HELP ebpf_exporter_ebpf_program_run_time_seconds How long has the program been executing
# TYPE ebpf_exporter_ebpf_program_run_time_seconds counter
ebpf_exporter_ebpf_program_run_time_seconds{id="552"} 0.00434047
ebpf_exporter_ebpf_program_run_time_seconds{id="553"} 0.003490548
```
With an `fexit` based probe it drops to 1190 runs and 3ms of CPU time:
```
# HELP ebpf_exporter_ebpf_program_run_count_total How many times has the program been executed
# TYPE ebpf_exporter_ebpf_program_run_count_total counter
ebpf_exporter_ebpf_program_run_count_total{id="547"} 1190
# HELP ebpf_exporter_ebpf_program_run_time_seconds How long has the program been executing
# TYPE ebpf_exporter_ebpf_program_run_time_seconds counter
ebpf_exporter_ebpf_program_run_time_seconds{id="547"} 0.002882893
```
The number of runs depends on how buffers are drained.
With curl spending 711ms of combined system and user time, that comes down to 1.13% for `kprobe` + `kretprobe` and 0.42% for `fexit`. Whether that's an acceptable overhead is up to each consumer (as with any other config).
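For reference, the `fexit` variant mentioned above can fold the entry and return halves into a single program, since an `fexit` program sees both the function arguments and its return value. The sketch below reuses the illustrative `MIN_CLAMP` threshold and `tcp_window_clamps_total` map from the earlier kprobe sketch and assumes the upstream signature `int tcp_try_rmem_schedule(struct sock *sk, struct sk_buff *skb, unsigned int size)`; it is not the exact program behind the numbers above.

```c
// Illustrative fexit counterpart to the kprobe/kretprobe sketch earlier in
// this thread: one program sees both the arguments and the return value, so
// no entry/exit scratch map is needed. MIN_CLAMP and the
// tcp_window_clamps_total map are the same assumptions as before.
SEC("fexit/tcp_try_rmem_schedule")
int BPF_PROG(tcp_try_rmem_schedule_exit, struct sock *sk, struct sk_buff *skb,
	     unsigned int size, int ret)
{
	struct tcp_sock *tp = (struct tcp_sock *)sk;
	u32 zero = 0;
	u64 *count;

	// Same illustrative check as in the kretprobe sketch: a low
	// rcv_ssthresh after the call counts as a clamp.
	if (BPF_CORE_READ(tp, rcv_ssthresh) >= MIN_CLAMP)
		return 0;

	count = bpf_map_lookup_elem(&tcp_window_clamps_total, &zero);
	if (count)
		__sync_fetch_and_add(count, 1);

	return 0;
}
```

Note that `fexit` attachment requires a BTF-enabled kernel (BPF trampolines landed around 5.5), which is presumably why a `kprobe`-based example still matters for older kernels like the 5.4 mentioned above.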
@bobrik Thanks for the data.
In old kernel versions (at least 5.4.0), kretprobe acquires a global lock via `raw_spin_lock_irqsave(&rp->lock, flags)` before getting a free instance in `pre_handler_kretprobe()`. Under high traffic (many different 5-tuples, handled by different CPUs), this can trigger the issue: you can see high si (softirq) CPU usage and high latency for TCP packets.
This prevents low-tgid tasks from colliding with the CPU id, however unlikely that is.