Add cfs-throttling example #311

Merged 1 commit on Oct 24, 2023

bobrik commented Oct 24, 2023

Example throttled program:

ivan@vm:~$ sudo systemd-run --pty --quiet --collect --unit stress-ng.service --property CPUQuota=8% stress-ng --cpu 1
stress-ng: info:  [70953] defaulting to a 86400 second (1 day, 0.00 secs) run per stressor
stress-ng: info:  [70953] dispatching hogs: 1 cpu
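
As a rough expectation for a single busy thread under `CPUQuota=8%`, the arithmetic can be sketched as below. The 100 ms period is an assumption (it is the usual default and is not shown in the command above), and `per_period_budget` is a hypothetical helper name.

```python
# Sketch: expected runtime and throttled time per CFS period for one busy thread.
# PERIOD_MS is an assumption (the usual 100 ms default); the PR does not state it.
PERIOD_MS = 100.0


def per_period_budget(quota_percent, period_ms=PERIOD_MS):
    """Return (runtime_ms, throttled_ms) for a single always-runnable thread."""
    runtime_ms = period_ms * quota_percent / 100.0
    # Once the quota is used up, the thread waits for the next period to start.
    throttled_ms = max(period_ms - runtime_ms, 0.0)
    return runtime_ms, throttled_ms


runtime_ms, throttled_ms = per_period_budget(8)
print(f"run ~{runtime_ms:.0f} ms, throttled ~{throttled_ms:.0f} ms per period")
```

Under that assumption, each period has about 8 ms of runtime followed by about 92 ms of throttling.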

One can observe throttling with bpftrace:

ivan@vm:~/projects/ebpf_exporter$ sudo bpftrace -e 'kprobe:unthrottle_cfs_rq { $cfs_rq = (struct cfs_rq *) arg0; $throttled_nsec = $cfs_rq->rq->clock - $cfs_rq->throttled_clock; printf("unthrottle after %4dms (overall: %6lums): %s\n", $throttled_nsec / 1000000, $cfs_rq->tg->cfs_bandwidth.throttled_time / 1000000, cgroup_path($cfs_rq->tg->css.cgroup->kn->id)); }'
Attaching 1 probe...
unthrottle after   95ms (overall: 520871ms): unified:/system.slice/stress-ng.service
unthrottle after   91ms (overall: 520966ms): unified:/system.slice/stress-ng.service
unthrottle after   92ms (overall: 521058ms): unified:/system.slice/stress-ng.service
unthrottle after   88ms (overall: 521150ms): unified:/system.slice/stress-ng.service
unthrottle after   91ms (overall: 521239ms): unified:/system.slice/stress-ng.service
unthrottle after   96ms (overall: 521330ms): unified:/system.slice/stress-ng.service
unthrottle after   87ms (overall: 521426ms): unified:/system.slice/stress-ng.service
unthrottle after   95ms (overall: 521514ms): unified:/system.slice/stress-ng.service
unthrottle after   90ms (overall: 521610ms): unified:/system.slice/stress-ng.service
unthrottle after   90ms (overall: 521700ms): unified:/system.slice/stress-ng.service

The values here also match what cpu.stat is reporting:

ivan@vm:~/projects/ebpf_exporter$ cat /sys/fs/cgroup/system.slice/stress-ng.service/cpu.stat
usage_usec 45447035
user_usec 45378526
system_usec 68508
nr_periods 5681
nr_throttled 5680
throttled_usec 522346085
nr_bursts 0
burst_usec 0
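
A quick sanity check of these counters, as a minimal sketch: it parses the `cpu.stat` text quoted above and derives the average throttled time per throttled period and the share of CPU time used. `PERIOD_USEC` is an assumption (100 ms, the usual default), and `parse_cpu_stat` is a hypothetical helper name.

```python
# Sketch: derive per-period figures from the cpu.stat values quoted above.
CPU_STAT = """\
usage_usec 45447035
user_usec 45378526
system_usec 68508
nr_periods 5681
nr_throttled 5680
throttled_usec 522346085
nr_bursts 0
burst_usec 0
"""

PERIOD_USEC = 100_000  # assumption: default 100 ms enforcement period


def parse_cpu_stat(text):
    """Parse 'key value' lines from a cgroup v2 cpu.stat file into a dict of ints."""
    pairs = (line.split() for line in text.splitlines() if line.strip())
    return {key: int(value) for key, value in pairs}


stats = parse_cpu_stat(CPU_STAT)

# Average time spent throttled in each period that hit the quota.
avg_throttle_ms = stats["throttled_usec"] / stats["nr_throttled"] / 1000

# Fraction of wall-clock time spent running, under the assumed period length.
cpu_share = stats["usage_usec"] / (stats["nr_periods"] * PERIOD_USEC)

print(f"average throttle: {avg_throttle_ms:.1f} ms, cpu share: {cpu_share:.1%}")
```

With these inputs the average is about 92 ms per throttled period and the CPU share is about 8%, matching the `CPUQuota=8%` setting. The total of 522346 ms is slightly ahead of the last `overall` value printed by `bpftrace`, as expected if `cpu.stat` was read a moment later.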

Example output from ebpf_exporter:

ivan@vm:~/projects/ebpf_exporter$ curl -s http://ip6-localhost:9435/metrics | fgrep cfs
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="1e-06"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="2e-06"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="4e-06"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="8e-06"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="1.6e-05"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="3.2e-05"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="6.4e-05"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.000128"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.000256"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.000512"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.001024"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.002048"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.004096"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.008192"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.016384"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.032768"} 18
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.065536"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.131072"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.262144"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.524288"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="1.048576"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="+Inf"} 168
ebpf_exporter_cfs_throttling_seconds_sum{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service"} 14.407136
ebpf_exporter_cfs_throttling_seconds_count{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service"} 168
ebpf_exporter_ebpf_program_info{config="cfs-throttling",id="1312",program="unthrottle_cfs_rq",tag="3887b84da035bf7a"} 1
ebpf_exporter_enabled_configs{name="cfs-throttling"} 1
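
The cumulative buckets above can be turned into per-bucket counts and a mean with a few lines of Python. This is only a sketch over a subset of the lines quoted above; the regular expressions and the `summarize` helper are illustrative and not part of `ebpf_exporter`.

```python
import re

# Subset of the metrics output quoted above (values copied verbatim).
METRICS = """\
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.016384"} 0
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.032768"} 18
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.065536"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="0.131072"} 168
ebpf_exporter_cfs_throttling_seconds_bucket{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service",le="+Inf"} 168
ebpf_exporter_cfs_throttling_seconds_sum{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service"} 14.407136
ebpf_exporter_cfs_throttling_seconds_count{cgroup="/sys/fs/cgroup/system.slice/stress-ng.service"} 168
"""

BUCKET_RE = re.compile(r'_bucket\{.*le="([^"]+)"\}\s+(\d+)$')
SUM_RE = re.compile(r'_sum\{[^}]*\}\s+(\S+)$')
COUNT_RE = re.compile(r'_count\{[^}]*\}\s+(\d+)$')


def summarize(text):
    """Return (per_bucket_counts, mean_seconds) from Prometheus histogram lines."""
    cumulative, total, count = [], None, None
    for line in text.splitlines():
        bucket = BUCKET_RE.search(line)
        if bucket:
            cumulative.append((float(bucket.group(1)), int(bucket.group(2))))
            continue
        total_match = SUM_RE.search(line)
        if total_match:
            total = float(total_match.group(1))
            continue
        count_match = COUNT_RE.search(line)
        if count_match:
            count = int(count_match.group(1))
    # Prometheus buckets are cumulative, so subtract the previous running total.
    per_bucket, previous = {}, 0
    for upper, running in sorted(cumulative):
        per_bucket[upper] = running - previous
        previous = running
    return per_bucket, total / count


per_bucket, mean_seconds = summarize(METRICS)
print(per_bucket, f"mean: {mean_seconds * 1000:.1f} ms")
```

On these numbers, 18 observations fall in the `le="0.032768"` bucket and 150 in `le="0.065536"`, with a mean near 86 ms. That mean is roughly in line with the `bpftrace` durations, but it sits above the `0.065536` boundary that already holds all 168 observations. How the exporter assigns a value to a power-of-two bucket is not shown in this thread, so the bucket labels should be read with some care.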

cc @pims

bobrik merged commit 2d7f834 into cloudflare:master on Oct 24, 2023 (17 checks passed).
bobrik deleted the ivan/cfs-throttling branch on October 24, 2023 02:56.

SEC("fentry/unthrottle_cfs_rq")
int BPF_PROG(unthrottle_cfs_rq, struct cfs_rq *cfs_rq)
{
    u64 throttled_us = (cfs_rq->rq->clock - cfs_rq->throttled_clock) / 1000;

cfs_rq->throttled_clock can be zero, at least in newer kernels.
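
To illustrate why a zero timestamp matters for the subtraction in the snippet above, here is a small Python model of the unsigned 64-bit arithmetic. It is a sketch only, not the BPF code from this PR and not the kernel code the comment refers to. The clock values are hypothetical, and skipping the sample is just one possible way to handle the zero case.

```python
# Sketch: model the u64 subtraction and guard against a zero start timestamp.
U64_MASK = (1 << 64) - 1


def throttled_us(rq_clock_ns, throttled_clock_ns):
    """Return the throttled duration in microseconds, or None if no start was recorded."""
    if throttled_clock_ns == 0:
        # No start timestamp: the difference would be the whole rq clock value.
        return None
    return ((rq_clock_ns - throttled_clock_ns) & U64_MASK) // 1000


rq_clock_ns = 600_000_000_000  # hypothetical: rq clock at 600 s

# Without the guard, a zero timestamp turns into a 600 s "throttle".
naive_us = ((rq_clock_ns - 0) & U64_MASK) // 1000

print(naive_us, throttled_us(rq_clock_ns, 0), throttled_us(rq_clock_ns, 599_908_000_000))
```

Without a guard, such a sample would land far beyond the largest finite histogram bucket (about 1.05 s) and would inflate the histogram sum.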


bobrik commented Oct 24, 2023

Thanks to some tracing (#297, not yet committed demo), it is clear that things were not as they seemed:

ivan@vm:~/projects/ebpf_exporter$ sudo systemd-run --pty --quiet --collect --unit demo.service --property CPUQuota=7% ./demo
[image]

Even with GOMAXPROCS=1 there might be multiple spans:

ivan@vm:~/projects/ebpf_exporter$ sudo systemd-run --pty --quiet --collect --unit demo.service --property CPUQuota=7% --property Environment=GOMAXPROCS=1 ./demo
[image]

Some more work is needed to properly understand this first.
