Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cgroups support for Linux targets #96

Merged
merged 2 commits into from
Apr 9, 2020
Merged

Add cgroups support for Linux targets #96

merged 2 commits into from
Apr 9, 2020

Conversation

seanmonstar
Copy link
Owner

Closes #80

@olix0r
Copy link

olix0r commented Mar 30, 2020

Thanks @seanmonstar !

cc @lucab

@seanmonstar
Copy link
Owner Author

It may be a good idea to add a CI job that run the tests inside docker with cgroups limiting to just 1 CPU.

@seanmonstar seanmonstar force-pushed the containerized branch 12 times, most recently from 9687f43 to d03233a Compare March 31, 2020 20:57
@seanmonstar
Copy link
Owner Author

Well hey there, docker CI job testing with and without setting --cpus works!

Copy link

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't vouch for all of the details but the overall approach looks good to me!

src/linux.rs Show resolved Hide resolved
src/linux.rs Show resolved Hide resolved
src/linux.rs Outdated Show resolved Hide resolved
src/linux.rs Show resolved Hide resolved
src/linux.rs Outdated Show resolved Hide resolved
@lucab
Copy link

lucab commented Apr 1, 2020

Overall looks fine to me too, thanks!

I think there is a global question about flooring vs ceiling the CPU quota.

Additionally, this only covers cgroup v1. It may need to be re-touched at some point in the future for cgroup v2.

src/linux.rs Show resolved Hide resolved
src/linux.rs Show resolved Hide resolved
src/linux.rs Show resolved Hide resolved
@seanmonstar seanmonstar force-pushed the containerized branch 3 times, most recently from d293f04 to 4109846 Compare April 1, 2020 20:09
@strohel
Copy link

strohel commented Apr 3, 2020

This functionality seems to be perfect. In a setup with 4 physical cores and Docker container being limited to 1.0 CPU, a benchmark of a simple async actix-web based microservice resulted in 1 worker thread being run instead of 4, causing up to 50% increase in CPU efficiency:
image
Full benchmark report: https://storage.googleapis.com/strohel-num-cpus-containerized/bench-results.html

Workloads like this also ask for ceiling, rather than flooring, non-unit CPU soft-limits, as otherwise actix-web would not be able to saturate available resources (I can perform benchmark with e.g. 1.5 CPUs if desired).

This change can easily have high reach and noticeable impact (benchmark above being probably the extreme case), as num_cpus::get seems to be used in major async runtimes (and beyond), and it is probably not uncommon to limit CPU usage in container orchestration systems like Kubernetes. Maybe it would be worth a prominent announcement?

@seanmonstar seanmonstar force-pushed the containerized branch 2 times, most recently from 1ed69f1 to 9c103df Compare April 9, 2020 18:49
@seanmonstar
Copy link
Owner Author

Ok, got some tweaks in, along with ceiling the result (and a test that --cpus=1.5 for docker works as expected).

@seanmonstar seanmonstar merged commit b676af1 into master Apr 9, 2020
@seanmonstar seanmonstar deleted the containerized branch April 9, 2020 19:49
@seanmonstar
Copy link
Owner Author

@strohel mind if I use that graph in the release notes?

@strohel
Copy link

strohel commented Apr 9, 2020

@strohel mind if I use that graph in the release notes?

Sure! An SVG version (if handy) is at https://storage.googleapis.com/strohel-num-cpus-containerized/bench-results.cpu_per_request_figure.svg

It should be probably accompanied by some context from my above post in order not to create false expectations (the same efficiency increase could be achieved by manually setting ideal number of workers; I still think this is a huge win as people probably don't do it).

}

fn init_cgroups() {
// Should only be called once
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the cgroup limits are changed while a program is running?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Test/document cgroups behavior
6 participants