---
layout: blog
title: 'New conversion from cgroup v1 CPU shares to v2 CPU weight'
date: 2025-10-25T05:00:00-08:00
draft: true
math: true
slug: new-cgroup-v1-to-v2-cpu-conversion-formula
author: >
  [Itamar Holder](https://github.com/iholder101) (Red Hat)
---

I'm excited to announce the implementation of an improved conversion formula
from cgroup v1 CPU shares to cgroup v2 CPU weight. This enhancement addresses
critical issues with CPU priority allocation for Kubernetes workloads
running on systems with cgroup v2.

## Background

Kubernetes was originally designed with cgroup v1 in mind, where a container's
CPU shares were derived directly from its CPU request in millicpu form.

For example, a container requesting 1 CPU would get \(cpu.shares = 1024\).

Over time, cgroup v1 started being replaced by its successor,
cgroup v2. In cgroup v2, the concept of CPU shares (which ranges from 2 to
262144, or \(2^1\) to \(2^{18}\)) was replaced with CPU weight (which ranges
from 1 to 10000, or \(10^0\) to \(10^4\)).

With the transition to cgroup v2,
[KEP-2254](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2)
introduced a conversion formula to map cgroup v1 CPU shares to cgroup v2 CPU
weight. The conversion formula was defined as:

```math
cpu.weight = (1 + ((cpu.shares - 2) * 9999) / 262142)
```

This formula linearly maps the range \([2^1, 2^{18}]\) to \([10^0, 10^4]\).
![Linear conversion from cgroup v1 CPU shares to cgroup v2 CPU weight](2025-10-25-new-cgroup-v1-to-v2-conversion-formula-linear-conversion.png)
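To make the linear mapping concrete, here is a small Go sketch (my own illustration, not code from Kubernetes or any container runtime) that applies the formula above with integer arithmetic:

```go
package main

import "fmt"

// linearWeight applies the original KEP-2254 linear conversion from
// cgroup v1 CPU shares to cgroup v2 CPU weight.
func linearWeight(shares uint64) uint64 {
	return 1 + ((shares-2)*9999)/262142
}

func main() {
	// A 1-CPU container (cpu.shares = 1024) ends up with a weight of 39,
	// well below cgroup v2's default weight of 100.
	fmt.Println(linearWeight(1024)) // 39
	// A 100m container (cpu.shares = 102) ends up with a weight of only 4.
	fmt.Println(linearWeight(102)) // 4
}
```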

While this approach is simple, the linear mapping introduces significant
problems that impact both scheduling priority and configuration granularity.

## Problems with the Previous Conversion Formula

The previous conversion formula created two major issues:

### 1. Reduced Priority Against Non-Kubernetes Workloads

In cgroup v1, the default CPU shares value is `1024`, meaning a container
requesting 1 CPU has equal priority with system processes that live outside
of Kubernetes' scope.
In cgroup v2, however, the default CPU weight is `100`, and the previous
formula converts a 1 CPU request to a weight of only `~39`, less than 40% of
the default.

**Example:**
- Container requesting 1 CPU
- cgroup v1: `cpu.shares = 1024` (equal to the default)
- cgroup v2 (previous formula): `cpu.weight = 39` (much lower than the default of 100)

This means that after moving to cgroup v2, Kubernetes (or, more broadly, OCI)
workloads would de facto have reduced CPU priority relative to non-Kubernetes
processes. The problem can be severe for setups with many system daemons that
run outside of Kubernetes' scope and expect Kubernetes workloads to have
priority, especially in situations of resource starvation.

### 2. Unmanageable Granularity

The previous formula produces very low values for small CPU requests,
limiting the ability to create sub-cgroups within containers for
fine-grained resource distribution (which will possibly become much easier
going forward; see [KEP #5474](https://github.com/kubernetes/enhancements/issues/5474) for more info).

**Example:**
- Container requesting 100m CPU
- cgroup v1: `cpu.shares = 102`
- cgroup v2 (previous formula): `cpu.weight = 4` (too low for sub-cgroup configuration)

With cgroup v1, requesting 100m CPU, which led to 102 CPU shares, was
manageable in the sense that sub-cgroups could be created inside the main
container, assigning fine-grained CPU priorities to different groups of
processes. With cgroup v2, however, a weight of 4 is very hard to
distribute between sub-cgroups since it's not granular enough.

With plans to allow [writable cgroups for unprivileged containers](https://github.com/kubernetes/enhancements/issues/5474),
this becomes even more relevant.

## New Conversion Formula

### Description

The new formula is more complicated, but it does a much better job of mapping
cgroup v1 CPU shares to cgroup v2 CPU weight:

```math
cpu.weight = \left\lceil 10^{\frac{L^2}{612} + \frac{125L}{612} - \frac{7}{34}} \right\rceil, \quad \text{where } L = \log_2(cpu.shares)
```

The idea is that this is a quadratic function (in \(L\), the base-2 logarithm
of the shares) chosen to pass through the following points:
- (2, 1): the minimum values of both ranges.
- (1024, 100): the default values of both ranges.
- (262144, 10000): the maximum values of both ranges.
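As a quick sanity check of the middle anchor point (my own working, not from the original post): since \(7/34 = 126/612\), the exponent can be combined over the common denominator 612, and for the default request it evaluates exactly:

```math
L = \log_2(1024) = 10
\;\Rightarrow\;
\frac{L^2 + 125L - 126}{612} = \frac{100 + 1250 - 126}{612} = \frac{1224}{612} = 2
\;\Rightarrow\;
cpu.weight = \lceil 10^{2} \rceil = 100
```

The same substitution with \(L = 1\) and \(L = 18\) yields exponents of 0 and 4, i.e. weights of 1 and 10000, confirming the other two anchor points.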

Visually, the new function looks as follows:
![New conversion from cgroup v1 CPU shares to cgroup v2 CPU weight](2025-10-25-new-cgroup-v1-to-v2-conversion-formula-new-conversion.png)

And if we zoom in to the important part:
![Zoomed-in view of the new conversion function](2025-10-25-new-cgroup-v1-to-v2-conversion-formula-new-conversion-zoom.png)

The new formula is "close to linear", yet it is deliberately designed so
that the curve passes through the three important points listed above.
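For readers who want to experiment, here is a small Go sketch of the new conversion (my own illustration, not the exact runc or crun source). The exponent is combined over the common denominator 612 so that the three anchor points evaluate exactly in floating point:

```go
package main

import (
	"fmt"
	"math"
)

// cpuWeightV2 converts cgroup v1 CPU shares to cgroup v2 CPU weight
// using the new quadratic-in-log2 conversion formula described above.
func cpuWeightV2(shares uint64) uint64 {
	if shares == 0 {
		return 0 // no shares configured; nothing to convert
	}
	l := math.Log2(float64(shares))
	// exponent = L^2/612 + 125L/612 - 7/34, written over the common
	// denominator 612 (7/34 = 126/612) so the anchor points are exact.
	exponent := (l*l + 125*l - 126) / 612
	return uint64(math.Ceil(math.Pow(10, exponent)))
}

func main() {
	for _, shares := range []uint64{2, 102, 1024, 262144} {
		fmt.Printf("cpu.shares=%6d -> cpu.weight=%d\n", shares, cpuWeightV2(shares))
	}
}
```

Running this prints weights of 1, 17, 100, and 10000 for the minimum, 100m, 1 CPU, and maximum cases respectively.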

### How It Solves the Problems

1. **Better priority alignment:**
   - A container requesting 1 CPU will now get `cpu.weight = 100`, matching
     cgroup v2's default.
   - This restores the intended priority relationship between Kubernetes
     workloads and system processes.

2. **Improved granularity:**
   - A container requesting 100m CPU will now get `cpu.weight = 17` (see
     [this playground example](https://go.dev/play/p/sLlAfCg54Eg)).
   - This enables finer-grained resource distribution within containers.

## Adoption and integration

This change was implemented at the OCI runtime level.
In other words, it is not implemented in Kubernetes itself; adoption of the
new conversion formula therefore depends solely on the version of the OCI
runtime in use.

For example:
* runc: The new formula is enabled from version [1.3.2](https://github.com/opencontainers/runc/releases/tag/v1.3.2).
* crun: The new formula is enabled from version [1.23](https://github.com/containers/crun/releases/tag/1.23).

### Impact on Existing Deployments

**Important:** Some consumers may be affected if they assume the older linear conversion formula.
Applications or monitoring tools that directly calculate expected CPU weight values based on the
previous formula may need updates to account for the new quadratic conversion.
This is particularly relevant for:

- Custom resource management tools that predict CPU weight values.
- Monitoring systems that validate or expect specific weight values.
- Applications that programmatically set or verify CPU weight values.

We recommend testing the new conversion formula in non-production environments before
upgrading OCI runtimes to ensure compatibility with existing tooling.

## Where Can I Learn More?

For those interested in this enhancement:

- [Kubernetes GitHub Issue #131216](https://github.com/kubernetes/kubernetes/issues/131216) -
  Detailed technical analysis and examples, including discussions and the
  reasoning for choosing the above formula.
- [KEP-2254: cgroup v2](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2) -
  The original cgroup v2 implementation in Kubernetes.
- [Kubernetes cgroup documentation](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) -
  Current resource management guidance.

## How Do I Get Involved?

For those interested in getting involved with Kubernetes node-level
features, join the [Kubernetes Node Special Interest Group](https://github.com/kubernetes/community/tree/master/sig-node).
We always welcome new contributors and diverse perspectives on resource management
challenges.
