---
layout: blog
title: 'New Conversion from cgroup v1 CPU Shares to v2 CPU Weight'
date: 2025-10-25T05:00:00-08:00
draft: true
math: true
slug: new-cgroup-v1-to-v2-cpu-conversion-formula
author: >
  [Itamar Holder](https://github.com/iholder101) (Red Hat)
---

I'm excited to announce the implementation of an improved conversion formula
from cgroup v1 CPU shares to cgroup v2 CPU weight. This enhancement addresses
critical issues with CPU priority allocation for Kubernetes workloads when
running on systems with cgroup v2.

## Background

Kubernetes was originally designed with cgroup v1 in mind, where CPU shares
were derived directly from the container's CPU request, at a rate of 1024
shares per requested CPU.

For example, a container requesting 1 CPU would get \(cpu.shares = 1024\).
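
The cgroup v1 request-to-shares mapping can be sketched as below. This is a
simplified reconstruction of the kubelet's behavior, not its exact code; the
constant names and the 2-share floor (cgroup v1's documented minimum) are
assumptions:

```go
package main

import "fmt"

// milliCPUToShares sketches how a CPU request in millicpu maps to cgroup v1
// CPU shares: 1024 shares per full CPU, clamped to cgroup v1's minimum of 2.
// Simplified reconstruction, not the kubelet's exact implementation.
func milliCPUToShares(milliCPU int64) int64 {
	const (
		sharesPerCPU  = 1024 // cgroup v1 default shares for one CPU
		milliCPUToCPU = 1000 // millicpu units per CPU
		minShares     = 2    // cgroup v1 minimum
	)
	shares := milliCPU * sharesPerCPU / milliCPUToCPU
	if shares < minShares {
		return minShares
	}
	return shares
}

func main() {
	fmt.Println(milliCPUToShares(1000)) // 1 CPU -> 1024 shares
	fmt.Println(milliCPUToShares(100))  // 100m  -> 102 shares
}
```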

After a while, cgroup v1 started being replaced by its successor,
cgroup v2. In cgroup v2, the concept of CPU shares (which ranges over
[2, 262144], or 2¹ to 2¹⁸) was replaced with CPU weight (which ranges over
[1, 10000], or 10⁰ to 10⁴).

With the transition to cgroup v2,
[KEP-2254](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2)
introduced a conversion formula to map cgroup v1 CPU shares to cgroup v2 CPU
weight. The conversion formula was defined as: `cpu.weight = (1 + ((cpu.shares - 2) * 9999) / 262142)`

This formula linearly maps values from [2¹, 2¹⁸] to [10⁰, 10⁴].
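
In integer arithmetic, as an OCI runtime would apply it, the linear conversion
can be sketched in Go:

```go
package main

import "fmt"

// linearSharesToWeight implements the KEP-2254 linear conversion from
// cgroup v1 CPU shares to cgroup v2 CPU weight, using integer arithmetic.
func linearSharesToWeight(shares uint64) uint64 {
	return 1 + ((shares - 2) * 9999 / 262142)
}

func main() {
	fmt.Println(linearSharesToWeight(2))      // minimum -> 1
	fmt.Println(linearSharesToWeight(1024))   // 1 CPU   -> 39
	fmt.Println(linearSharesToWeight(262144)) // maximum -> 10000
}
```

Note how 1024 shares, the cgroup v1 default, lands at 39, well below cgroup
v2's default weight of 100; this is the root of the first problem below.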

![Linear conversion formula](./linear-conversion.png "formula")

While this approach is simple, the linear mapping introduces significant
problems, impacting both CPU prioritization and configuration granularity.

## Problems with the previous conversion formula

The previous conversion formula creates two major issues:

### 1. Reduced priority against non-Kubernetes workloads

In cgroup v1, the default value for CPU shares is `1024`, meaning a container
requesting 1 CPU has equal priority with system processes that live outside
of Kubernetes' scope.
However, in cgroup v2, the default CPU weight is `100`, and the previous
formula converts 1 CPU to only `≈39` weight - less than 40% of the
default.

**Example:**
- Container requesting 1 CPU
- cgroup v1: `cpu.shares = 1024` (equal to default)
- cgroup v2 (previous formula): `cpu.weight = 39` (much lower than default 100)

This means that after moving to cgroup v2, Kubernetes (or OCI) workloads
would de-facto have reduced CPU priority relative to non-Kubernetes processes. The
problem can be severe for setups with many system daemons that run
outside of Kubernetes' scope and expect Kubernetes workloads to have
priority, especially in situations of resource starvation.

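Under contention, the CPU time a cgroup receives is proportional to its weight
relative to the sum of its siblings' weights. A minimal sketch of that
arithmetic shows the drop (the `share` helper is illustrative, not an existing
API):

```go
package main

import "fmt"

// share returns the fraction of CPU time a cgroup receives under full
// contention: its own weight divided by the total weight of all siblings.
func share(own float64, siblings ...float64) float64 {
	total := own
	for _, s := range siblings {
		total += s
	}
	return own / total
}

func main() {
	// cgroup v1: a 1-CPU container (1024 shares) vs a system slice at the
	// default 1024 shares - an even 50/50 split.
	fmt.Printf("v1: %.0f%%\n", 100*share(1024, 1024)) // 50%

	// cgroup v2 with the previous formula: weight 39 vs the default 100 -
	// the container's fraction drops to roughly 28%.
	fmt.Printf("v2 (previous formula): %.0f%%\n", 100*share(39, 100)) // 28%
}
```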
### 2. Unmanageable granularity

The previous formula produces very low values for small CPU requests,
limiting the ability to create sub-cgroups within containers for
fine-grained resource distribution (which will possibly become much easier
going forward; see [KEP #5474](https://github.com/kubernetes/enhancements/issues/5474) for more info).

**Example:**
- Container requesting 100m CPU
- cgroup v1: `cpu.shares = 102`
- cgroup v2 (previous formula): `cpu.weight = 4` (too low for sub-cgroup configuration)

With cgroup v1, requesting 100m CPU, which led to 102 CPU shares, was manageable
in the sense that sub-cgroups could be created inside the main
container, assigning fine-grained CPU priorities to different groups of
processes. With cgroup v2, however, a weight of 4 is very hard to
distribute between sub-cgroups since it's not granular enough.

With plans to allow [writable cgroups for unprivileged containers](https://github.com/kubernetes/enhancements/issues/5474),
this becomes even more relevant.

## New conversion formula

### Description

The new formula is more complicated, but does a much better job mapping
between cgroup v1 CPU shares and cgroup v2 CPU weight:

```math
cpu.weight = \lceil 10^{(L^{2}/612 + 125L/612 - 7/34)} \rceil, \text{ where: } L = \log_2(cpu.shares)
```

The idea is to use a quadratic function that crosses the following points:
- (2, 1): The minimum values for both ranges.
- (1024, 100): The default values for both ranges.
- (262144, 10000): The maximum values for both ranges.

Visually, the new function looks as follows:

![New conversion formula](./new-conversion-formula.png)

And if you zoom in to the important part:

![New conversion formula, zoomed in](./new-conversion-formula-zoom.png)

The new formula is "close to linear", yet it is carefully designed so that
the mapping passes through the three important points listed above.

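As a sketch, the new mapping can be implemented like this. This is not the
exact runtime code, and floating-point rounding can nudge a boundary value
(such as 1024 shares, which sits exactly on the curve) by one:

```go
package main

import (
	"fmt"
	"math"
)

// quadraticSharesToWeight sketches the new conversion: with
// L = log2(shares), the weight is ceil(10^(L²/612 + 125L/612 - 7/34)).
func quadraticSharesToWeight(shares uint64) uint64 {
	l := math.Log2(float64(shares))
	exponent := l*l/612 + 125*l/612 - 7.0/34
	return uint64(math.Ceil(math.Pow(10, exponent)))
}

func main() {
	fmt.Println(quadraticSharesToWeight(2))    // minimum -> 1
	fmt.Println(quadraticSharesToWeight(102))  // 100m request (102 shares) -> 17
	fmt.Println(quadraticSharesToWeight(1024)) // 1 CPU -> ~100, the cgroup v2 default
}
```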
### How it solves the problems

1. **Better priority alignment:**
   A container requesting 1 CPU will now get `cpu.weight = 102`. This
   value is close to cgroup v2's default of 100.
   This restores the intended priority relationship between Kubernetes
   workloads and system processes.

2. **Improved granularity:**
   A container requesting 100m CPU will get `cpu.weight = 17` (see
   [here](https://go.dev/play/p/sLlAfCg54Eg)).
   This enables better fine-grained resource distribution within containers.

## Adoption and integration

This change was implemented at the OCI layer.
In other words, it is not implemented in Kubernetes itself; therefore,
adoption of the new conversion formula depends solely on the OCI runtime in use.

For example:
* runc: The new formula is enabled from version [1.3.2](https://github.com/opencontainers/runc/releases/tag/v1.3.2).
* crun: The new formula is enabled from version [1.23](https://github.com/containers/crun/releases/tag/1.23).

### Impact on existing deployments

**Important:** Some consumers may be affected if they assume the older linear conversion formula.
Applications or monitoring tools that directly calculate expected CPU weight values based on the
previous formula may need updates to account for the new quadratic conversion.
This is particularly relevant for:

- Custom resource management tools that predict CPU weight values.
- Monitoring systems that validate or expect specific weight values.
- Applications that programmatically set or verify CPU weight values.

The Kubernetes project recommends testing the new conversion formula in non-production
environments before upgrading OCI runtimes to ensure compatibility with existing tooling.

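For instance, a tool that validates observed `cpu.weight` values could accept
either formula's result while a fleet runs mixed runtime versions. This is an
illustrative sketch; the helper names are hypothetical, not an existing API:

```go
package main

import (
	"fmt"
	"math"
)

// linearWeight is the previous KEP-2254 linear conversion.
func linearWeight(shares uint64) uint64 {
	return 1 + ((shares - 2) * 9999 / 262142)
}

// quadraticWeight is the new quadratic conversion.
func quadraticWeight(shares uint64) uint64 {
	l := math.Log2(float64(shares))
	return uint64(math.Ceil(math.Pow(10, l*l/612+125*l/612-7.0/34)))
}

// weightMatchesShares reports whether an observed cpu.weight is consistent
// with the given cgroup v1 shares under either conversion formula, so tooling
// keeps working across the transition.
func weightMatchesShares(observed, shares uint64) bool {
	return observed == linearWeight(shares) || observed == quadraticWeight(shares)
}

func main() {
	fmt.Println(weightMatchesShares(39, 1024))  // previous formula -> true
	fmt.Println(weightMatchesShares(500, 1024)) // neither formula  -> false
}
```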
## Where can I learn more?

For those interested in this enhancement:

- [Kubernetes GitHub Issue #131216](https://github.com/kubernetes/kubernetes/issues/131216) - Detailed technical
analysis and examples, including discussions and reasoning for choosing the
above formula.
- [KEP-2254: cgroup v2](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2254-cgroup-v2) -
Original cgroup v2 implementation in Kubernetes.
- [Kubernetes cgroup documentation](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) -
Current resource management guidance.

## How do I get involved?

For those interested in getting involved with Kubernetes node-level
features, join the [Kubernetes Node Special Interest Group](https://github.com/kubernetes/community/tree/master/sig-node).
We always welcome new contributors and diverse perspectives on resource management
challenges.