Add cpu-management-policies.md #4950
Conversation
We should probably link to the main QOS docs early, since much of the text assumes some familiarity with how that works.
Regarding the title, I have seen workload placement used more frequently to describe things like node affinity, taints, topology keys, etc. (mostly cluster-wide scheduling decisions.) Am I off-base, or is there a more descriptive name for this doc?
### None Policy

The 'none' policy maintains the historical CPU affinity scheme, providing no
s/historical/legacy
or "pre-v1.8"?
replaced with "existing default" and reworded a little
away from the user. This is by design. However, some workloads require
stronger guarantees in terms of latency and/or performance in order to operate
acceptably. The kubelet provides methods for enabling more complex workload
placements policies while keeping the abstraction free from explicit placement
s/placements/placement
scheduling time. Many workloads are not sensitive to this migration and thus
work fine without any intervention.

However, in workloads where CPU cache affinity significantly affects workload
In addition to cache affinity, static allocation can reduce process scheduler latency. This can be a dominant contributor to tail latency for certain applications (easy example: packet forwarding.) Consider combining this with the CFS blurb above in a mini-section about problems the CPU manager addresses?
I just added scheduler latency inline here
The `static` policy allows guaranteed pods with integer CPU limits to be
assigned to exclusive CPUs on the node. This exclusivity is enforced through
the use of the cpuset cgroup controller.
Link
### Static Policy

The `static` policy allows guaranteed pods with integer CPU limits to be
s/guaranteed pods/containers in guaranteed pods
`--system-reserved` options. This shared pool is the set of CPUs on which any
BestEffort and Burstable pods run.

As guaranteed pods that fit the requirements for being statically assigned are
s/pods/containers
Also, think we should inline an example pod spec or two for illustration.
For this, we could link to the examples/cpu-manager directory (assuming that makes it in).
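A minimal sketch of such an example, with a hypothetical pod name and image: a container in a Guaranteed pod whose integer CPU limit makes it eligible for exclusive CPUs under the `static` policy.

```yaml
# Hypothetical illustration: a Guaranteed pod (requests == limits) with an
# integer CPU limit, eligible for exclusive CPU assignment under the static
# policy.
apiVersion: v1
kind: Pod
metadata:
  name: exclusive-cpu-pod               # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example/app:latest  # hypothetical image
    resources:
      requests:
        memory: "200Mi"
        cpu: "2"
      limits:
        memory: "200Mi"
        cpu: "2"
```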
In the event that the shared pool is depleted the kubelet takes two actions:

The first is evicting all pods that do not have a CPU request as those pods now
s/evicting/to evict
The first is evicting all pods that do not have a CPU request as those pods now
have no CPUs on which to run. Burstable pods with a CPU requests are still
allowed to run as the presence of such a pod will keep the shared pool from
depleting the first place. The scheduler would not assign a pod that depletes
s/the first place/in the first place
have no CPUs on which to run. Burstable pods with a CPU requests are still
allowed to run as the presence of such a pod will keep the shared pool from
depleting the first place. The scheduler would not assign a pod that depletes
the shared pool due to insufficient CPU if such a burstable pod is already
This sentence confused me a little.
Maybe something like "As long as a node has a burstable pod with a CPU request, no pod that depletes the shared pool can fit there." However, not sure how much value there is in outlining this edge case in this document -- especially since as described, this is an example of the resource math simply working out.
I'll just remove these two sentences and compress the two actions into a bullet list
The second action is setting a `NodeCPUPressure` node condition to `true` in
the node status. When this condition is true, the scheduler will not assign any
pod to the node that lacks a CPU request.
s/that lacks a CPU request/that has any container without a CPU request
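For illustration only, a sketch of how the condition proposed here might surface in a node status; `NodeCPUPressure` is as described in this draft, and the `reason` and `message` values below are invented:

```yaml
# Illustrative fragment of a node status with the NodeCPUPressure condition
# this draft proposes; reason/message strings are hypothetical.
status:
  conditions:
  - type: NodeCPUPressure
    status: "True"
    reason: SharedCPUPoolDepleted   # hypothetical
    message: exclusive assignments have depleted the shared CPU pool
```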
The 'none' policy maintains the historical CPU affinity scheme, providing no
affinity beyond what the OS scheduler does automatically. Limits on CPU usage
for [Guaranteed pods](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod)
nit: maybe we want to use relative URL for links, e.g. /docs/tasks/...
Force-pushed from 7bde4fb to 64055c5
@ConnorDoyle also renamed the doc to "cpu management policies" per our previous discussion on the release notes
Quick-link: rendered page
both part of a Guaranteed pod and have integer CPU requests are assigned
exclusive CPUs.

**Note:** When reserving CPU with `--kube-reserved` or `--system-reserved` options, it is advised to use *integer* CPU quantities. Otherwise, the scheduling accounting might not reconcile properly.
nit: this makes it sound like the node will break somehow. The edge case here is about stranded resources right? Otherwise there will be a CPU in the shared pool that cannot be taken exclusively because no pod that consumes it entirely can be admitted. Containers in the shared pool should continue to work just fine.
sure, i'll remove this sentence
Nice 👍
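To make the integer-reservation advice concrete, a sketch using the kubelet configuration file, a later file-based equivalent of the `--kube-reserved` and `--system-reserved` flags; the quantities are illustrative:

```yaml
# Sketch: reserving whole CPUs keeps the shared-pool boundary on a core
# boundary. Values are illustrative.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
kubeReserved:
  cpu: "1"    # integer quantity, per the note above
systemReserved:
  cpu: "1"
```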
workload.

In the event that the shared pool is depleted the kubelet takes two actions:
- Evict all pods that include a container that does not have a CPU request as
nit: looks like this didn't render as a `<ul>` in the rendered page
Consider the containers in the following pod specs:

```
Adding `` ```yaml `` may trigger syntax highlighting
looks good to me, but wondering how we can add a dependency from this PR to #49186 in another repo (kubernetes/kubernetes#49186)?
Force-pushed from 64055c5 to f15aa14
This content looks great! 👍 I left comments inline.
Some general notes:
- Eliminate the passive voice: https://kubernetes.io/docs/home/contribute/style-guide/#use-active-voice
- Keep sentences and paragraphs on a single line (soft wrap only)
- No double spaces after periods
* TOC
{:toc}

Kubernetes keeps many aspects of how the pod executes on the node abstracted
"pods execute on nodes"
Please remove hard line breaks. From a code perspective, this entire paragraph should be on one line.
Kubernetes keeps many aspects of how the pod executes on the node abstracted
away from the user. This is by design. However, some workloads require
stronger guarantees in terms of latency and/or performance in order to operate
acceptably. The kubelet provides methods for enabling more complex workload
No double spaces after periods, please.
work fine without any intervention.

However, in workloads where CPU cache affinity and scheduling latency
significantly affects workload performance, the kubelet allows alternative CPU
...the kubelet allows alternative CPU management policies to determine some placement preferences on the node.
## CPU Management Policies

By default, the kubelet uses [CFS quota](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler)
to enforce pod CPU limits. When the node is running many CPU bound pods, this
When a node runs many CPU bound pods, the workload can be moved to different CPU cores depending on whether a pod is throttled and which CPU cores are available at scheduling time.
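A rough sketch of the mechanism being reworded here, assuming the default 100ms CFS period: the CPU limit becomes a quota per period, and once a container exhausts its quota its threads are throttled until the next period, after which they may resume on different cores.

```yaml
# Sketch of how a CPU limit maps onto CFS quota (default 100ms period assumed):
#   cpu: "2"  ->  cpu.cfs_quota_us  = 200000   # 200ms of CPU time...
#                 cpu.cfs_period_us = 100000   # ...per 100ms period
# When the quota is exhausted, the container's threads are throttled until the
# next period begins, and may then be rescheduled onto different cores.
resources:
  limits:
    cpu: "2"
```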
management policies that allow users to express their desire for certain
placement preferences on the node.

These management policies are enabled with the `--cpu-manager-policy` kubelet
Enable these management policies with...
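For concreteness, a sketch of turning the policy on. The doc describes the `--cpu-manager-policy` kubelet flag; the `cpuManagerPolicy` field below is the kubelet configuration file equivalent introduced later:

```yaml
# Sketch: selecting the CPU manager policy via the kubelet config file,
# equivalent to passing --cpu-manager-policy on the command line.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static   # "none" (the default) or "static"
```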
requests:
  memory: "100Mi"
  cpu: "1"
```yaml
Remove yaml
resources:
  limits:
    memory: "200Mi"
    cpu: "2"
I'm curious to see how these red blocks render in an otherwise functional code block. Please @mention me after removing `yaml` and rebuilding the preview.
this is the dreaded TAB character
cpu: "1" | ||
```yaml | ||
|
||
**Result** Burstable QoS because resource requests != limits. Non-zero CPU request |
Turn this paragraph from sentence fragments into a complete paragraph.
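Written out as a complete paragraph with a full spec (pod name and image are hypothetical), the Burstable case might read: this pod is assigned Burstable QoS because its requests and limits differ, and its non-zero CPU request means it continues to run in the shared pool.

```yaml
# Hypothetical Burstable pod: requests != limits, so under the static policy
# the container keeps running on the shared CPU pool.
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod                   # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example/app:latest  # hypothetical image
    resources:
      requests:
        memory: "100Mi"
        cpu: "1"
      limits:
        memory: "200Mi"
        cpu: "2"
```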
cpu: "2" | ||
```yaml | ||
|
||
**Result** Guaranteed QoS because only limits are specified and requests are set to |
Turn this paragraph from sentence fragments into a complete paragraph.
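Likewise for the limits-only case (hypothetical names again): when only limits are specified, the API server defaults the requests to equal the limits, so the pod is Guaranteed, and the integer CPU limit makes it eligible for exclusive CPUs under the `static` policy.

```yaml
# Hypothetical Guaranteed pod: only limits are set, so requests default to the
# same values; the integer CPU limit qualifies it for exclusive CPUs.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod                  # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example/app:latest  # hypothetical image
    resources:
      limits:
        memory: "200Mi"
        cpu: "2"
```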
limits:
  memory: "200Mi"
  cpu: "2"
```yaml
Remove yaml
@ConnorDoyle Will you please comment here when kubernetes/kubernetes#49186 merges? 🙇
Force-pushed from 230d024 to 5de8423
Force-pushed from 5de8423 to a0f1883
@zacharysarah thanks for the thorough review 👍 I think I addressed everything you mentioned plus some. Question about the soft wrap. Hard wrapping makes reviewing easier. I notice that many other docs do have hard wrapping. Is this a new guideline? If you want it that way, I'll do it. Just double checking. Not double spacing though.
It does; but for text, it can also cause odd breaks in page rendering.
They do, and sometimes it causes odd breaks in page rendering. 😉
Right now it's just a strong personal preference. I'd like for the docs community to discuss it more before we make it policy. So:
Thanks. I think we're golden as is. 👍 Like you say, it's a widespread practice and would represent a large change for docs PRs. If we decide to introduce new guidelines, we'll make sure they're widely publicized.
😆 💯
Approved with edits. 👍
@zacharysarah this feature is merged and will be included in v1.8
Documentation for the CPU manager feature in 1.8:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/cpu-manager.md
Feature issue:
kubernetes/enhancements#375
Code PR:
kubernetes/kubernetes#49186
@derekwaynecarr @ConnorDoyle