Add cpu-management-policies.md #4950
Conversation
We should probably link to the main QOS docs early, since much of the text assumes some familiarity with how that works.
Regarding the title, I have seen workload placement used more frequently to describe things like node affinity, taints, topology keys, etc. (mostly cluster-wide scheduling decisions.) Am I off-base, or is there a more descriptive name for this doc?
### None Policy

The 'none' policy maintains the historical CPU affinity scheme, providing no
s/historical/legacy
or "pre-v1.8"?
replaced with "existing default" and reworded a little
away from the user. This is by design. However, some workloads require
stronger guarantees in terms of latency and/or performance in order to operate
acceptably. The kubelet provides methods for enabling more complex workload
placements policies while keeping the abstraction free from explicit placement
s/placements/placement
scheduling time. Many workloads are not sensitive to this migration and thus
work fine without any intervention.

However, in workloads where CPU cache affinity significantly affects workload
In addition to cache affinity, static allocation can reduce process scheduler latency. This can be a dominant contributor to tail latency for certain applications (easy example: packet forwarding.) Consider combining this with the CFS blurb above in a mini-section about problems the CPU manager addresses?
I just added scheduler latency inline here
The `static` policy allows guaranteed pods with integer CPU limits to be
assigned to exclusive CPUs on the node. This exclusivity is enforced through
the use of the cpuset cgroup controller.
Link
### Static Policy

The `static` policy allows guaranteed pods with integer CPU limits to be
s/guaranteed pods/containers in guaranteed pods
`--system-reserved` options. This shared pool is the set of CPUs on which any
BestEffort and Burstable pods run.

As guaranteed pods that fit the requirements for being statically assigned are
s/pods/containers
Also, think we should inline an example pod spec or two for illustration.
For this, we could link to the examples/cpu-manager directory (assuming that makes it in).
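A minimal sketch of such an example, with a hypothetical pod name and image: a container in a Guaranteed pod whose integer CPU limit makes it eligible for exclusive CPUs under the `static` policy.

```yaml
# Hypothetical illustration: a Guaranteed pod (requests == limits) with an
# integer CPU limit, eligible for exclusive CPU assignment under the static
# policy.
apiVersion: v1
kind: Pod
metadata:
  name: exclusive-cpu-pod               # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example/app:latest  # hypothetical image
    resources:
      requests:
        memory: "200Mi"
        cpu: "2"
      limits:
        memory: "200Mi"
        cpu: "2"
```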
In the event that the shared pool is depleted the kubelet takes two actions:

The first is evicting all pods that do not have a CPU request as those pods now
s/evicting/to evict
The first is evicting all pods that do not have a CPU request as those pods now
have no CPUs on which to run. Burstable pods with a CPU requests are still
allowed to run as the presence of such a pod will keep the shared pool from
depleting the first place. The scheduler would not assign a pod that depletes
s/the first place/in the first place
have no CPUs on which to run. Burstable pods with a CPU requests are still
allowed to run as the presence of such a pod will keep the shared pool from
depleting the first place. The scheduler would not assign a pod that depletes
the shared pool due to insufficient CPU if such a burstable pod is already
This sentence confused me a little.
Maybe something like "As long as a node has a burstable pod with a CPU request, no pod that depletes the shared pool can fit there." However, not sure how much value there is in outlining this edge case in this document -- especially since as described, this is an example of the resource math simply working out.
I'll just remove these two sentences and compress the two actions into a bullet list
The second action is setting a `NodeCPUPressure` node condition to `true` in
the node status. When this condition is true, the scheduler will not assign any
pod to the node that lacks a CPU request.
s/that lacks a CPU request/that has any container without a CPU request
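For illustration only, a sketch of how the condition proposed here might surface in a node status; `NodeCPUPressure` is as described in this draft, and the `reason` and `message` values below are invented:

```yaml
# Illustrative fragment of a node status with the NodeCPUPressure condition
# this draft proposes; reason/message strings are hypothetical.
status:
  conditions:
  - type: NodeCPUPressure
    status: "True"
    reason: SharedCPUPoolDepleted   # hypothetical
    message: exclusive assignments have depleted the shared CPU pool
```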
The 'none' policy maintains the historical CPU affinity scheme, providing no
affinity beyond what the OS scheduler does automatically. Limits on CPU usage
for [Guaranteed pods](https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod)
nit: maybe we want to use relative URL for links, e.g. /docs/tasks/...
Force-pushed from 7bde4fb to 64055c5
@ConnorDoyle also renamed the doc to "cpu management policies" per our previous discussion on the release notes
Quick-link: rendered page
both part of a Guaranteed pod and have integer CPU requests are assigned
exclusive CPUs.

**Note:** When reserving CPU with `--kube-reserved` or `--system-reserved` options, it is advised to use *integer* CPU quantities. Otherwise, the scheduling accounting might not reconcile properly.
nit: this makes it sound like the node will break somehow. The edge case here is about stranded resources right? Otherwise there will be a CPU in the shared pool that cannot be taken exclusively because no pod that consumes it entirely can be admitted. Containers in the shared pool should continue to work just fine.
sure, i'll remove this sentence
Nice 👍
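To make the integer-reservation advice concrete, a sketch using the kubelet configuration file, a later file-based equivalent of the `--kube-reserved` and `--system-reserved` flags; the quantities are illustrative:

```yaml
# Sketch: reserving whole CPUs keeps the shared-pool boundary on a core
# boundary. Values are illustrative.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
kubeReserved:
  cpu: "1"    # integer quantity, per the note above
systemReserved:
  cpu: "1"
```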
workload.

In the event that the shared pool is depleted the kubelet takes two actions:
- Evict all pods that include a container that does not have a CPU request as
nit: looks like this didn't render as a `<ul>` in the rendered page
Consider the containers in the following pod specs:

```
Adding `` ```yaml `` may trigger syntax highlighting
looks good to me, but wondering how we can add a dependency from this PR to #49186 in another repo (kubernetes/kubernetes#49186)?
Force-pushed from 64055c5 to f15aa14
This content looks great! 👍 I left comments inline.
Some general notes:
- Eliminate the passive voice: https://kubernetes.io/docs/home/contribute/style-guide/#use-active-voice
- Keep sentences and paragraphs on a single line (soft wrap only)
- No double spaces after periods
* TOC
{:toc}

Kubernetes keeps many aspects of how the pod executes on the node abstracted
"pods execute on nodes"
Please remove hard line breaks. From a code perspective, this entire paragraph should be on one line.
Kubernetes keeps many aspects of how the pod executes on the node abstracted
away from the user. This is by design. However, some workloads require
stronger guarantees in terms of latency and/or performance in order to operate
acceptably. The kubelet provides methods for enabling more complex workload
No double spaces after periods, please.
work fine without any intervention.

However, in workloads where CPU cache affinity and scheduling latency
significantly affects workload performance, the kubelet allows alternative CPU
...the kubelet allows alternative CPU management policies to determine some placement preferences on the node.
## CPU Management Policies

By default, the kubelet uses [CFS quota](https://en.wikipedia.org/wiki/Completely_Fair_Scheduler)
to enforce pod CPU limits. When the node is running many CPU bound pods, this
When a node runs many CPU bound pods, the workload can be moved to different CPU cores depending on whether a pod is throttled and which CPU cores are available at scheduling time.
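A rough sketch of the mechanism being reworded here, assuming the default 100ms CFS period: the CPU limit becomes a quota per period, and once a container exhausts its quota its threads are throttled until the next period, after which they may resume on different cores.

```yaml
# Sketch of how a CPU limit maps onto CFS quota (default 100ms period assumed):
#   cpu: "2"  ->  cpu.cfs_quota_us  = 200000   # 200ms of CPU time...
#                 cpu.cfs_period_us = 100000   # ...per 100ms period
# When the quota is exhausted, the container's threads are throttled until the
# next period begins, and may then be rescheduled onto different cores.
resources:
  limits:
    cpu: "2"
```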
management policies that allow users to express their desire for certain
placement preferences on the node.

These management policies are enabled with the `--cpu-manager-policy` kubelet
Enable these management policies with...
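For concreteness, a sketch of turning the policy on. The doc describes the `--cpu-manager-policy` kubelet flag; the `cpuManagerPolicy` field below is the kubelet configuration file equivalent introduced later:

```yaml
# Sketch: selecting the CPU manager policy via the kubelet config file,
# equivalent to passing --cpu-manager-policy on the command line.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static   # "none" (the default) or "static"
```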
requests:
  memory: "100Mi"
  cpu: "1"
```yaml
Remove yaml
resources:
  limits:
    memory: "200Mi"
    cpu: "2"
I'm curious to see how these red blocks render in an otherwise functional code block. Please @mention me after removing `yaml` and rebuilding the preview.
this is the dreaded TAB character
cpu: "1" | ||
```yaml | ||
|
||
**Result** Burstable QoS because resource requests != limits. Non-zero CPU request |
Turn this paragraph from sentence fragments into a complete paragraph.
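Written out as a complete paragraph with a full spec (pod name and image are hypothetical), the Burstable case might read: this pod is assigned Burstable QoS because its requests and limits differ, and its non-zero CPU request means it continues to run in the shared pool.

```yaml
# Hypothetical Burstable pod: requests != limits, so under the static policy
# the container keeps running on the shared CPU pool.
apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod                   # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example/app:latest  # hypothetical image
    resources:
      requests:
        memory: "100Mi"
        cpu: "1"
      limits:
        memory: "200Mi"
        cpu: "2"
```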
cpu: "2" | ||
```yaml | ||
|
||
**Result** Guaranteed QoS because only limits are specified and requests are set to |
Turn this paragraph from sentence fragments into a complete paragraph.
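Likewise for the limits-only case (hypothetical names again): when only limits are specified, the API server defaults the requests to equal the limits, so the pod is Guaranteed, and the integer CPU limit makes it eligible for exclusive CPUs under the `static` policy.

```yaml
# Hypothetical Guaranteed pod: only limits are set, so requests default to the
# same values; the integer CPU limit qualifies it for exclusive CPUs.
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod                  # hypothetical name
spec:
  containers:
  - name: app
    image: registry.example/app:latest  # hypothetical image
    resources:
      limits:
        memory: "200Mi"
        cpu: "2"
```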
limits:
  memory: "200Mi"
  cpu: "2"
```yaml
Remove yaml
@ConnorDoyle Will you please comment here when kubernetes/kubernetes#49186 merges? 🙇
Force-pushed from 230d024 to 5de8423
Force-pushed from 5de8423 to a0f1883
@zacharysarah thanks for the thorough review 👍 I think I addressed everything you mentioned plus some. Question about the soft wrap. Hard wrapping makes reviewing easier. I notice that many other docs do have hard wrapping. Is this a new guideline? If you want it that way, I'll do it. Just double checking. Not double spacing though.
It does; but for text, it can also cause odd breaks in page rendering.
They do, and sometimes it causes odd breaks in page rendering. 😉
Right now it's just a strong personal preference. I'd like for the docs community to discuss it more before we make it policy. So:
Thanks. I think we're golden as is. 👍 Like you say, it's a widespread practice and would represent a large change for docs PRs. If we decide to introduce new guidelines, we'll make sure they're widely publicized.
😆 💯
Approved with edits. 👍
@zacharysarah this feature is merged and will be included in v1.8
Documentation for the CPU manager feature in 1.8:
https://github.com/kubernetes/community/blob/master/contributors/design-proposals/cpu-manager.md
Feature issue:
kubernetes/enhancements#375
Code PR:
kubernetes/kubernetes#49186
@derekwaynecarr @ConnorDoyle