Add qos doc to site,readme and introduction
kaiyuechen committed Oct 14, 2022
1 parent dc22027 commit 8c40a3a
Showing 35 changed files with 1,052 additions and 831 deletions.
3 changes: 2 additions & 1 deletion README.md
@@ -46,8 +46,9 @@ EffectiveHorizontalPodAutoscaler supports prediction-driven autoscaling. With th

Provide a simple but efficient scheduler that schedules pods based on actual node utilization data and filters out nodes with high load to balance the cluster. [learn more](docs/tutorials/scheduling-pods-based-on-actual-node-load.md).

**Colocation with Enhanced QoS**
**Colocation with Enhanced QOS**

QOS-related capabilities keep Pods on Kubernetes running stably. They provide interference detection and active avoidance across multi-dimensional metrics, with support for precise operation and custom metric access; elastic resource overselling enhanced by the prediction algorithm, which reuses and limits idle resources in the cluster; and enhanced bypass cpuset management, which improves resource utilization while binding cores. [learn more](docs/tutorials/using-qos-ensurance.md).

## Architecture

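3 changes: 2 additions & 1 deletion README_zh.md
@@ -46,8 +46,9 @@ EffectiveHorizontalPodAutoscaler supports prediction-driven elasticity. It is based on

The dynamic scheduler builds a simple but efficient model from actual node utilization and filters out heavily loaded nodes to balance the cluster. [Learn more](docs/tutorials/scheduling-pods-based-on-actual-node-load.zh.md)

**Colocation Based on QoS**
**Colocation Based on QOS**

QOS-related capabilities keep Pods running on Kubernetes stable. They provide interference detection and active avoidance across multi-dimensional metrics, with support for precise operation and custom metric access; elastic resource overselling enhanced by prediction algorithms, which reuses and limits idle resources in the cluster; and enhanced bypass cpuset management, which improves resource utilization while binding cores. [Learn more](docs/tutorials/using-qos-ensurance.zh.md)

## Architecture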

@@ -29,19 +29,19 @@

<!-- /TOC -->
## Motivation
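Currently, in crane-agent, once the waterline specified in NodeQOSEnsurancePolicy is exceeded, actions such as evict and throttle first sort the low-priority pods. The sort is currently based on each pod's PriorityClass, and the throttle or evict action is then applied to the sorted pods;
Currently, in crane-agent, once the waterline specified in NodeQOS is exceeded, actions such as evict and throttle first sort the low-priority pods. The sort is currently based on each pod's PriorityClass, and the throttle or evict action is then applied to the sorted pods;

The existing problems are:

1. Sorting considers only PriorityClass, so it cannot sort by other characteristics. It also cannot provide the flexible ordering that waterline-based precise operation needs, nor bring the node back to the specified waterline as quickly as possible. For example, to reduce the CPU usage of low-priority workloads quickly, the pods with the highest CPU usage should be selected first; this lowers CPU usage faster and protects high-priority workloads.

2. Once the waterline specified in NodeQOSEnsurancePolicy is triggered, every pod on the node below the specified PriorityClass is acted on. For example, if 10 pods on the node are below the specified PriorityClass, all 10 are acted on when the waterline is triggered, even though usage may already fall below the metric value in NodeQOSEnsurancePolicy after the first pod is handled. Acting on the remaining pods is excessive and avoidable. It is better to treat the metric value in NodeQOSEnsurancePolicy as a waterline and act on pods precisely, stopping as soon as usage is just below it, which avoids excessive impact on low-priority services.
2. Once the waterline specified in NodeQOS is triggered, every pod on the node below the specified PriorityClass is acted on. For example, if 10 pods on the node are below the specified PriorityClass, all 10 are acted on when the waterline is triggered, even though usage may already fall below the metric value in NodeQOS after the first pod is handled. Acting on the remaining pods is excessive and avoidable. It is better to treat the metric value in NodeQOS as a waterline and act on pods precisely, stopping as soon as usage is just below it, which avoids excessive impact on low-priority services.

### Goals

- Enrich crane-agent's sorting strategies: sorting primarily by pod CPU usage, by pod memory usage, by running time, and by extended-resource utilization.
- Implement a framework that covers sorting and precise operation, supports richer sorting rules for different metrics, and performs precise operation.
- Implement precise operation for cpu usage and memory usage: when node load exceeds the waterline specified in NodeQOSEnsurancePolicy, low-priority pods are sorted first and then acted on in order until usage is just below the waterline.
- Implement precise operation for cpu usage and memory usage: when node load exceeds the waterline specified in NodeQOS, low-priority pods are sorted first and then acted on in order until usage is just below the waterline.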

## Proposal

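@@ -89,7 +89,7 @@

### Definition of metric attributes

To better sort and precisely control based on the metrics configured in NodeQOSEnsurancePolicy, the concept of attributes is introduced for metrics.
To better sort and precisely control based on the metrics configured in NodeQOS, the concept of attributes is introduced for metrics.

A metric has the following attributes: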

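@@ -129,7 +129,7 @@ type metric struct {

### How to control precisely according to the waterline

- Build multiple waterlines from multiple NodeQOSEnsurancePolicy objects and their objectiveEnsurances:
- Build multiple waterlines from multiple NodeQOS objects and their objectiveEnsurances:
1. Classify by the action of each objectiveEnsurances entry. crane-agent currently has three actions that protect node QoS: Evict, ThrottleDown (suppress pod usage when current usage is above the value in objectiveEnsurances) and ThrottleUp (relax and restore pod usage when current usage is below the value in objectiveEnsurances). This yields three waterline sets:
ThrottleDownWaterLine, ThrottleUpWaterLine and EvictWaterLine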

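@@ -170,15 +170,15 @@ type metric struct {
In the executor stage, sort according to the metrics involved in the waterlines, fetch the latest usage, construct GapToWaterLines, and perform precise operation

#### analyzer stage
This stage converts NodeQOSEnsurancePolicy into WaterLines and merges rules that share the same actionName and metricrule, as described above
This stage converts NodeQOS into WaterLines and merges rules that share the same actionName and metricrule, as described above

#### executor stage
Throttle process:

1. First analyze the metrics involved in ThrottoleDownGapToWaterLines and split them into two groups by their Quantified attribute. If any metric is not Quantified, use GetHighestPriorityThrottleAbleMetric to pick the throttleAble metric (one with a throttleFunc) that has the highest ActionPriority and throttle all selected pods with it, because precise operation is impossible as long as even one metric is not Quantified

2. Use getStateFunc() to fetch the latest usage of the node and workloads, and construct GapToWaterLine from ThrottoleDownGapToWaterLines and the real-time usage (note that GapToWaterLine is built by iterating over the registered metrics, so the metrics in the resulting GapToWaterLine are those in ThrottoleDownGapToWaterLines
that have been registered, which guards against metrics in NodeQOSEnsurancePolicy that are misconfigured, nonexistent, or unregistered)
that have been registered, which guards against metrics in NodeQOS that are misconfigured, nonexistent, or unregistered)

3. If the real-time usage of any metric in GapToWaterLine cannot be obtained (HasUsageMissedMetric), use GetHighestPriorityThrottleAbleMetric to pick the throttleAble metric (one with a throttleFunc) that has the highest ActionPriority and throttle all selected pods with it, because without real-time usage the gap to the waterline is unknown and precise operation is impossible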
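
The decision made in the steps above can be sketched in Go. This is only an illustrative sketch, not crane-agent's actual code: `metricAttr`, `podRelease`, `planThrottle`, `highestPriorityThrottleable` and `allClosed` are hypothetical names, and it assumes a gap is "usage minus waterline" and that each pod releases a known amount of each metric when throttled.

```go
package main

import "fmt"

// metricAttr is a hypothetical subset of the metric attributes described above.
type metricAttr struct {
	Name           string
	Quantified     bool // throttling releases a measurable amount
	Throttleable   bool // stands in for "has a throttleFunc"
	ActionPriority int
}

// podRelease is a hypothetical record of how much of each metric a pod frees when throttled.
type podRelease struct {
	Name    string
	Release map[string]float64
}

// highestPriorityThrottleable plays the role described for GetHighestPriorityThrottleAbleMetric.
func highestPriorityThrottleable(ms []metricAttr) (string, bool) {
	best, found := metricAttr{}, false
	for _, m := range ms {
		if m.Throttleable && (!found || m.ActionPriority > best.ActionPriority) {
			best, found = m, true
		}
	}
	return best.Name, found
}

// allClosed reports whether every gap is at or below zero.
func allClosed(gaps map[string]float64) bool {
	for _, v := range gaps {
		if v > 0 {
			return false
		}
	}
	return true
}

// planThrottle returns the pods to throttle, the fallback metric (if any), and
// whether the plan is precise. gaps maps metric name to usage minus waterline;
// a metric missing from gaps is treated as having no usage data.
func planThrottle(ms []metricAttr, gaps map[string]float64, pods []podRelease) ([]string, string, bool) {
	precise := true
	for _, m := range ms {
		if _, ok := gaps[m.Name]; !m.Quantified || !ok {
			precise = false
		}
	}
	var targets []string
	if !precise {
		// Fallback: throttle every selected pod using one metric.
		name, ok := highestPriorityThrottleable(ms)
		if ok {
			for _, p := range pods {
				targets = append(targets, p.Name)
			}
		}
		return targets, name, false
	}
	// Precise path: walk the sorted pods until every gap is closed.
	remaining := map[string]float64{}
	for k, v := range gaps {
		remaining[k] = v
	}
	for _, p := range pods {
		if allClosed(remaining) {
			break
		}
		targets = append(targets, p.Name)
		for metric, freed := range p.Release {
			remaining[metric] -= freed
		}
	}
	return targets, "", true
}

func main() {
	pods := []podRelease{
		{"a", map[string]float64{"cpu": 1}},
		{"b", map[string]float64{"cpu": 1}},
		{"c", map[string]float64{"cpu": 1}},
	}
	ms := []metricAttr{{"cpu", true, true, 5}}
	fmt.Println(planThrottle(ms, map[string]float64{"cpu": 1.5}, pods))
}
```

With a CPU gap of 1.5 and each pod releasing 1, only pods `a` and `b` are throttled; adding a non-Quantified metric makes the sketch fall back to throttling all three.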

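@@ -1,7 +1,7 @@
## Custom-metric interference detection and avoidance, and custom sorting
Custom-metric interference detection and avoidance and custom sorting follow the same flow described in the section on executing avoidance actions precisely. This section describes how to define your own metric so that it takes part in the interference detection and avoidance flow

To better sort and precisely control based on the metrics configured in NodeQOSEnsurancePolicy, the concept of attributes is introduced for metrics.
To better sort and precisely control based on the metrics configured in NodeQOS, the concept of attributes is introduced for metrics.

A metric has the following attributes; a custom metric only needs to implement these fields: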

@@ -31,7 +31,7 @@ specific Priority, and specific Namespace of pods, above selectors are associate

### Disable Scheduling

The following AvoidanceAction and NodeQOSEnsurancePolicy can be defined. As a result, when the node CPU usage triggers the threshold, disable schedule action for the node will be executed.
The following AvoidanceAction and NodeQOS can be defined. As a result, when the node CPU usage triggers the threshold, disable schedule action for the node will be executed.

The sample YAML looks like below:

@@ -83,9 +83,6 @@ metadata:
spec:
  allowedActions:
    - disablescheduling
  resourceQOS:
    cpuQOS:
      cpuPriority: 7
  labelSelector:
    matchLabels:
      preemptible_job: "true"
@@ -100,7 +97,7 @@ Please check the video to learn more about the scheduling disable actions.
### Throttle
The following AvoidanceAction and NodeQOSEnsurancePolicy can be defined. As a result, when the node CPU usage triggers the threshold, throttle action for the node will be executed.
The following AvoidanceAction and NodeQOS can be defined. As a result, when the node CPU usage triggers the threshold, throttle action for the node will be executed.
The sample YAML looks like below:
@@ -102,7 +102,7 @@ spec:

### Throttle

Define `AvoidanceAction` and `NodeQOSEnsurancePolicy`.
Define `AvoidanceAction`, `NodeQOS` and `PodQOS`.

When node CPU usage triggers the avoidance threshold, the node's `Throttle Action` is executed.

3 changes: 2 additions & 1 deletion site/content/en/docs/Getting started/introduction.md
@@ -39,8 +39,9 @@ EffectiveHorizontalPodAutoscaler supports prediction-driven autoscaling. With th

Provide a simple but efficient scheduler that schedules pods based on actual node utilization data and filters out nodes with high load to balance the cluster. [learn more](/docs/tutorials/scheduling-pods-based-on-actual-node-load).

**Colocation with Enhanced QoS**
**Colocation with Enhanced QOS**

QOS-related capabilities keep Pods on Kubernetes running stably. They provide interference detection and active avoidance across multi-dimensional metrics, with support for precise operation and custom metric access; elastic resource overselling enhanced by the prediction algorithm, which reuses and limits idle resources in the cluster; and enhanced bypass cpuset management, which improves resource utilization while binding cores. [learn more](/docs/tutorials/using-qos-ensurance).

## Architecture

@@ -19,7 +19,7 @@ general-purpose policy engine.

## Motivation
The criteria for abnormality or interference are not always as simple as a metric value exceeding a threshold.
Different users may have different QoS requirements on different applications in different environments. The rule of
Different users may have different QOS requirements on different applications in different environments. The rule of
abnormality detection varies, and it is impossible to implement all of them in code in advance.

### Goals
@@ -38,12 +38,12 @@ out of scope.

#### Story 1
A user has a critical online application which is latency sensitive running in the cluster, and he wants to use both
the 99th percentile response time and the error code rate as the application QoS indicators. If either of these 2 indicators
the 99th percentile response time and the error code rate as the application QOS indicators. If either of these 2 indicators
deteriorates, the application is considered to be in an abnormal state.


#### Story 2
The SRE team finds that if the node CPU utilization is more than 60%, the QoS of some latency sensitive applications
The SRE team finds that if the node CPU utilization is more than 60%, the QOS of some latency sensitive applications
running on it is likely to decline. So they want to keep the node CPU utilization lower than 60%.
If the utilization is higher than this threshold, the BE applications should be suppressed
accordingly.
@@ -55,7 +55,7 @@ The existing problems are:

- The proposal implements some general sorting methods (which will be improved later):

classAndPriority: Compare the Qos class and class value of two pods, Qos class first and then class value. Pods with a higher priority are ranked later
classAndPriority: Compare the QOS class and class value of two pods, QOS class first and then class value. Pods with a higher priority are ranked later

runningTime: Compare the running time of two pods. The pod that has been running longer is ranked later and has a higher priority
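
A minimal Go sketch of how these two comparisons could be chained is shown here. It is an illustration under assumptions, not the proposal's implementation: `podInfo`, `QOSRank`, `ClassValue`, `sortForAvoidance` and `names` are hypothetical names, the QOS class is reduced to an integer rank, and running time is used only as the final tie-breaker.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// podInfo is a hypothetical, simplified pod record used only for this sketch.
type podInfo struct {
	Name       string
	QOSRank    int       // assumed ranking: 0 = BestEffort, 1 = Burstable, 2 = Guaranteed
	ClassValue int32     // assumed to be the pod's priority class value
	StartTime  time.Time // used to derive running time
}

// sortForAvoidance orders pods so that index 0 is acted on first:
// lower QOS class, then lower class value, then shorter running time.
func sortForAvoidance(pods []podInfo) {
	sort.SliceStable(pods, func(i, j int) bool {
		a, b := pods[i], pods[j]
		if a.QOSRank != b.QOSRank {
			return a.QOSRank < b.QOSRank
		}
		if a.ClassValue != b.ClassValue {
			return a.ClassValue < b.ClassValue
		}
		// A later start time means a shorter running time, so it ranks earlier.
		return a.StartTime.After(b.StartTime)
	})
}

// names returns pod names in their current order.
func names(pods []podInfo) []string {
	out := make([]string, 0, len(pods))
	for _, p := range pods {
		out = append(out, p.Name)
	}
	return out
}

func main() {
	base := time.Unix(1_600_000_000, 0)
	pods := []podInfo{
		{"guaranteed", 2, 100, base},
		{"be-old", 0, 10, base},
		{"be-new", 0, 10, base.Add(time.Hour)},
		{"burstable", 1, 50, base},
	}
	sortForAvoidance(pods)
	fmt.Println(names(pods))
}
```

In this sketch the two best-effort pods come first, with the more recently started one ahead of the older one, and the guaranteed pod comes last.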

@@ -136,7 +136,7 @@ You can define your own metric. After the construction is completed, you can reg
### How to control accurately according to the water level

- Build multiple waterlines from multiple NodeQOSEnsurancePolicy objects and their objectiveEnsurances:
1. Classify by the action of each objectiveEnsurances entry. The crane agent currently has three operations to guarantee node QoS: Evict, ThrottleDown (suppress pod usage when the current usage is higher than the value in objectiveEnsurances) and ThrottleUp (relax and recover pod usage when the current usage is lower than the value in objectiveEnsurances). Therefore, there are three waterline sets: ThrottleDownWaterLine, ThrottleUpWaterLine and EvictWaterLine
1. Classify by the action of each objectiveEnsurances entry. The crane agent currently has three operations to guarantee node QOS: Evict, ThrottleDown (suppress pod usage when the current usage is higher than the value in objectiveEnsurances) and ThrottleUp (relax and recover pod usage when the current usage is lower than the value in objectiveEnsurances). Therefore, there are three waterline sets: ThrottleDownWaterLine, ThrottleUpWaterLine and EvictWaterLine

2. Then classify the waterlines within the same operation category by their metric rules (metric A and metric Z are used for illustration in the figure), and record the value of each objectiveEnsurances waterline, referred to as waterline;
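
The two-level grouping described in these steps can be sketched in Go as follows. This is a hedged illustration only: `objective`, `waterLines` and `buildWaterLines` are hypothetical names, each objective is flattened to an action, a metric name and a value, and keeping the values sorted in ascending order is an assumption made for the example rather than something stated above.

```go
package main

import (
	"fmt"
	"sort"
)

// objective is a hypothetical, flattened view of one objectiveEnsurances entry.
type objective struct {
	Action string  // e.g. "Evict", "ThrottleDown", "ThrottleUp"
	Metric string  // metric rule name
	Value  float64 // waterline value
}

// waterLines maps action -> metric -> waterline values in ascending order.
type waterLines map[string]map[string][]float64

// buildWaterLines groups objectives by action first, then by metric.
func buildWaterLines(objs []objective) waterLines {
	wl := waterLines{}
	for _, o := range objs {
		if wl[o.Action] == nil {
			wl[o.Action] = map[string][]float64{}
		}
		wl[o.Action][o.Metric] = append(wl[o.Action][o.Metric], o.Value)
	}
	// Sort each metric's values in place so the lowest waterline comes first.
	for _, byMetric := range wl {
		for _, values := range byMetric {
			sort.Float64s(values)
		}
	}
	return wl
}

func main() {
	wl := buildWaterLines([]objective{
		{"ThrottleDown", "metricA", 80},
		{"ThrottleDown", "metricA", 60},
		{"ThrottleDown", "metricZ", 70},
		{"Evict", "metricZ", 90},
	})
	fmt.Println(wl["ThrottleDown"]["metricA"], wl["Evict"]["metricZ"])
}
```

Two policies that both set a ThrottleDown value for metric A end up in the same bucket, which is the point at which rules with the same action and metric can be merged.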

@@ -274,5 +274,5 @@ if len(MetricsNotEvcitQuantified) != 0 {

### User Stories

- Users can use crane agent for better QoS guarantees. It reduces node load faster so that high priority services are not affected, and it precisely controls the throttle/eviction of low priority services to avoid excessive operation.
- With the precise operation (throttle/eviction) framework, users can implement QoS features with precise operation and sorting for a user-defined metric simply by implementing that metric's attributes and methods, without dealing with the underlying details.
- Users can use crane agent for better QOS guarantees. It reduces node load faster so that high priority services are not affected, and it precisely controls the throttle/eviction of low priority services to avoid excessive operation.
- With the precise operation (throttle/eviction) framework, users can implement QOS features with precise operation and sorting for a user-defined metric simply by implementing that metric's attributes and methods, without dealing with the underlying details.
6 changes: 3 additions & 3 deletions site/content/en/docs/Roadmap/roadmap-2022.md
@@ -15,13 +15,13 @@ Please let us know if you have urgent needs which are not presented in the plan.
- fadvisor to support billing
### 0.2.0 [released]
- Multiple Metric Adaptor support
- Node QoS Ensurance for CPU
- Node QOS Ensurance for CPU
- Operation Metrics about R3 and EPA applied ratio
### 0.3.0 [released]
- UI with cost visibility and usage optimizations.
- Request Recommendation adapts with Virtual Kubelet
- Multiple Triggers for EPA
- Node QoS Ensurance for Mem
- Node QOS Ensurance for Mem
- Prediction with CPU, Memory, and Business Metrics
- Scalability to support 1K TSP and 1K EPA
### 0.4.0 [released]
@@ -31,7 +31,7 @@ Please let us know if you have urgent needs which are not presented in the plan.
- Load-aware Scheduler
### 0.6.0 [released]
- Scalability to support 3k TSP and 3k EPA
- Algorithm and QoS Documentation
- Algorithm and QOS Documentation
- EHPA grafana dashboard
- DSP Algorithm Optimization
- Support remote adapter for external metric
@@ -0,0 +1,7 @@

---
title: "Colocation with Enhanced QOS"
weight: 9
description: >
Introduction to QOS and colocation related capabilities.
---