Update docs for recommenders #631

Merged 3 commits on Nov 30, 2022
4 changes: 2 additions & 2 deletions pkg/utils/expression_prom_default.go
@@ -7,9 +7,9 @@ import (
 // todo: later we change these templates to configurable like prometheus-adapter
 const (
 	// WorkloadCpuUsageExprTemplate is used to query workload cpu usage by promql, param is namespace,workload-name,duration str
-	WorkloadCpuUsageExprTemplate = `sum(irate(container_cpu_usage_seconds_total{namespace="%s",pod=~"^%s-.*$"}[%s]))`
+	WorkloadCpuUsageExprTemplate = `sum(irate(container_cpu_usage_seconds_total{namespace="%s",pod=~"^%s-.*$",container!=""}[%s]))`
 	// WorkloadMemUsageExprTemplate is used to query workload mem usage by promql, param is namespace, workload-name
-	WorkloadMemUsageExprTemplate = `sum(container_memory_working_set_bytes{namespace="%s",pod=~"^%s-.*$"})`
+	WorkloadMemUsageExprTemplate = `sum(container_memory_working_set_bytes{namespace="%s",pod=~"^%s-.*$",container!=""})`
 
 	// following is node exporter metric for node cpu/memory usage
 	// NodeCpuUsageExprTemplate is used to query node cpu usage by promql, param is node name which prometheus scrape, duration str
@@ -0,0 +1,62 @@
---
title: "IdleNode Recommendation"
description: "Introduction to IdleNode Recommendation"
weight: 15
---

By scanning the status and utilization of nodes, the idle node recommendation helps users to find idle Kubernetes nodes.

## Motivation

In a Kubernetes cluster, some nodes often sit idle because of factors such as node taints, label selectors, a low packing rate, or low utilization, which wastes a lot of money. IdleNode recommendation helps users find these nodes and reduce cost.

## Example

```yaml
kind: Recommendation
apiVersion: analysis.crane.io/v1alpha1
metadata:
  name: idlenodes-rule-idlenode-5jxn9
  namespace: crane-system
  labels:
    analysis.crane.io/recommendation-rule-name: idlenodes-rule
    analysis.crane.io/recommendation-rule-recommender: IdleNode
    analysis.crane.io/recommendation-rule-uid: 8921a198-7082-11ed-8b7b-246e960a8d8c
    analysis.crane.io/recommendation-target-kind: Node
    analysis.crane.io/recommendation-target-name: worker-node-1
    analysis.crane.io/recommendation-target-version: v1
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: bareMetal
    beta.kubernetes.io/os: linux
  ownerReferences:
    - apiVersion: analysis.crane.io/v1alpha1
      kind: RecommendationRule
      name: idlenodes-rule
      uid: 8921a198-7082-11ed-8b7b-246e960a8d8c
      controller: false
      blockOwnerDeletion: false
spec:
  targetRef:
    kind: Node
    name: worker-node-1
    apiVersion: v1
  type: IdleNode
  completionStrategy: {}
status:
  targetRef: {}
  action: Delete
  lastUpdateTime: '2022-11-30T07:46:57Z'
```

In this example:

- The Recommendation's TargetRef points to Node: worker-node-1
- The recommendation type is IdleNode
- The action is Delete, but taking a node offline is a complicated operation, so this is only a recommendation.

## Implementation

An idle node recommendation proceeds in the following steps:

1. Scan all nodes and Pods in the cluster
2. If all Pods on a node belong to DaemonSets, the node is considered idle
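
The two steps above can be sketched in Go as follows. This is a minimal illustration, not Crane's actual implementation: the `Pod` type, its `OwnerKind` field, and the `idleNodes` function are hypothetical stand-ins for the real Kubernetes objects, and treating a node with no Pods at all as idle is an assumption of the sketch.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type Pod struct {
	Name      string
	NodeName  string
	OwnerKind string // kind of the controlling owner, e.g. "DaemonSet" or "ReplicaSet"
}

// idleNodes returns the nodes on which every Pod is owned by a DaemonSet.
// A node without any Pods is also reported as idle (assumption of this sketch).
func idleNodes(nodes []string, pods []Pod) []string {
	// Mark every node that runs at least one non-DaemonSet Pod.
	busy := make(map[string]bool)
	for _, p := range pods {
		if p.OwnerKind != "DaemonSet" {
			busy[p.NodeName] = true
		}
	}
	// Everything not marked busy is considered idle.
	var idle []string
	for _, n := range nodes {
		if !busy[n] {
			idle = append(idle, n)
		}
	}
	return idle
}

func main() {
	nodes := []string{"worker-node-1", "worker-node-2"}
	pods := []Pod{
		{Name: "node-exporter-abc", NodeName: "worker-node-1", OwnerKind: "DaemonSet"},
		{Name: "node-exporter-def", NodeName: "worker-node-2", OwnerKind: "DaemonSet"},
		{Name: "coredns-xyz", NodeName: "worker-node-2", OwnerKind: "ReplicaSet"},
	}
	fmt.Println(idleNodes(nodes, pods)) // prints [worker-node-1]
}
```

Only `worker-node-1` is reported, because `worker-node-2` also runs a Pod that is not managed by a DaemonSet.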
@@ -126,6 +126,8 @@ Currently, Crane support these Recommenders:

- [**Resource Recommendation**](/docs/tutorials/recommendation/resource-recommendation): Use the VPA algorithm to analyze the actual usage of applications and recommend more appropriate resource configurations.
- [**Replicas Recommendation**](/docs/tutorials/recommendation/replicas-recommendation): Use the HPA algorithm to analyze the actual usage of applications and recommend more appropriate replicas configurations.
- [**IdleNode Recommendation**](/docs/tutorials/recommendation/idlenode-recommendation): Find idle nodes in the cluster.


### Recommender Framework

@@ -0,0 +1,63 @@
---
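title: "IdleNode Recommendation"
description: "Introduction to IdleNode Recommendation"
weight: 15
---

By scanning the status and utilization of nodes, IdleNode recommendation helps users find idle Kubernetes nodes.

## Motivation

When using Kubernetes, some nodes often end up idle because of factors such as taint configuration, label selectors, a low packing rate, or low utilization, which wastes a lot of money. IdleNode recommendation tries to help users find these nodes to optimize cost.

## Example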

```yaml
kind: Recommendation
apiVersion: analysis.crane.io/v1alpha1
metadata:
  name: idlenodes-rule-idlenode-5jxn9
  namespace: crane-system
  labels:
    analysis.crane.io/recommendation-rule-name: idlenodes-rule
    analysis.crane.io/recommendation-rule-recommender: IdleNode
    analysis.crane.io/recommendation-rule-uid: 8921a198-7082-11ed-8b7b-246e960a8d8c
    analysis.crane.io/recommendation-target-kind: Node
    analysis.crane.io/recommendation-target-name: worker-node-1
    analysis.crane.io/recommendation-target-version: v1
    beta.kubernetes.io/arch: amd64
    beta.kubernetes.io/instance-type: bareMetal
    beta.kubernetes.io/os: linux
  ownerReferences:
    - apiVersion: analysis.crane.io/v1alpha1
      kind: RecommendationRule
      name: idlenodes-rule
      uid: 8921a198-7082-11ed-8b7b-246e960a8d8c
      controller: false
      blockOwnerDeletion: false
spec:
  targetRef:
    kind: Node
    name: worker-node-1
    apiVersion: v1
  type: IdleNode
  completionStrategy: {}
status:
  targetRef: {}
  action: Delete
  lastUpdateTime: '2022-11-30T07:46:57Z'
```

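In this example:

- The Recommendation's TargetRef points to Node: worker-node-1
- The recommendation type is IdleNode
- The action is Delete, but taking a node offline is a complicated operation, so this is only a recommendation

## Implementation

An idle node recommendation proceeds in the following steps:

1. Scan all nodes in the cluster and the Pods on each node
2. If all Pods on a node belong to DaemonSets, the node is judged to be idle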

@@ -126,6 +126,7 @@ patchData=`kubectl get recommend workloads-rule-replicas-rckvb -n default -o jso

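- [**Resource Recommendation**](/zh-cn/docs/tutorials/recommendation/resource-recommendation): Uses the VPA algorithm to analyze the actual usage of applications and recommend more appropriate resource configurations
- [**Replicas Recommendation**](/zh-cn/docs/tutorials/recommendation/replicas-recommendation): Uses the HPA algorithm to analyze the actual usage of applications and recommend more appropriate replica counts
- [**IdleNode Recommendation**](/zh-cn/docs/tutorials/recommendation/idlenode-recommendation): Scans the cluster for idle nodes

### Recommender Framework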

134 changes: 113 additions & 21 deletions site/content/zh/docs/Tutorials/Recommendation/replicas-recommendation.md
@@ -6,44 +6,136 @@ weight: 13

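Kubernetes users often set replica counts from experience when creating application resources. Replicas recommendation analyzes an application's actual usage and recommends a more appropriate replica configuration, which you can review and adopt to improve the cluster's resource utilization.

## Implementation
## Motivation

The replica count of a Kubernetes workload controls the number of Pods and allows fast scaling. However, choosing the replica count has long troubled application administrators: too many replicas waste a lot of resources, while too few may cause stability problems.

The community's HPA provides a load-based dynamic scaling mechanism, and Crane's EHPA builds on HPA to provide prediction-based intelligent elasticity. In the real world, however, only some workloads can scale horizontally at runtime; many workloads need to keep a fixed replica count while running.

The figure below shows an example of low utilization: between the peak of the Pod's historical usage and its Request, 30% of the resources are wasted.

![Resource Waste](/images/resource-waste.jpg)

Replicas recommendation tries to make it easier to configure workload replica counts by analyzing actual historical usage.

## Example

A simple replicas recommendation YAML file looks like this: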

```yaml
kind: Recommendation
apiVersion: analysis.crane.io/v1alpha1
metadata:
  name: workloads-rule-replicas-p84jv
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    analysis.crane.io/recommendation-rule-name: workloads-rule
    analysis.crane.io/recommendation-rule-recommender: Replicas
    analysis.crane.io/recommendation-rule-uid: 18588495-f325-4873-b45a-7acfe9f1ba94
    k8s-app: kube-dns
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: CoreDNS
  ownerReferences:
    - apiVersion: analysis.crane.io/v1alpha1
      kind: RecommendationRule
      name: workloads-rule
      uid: 18588495-f325-4873-b45a-7acfe9f1ba94
      controller: false
      blockOwnerDeletion: false
spec:
  targetRef:
    kind: Deployment
    namespace: kube-system
    name: coredns
    apiVersion: apps/v1
  type: Replicas
  completionStrategy:
    completionStrategyType: Once
  adoptionType: StatusAndAnnotation
status:
  recommendedValue:
    replicasRecommendation:
      replicas: 1
  targetRef: { }
  recommendedInfo: '{"spec":{"replicas":1}}'
  currentInfo: '{"spec":{"replicas":2}}'
  action: Patch
  conditions:
    - type: Ready
      status: 'True'
      lastTransitionTime: '2022-11-28T08:07:36Z'
      reason: RecommendationReady
      message: Recommendation is ready
  lastUpdateTime: '2022-11-29T11:07:45Z'
```

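Based on the workload's historical CPU load, find the lowest hourly CPU usage over the past seven days and compute the replica count that should be configured for 50% (configurable) utilization given the workload's CPU Request
In this example:

### Filter stage
- The Recommendation's TargetRef points to the Deployment coredns in kube-system
- The recommendation type is Replicas
- adoptionType is StatusAndAnnotation, which means the result is shown in recommendation.status and in the Deployment's annotations
- recommendedInfo shows the recommended replica count (recommendedValue is deprecated) and currentInfo shows the current replica count. Both are JSON, and the recommendation can be applied to the TargetRef with kubectl patch

1. Workloads with few replicas: a very low replica count probably has little need for a recommendation. Related configuration: `workload-min-replicas`
2. Workloads with a certain proportion of non-Running Pods: if most of a workload's Pods cannot run normally, it may not be suitable for elastic scaling. Related configuration: `pod-min-ready-seconds` | `pod-available-ratio`
For how to use replicas recommendation, see: [**Recommendation Framework**](/zh-cn/docs/tutorials/recommendation/recommendation-framework)

### Prepare stage
## Implementation

Query CPU usage over the past week
A replicas recommendation proceeds in the following steps:

### Recommend stage
1. Fetch the workload's historical CPU and memory usage over the past week from monitoring data.
2. Use the DSP algorithm to predict CPU usage for the coming week
3. Compute the replica counts for CPU and memory separately and take the larger value

1. Compute the minimum of the workload's hourly usage medians over the past 7 days (to keep extreme lows from skewing the result): workload_cpu_usage_medium_min
2. Replica count for the target utilization:
### Replica calculation algorithm

Taking CPU as an example: suppose the P99 of the workload's historical CPU usage is 10 cores, the Pod CPU Request is 5 cores, and the target peak utilization is 50%. Each Pod should then use at most 2.5 cores at peak, so 10 / 2.5 = 4 replicas meet the 50% peak-utilization target.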

```go
replicas := int32(math.Ceil(workloadCpu / (rr.TargetUtilization * float64(requestTotal) / 1000.)))
replicas := int32(math.Ceil(workloadUsage / (TargetUtilization * float64(requestTotal) / 1000.)))
```
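
As a worked illustration of the formula above, the sketch below computes a replica count for CPU and for memory, takes the larger one, and clamps it to a minimum. It is a hedged sketch, not Crane's code: `replicasFor` and `recommend` are hypothetical names, usage and per-Pod request are passed in the same unit (so the `/ 1000.` milli-core conversion from the snippet above is not needed), the memory figures are made up for the example, and applying `default-min-replicas` as a lower bound is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// replicasFor returns the number of Pods needed so that peak usage stays at or
// below targetUtilization of the total request. usage and perPodRequest must
// use the same unit (for example cores, or GiB).
func replicasFor(usage, perPodRequest, targetUtilization float64) int32 {
	return int32(math.Ceil(usage / (targetUtilization * perPodRequest)))
}

// recommend takes the larger of the CPU and memory replica counts and never
// goes below minReplicas (assumed to correspond to default-min-replicas).
func recommend(cpuReplicas, memReplicas, minReplicas int32) int32 {
	r := cpuReplicas
	if memReplicas > r {
		r = memReplicas
	}
	if r < minReplicas {
		r = minReplicas
	}
	return r
}

func main() {
	cpu := replicasFor(10, 5, 0.5) // 10 cores used, 5 cores requested per Pod
	mem := replicasFor(6, 4, 0.5)  // 6 GiB used, 4 GiB requested per Pod (made-up values)
	fmt.Println(cpu, mem, recommend(cpu, mem, 1)) // prints 4 3 4
}
```

The CPU case reproduces the example in the text: 10 / (0.5 × 5) = 4 replicas.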

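3. To keep replicas from being too small, replicas must be greater than or equal to default-min-replicas
### Excluding abnormal workloads

The following kinds of abnormal workloads are not given recommendations:

### Observe stage
1. Workloads with few replicas: a very low replica count probably has little need for a recommendation. Related configuration: `workload-min-replicas`
2. Workloads with a certain proportion of non-Running Pods: if most of a workload's Pods cannot run normally, it may not be suitable for elastic scaling. Related configuration: `pod-min-ready-seconds` | `pod-available-ratio`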
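
The two exclusion rules above could be expressed roughly as follows. This is only a sketch under stated assumptions: `FilterConfig`, its field names, and `shouldSkip` are hypothetical, and the caller is assumed to have already counted the Pods that have been Ready for at least `pod-min-ready-seconds`.

```go
package main

import "fmt"

// FilterConfig holds the thresholds named above. The type and field names are
// hypothetical; only the option names in the comments come from the docs.
type FilterConfig struct {
	WorkloadMinReplicas int32   // workload-min-replicas
	PodAvailableRatio   float64 // pod-available-ratio
}

// shouldSkip reports whether a workload is excluded from recommendation.
// readyPods is the number of Pods already considered Ready by the caller.
func shouldSkip(cfg FilterConfig, replicas, readyPods int32) bool {
	// Rule 1: too few replicas (also guards the division below).
	if replicas <= 0 || replicas < cfg.WorkloadMinReplicas {
		return true
	}
	// Rule 2: too small a share of Ready Pods.
	ratio := float64(readyPods) / float64(replicas)
	return ratio < cfg.PodAvailableRatio
}

func main() {
	cfg := FilterConfig{WorkloadMinReplicas: 1, PodAvailableRatio: 0.5}
	fmt.Println(shouldSkip(cfg, 2, 2)) // prints false
	fmt.Println(shouldSkip(cfg, 4, 1)) // prints true
	fmt.Println(shouldSkip(cfg, 0, 0)) // prints true
}
```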

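The recommended replicas are recorded in the metric crane_analytics_replicas_recommendation
### Monitoring recommendation results with Prometheus metrics

Replicas recommendation results are recorded in the metric crane_analytics_replicas_recommendation

## How to verify the accuracy of a recommendation

Users can get a workload's resource usage with the following Prometheus queries, then plug the usage into the replica formula above to verify the recommendation for the TargetRef.

Taking the Deployment craned in crane-system as an example, replace container, namespace, and pod with the values for the recommendation you want to verify.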

```shell
sum(irate(container_cpu_usage_seconds_total{namespace="crane-system",pod=~"^craned-.*$",container!=""}[3m])) # cpu usage
```

```shell
sum(container_memory_working_set_bytes{namespace="crane-system",pod=~"^craned-.*$",container!=""}) # memory usage
```
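
The two queries above have the same shape as the `WorkloadCpuUsageExprTemplate` and `WorkloadMemUsageExprTemplate` constants changed in this PR. A minimal sketch of filling those templates for another workload with `fmt.Sprintf` is shown below; the helper functions `cpuUsageQuery` and `memUsageQuery` are hypothetical, and the parameter order (namespace, workload name, duration) follows the comments on the constants.

```go
package main

import "fmt"

// Templates copied from pkg/utils/expression_prom_default.go as changed in this PR.
const (
	WorkloadCpuUsageExprTemplate = `sum(irate(container_cpu_usage_seconds_total{namespace="%s",pod=~"^%s-.*$",container!=""}[%s]))`
	WorkloadMemUsageExprTemplate = `sum(container_memory_working_set_bytes{namespace="%s",pod=~"^%s-.*$",container!=""})`
)

// cpuUsageQuery builds the CPU usage PromQL for one workload (hypothetical helper).
func cpuUsageQuery(namespace, workload, window string) string {
	return fmt.Sprintf(WorkloadCpuUsageExprTemplate, namespace, workload, window)
}

// memUsageQuery builds the memory usage PromQL for one workload (hypothetical helper).
func memUsageQuery(namespace, workload string) string {
	return fmt.Sprintf(WorkloadMemUsageExprTemplate, namespace, workload)
}

func main() {
	fmt.Println(cpuUsageQuery("crane-system", "craned", "3m"))
	fmt.Println(memUsageQuery("crane-system", "craned"))
}
```

Running it prints the same two expressions as above, without the trailing comments.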

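## Supported resource types

StatefulSet and Deployment are supported by default, and any workload that implements the Scale subresource is also supported.

## Configuration parameters

| Option | Default | Description |
| ------------- |-----|-----------------|
| workload-min-replicas| 1 | Workloads with fewer replicas than this value do not receive an elasticity recommendation |
| pod-min-ready-seconds| 30 | Number of seconds used to decide whether a Pod is Ready |
| pod-available-ratio| 0.5 | Workloads whose Ready Pod ratio is below this value do not receive an elasticity recommendation |
| default-min-replicas| 1 | Minimum minReplicas |
| cpu-target-utilization| 0.5 | The minimum replica count is calculated from this value |
| Option | Default | Description |
|------------------------|------|-----------------------------|
| workload-min-replicas | 1 | Workloads with fewer replicas than this value do not receive an elasticity recommendation |
| pod-min-ready-seconds | 30 | Number of seconds used to decide whether a Pod is Ready |
| pod-available-ratio | 0.5 | Workloads whose Ready Pod ratio is below this value do not receive an elasticity recommendation |
| default-min-replicas | 1 | Minimum minReplicas |
| cpu-percentile | 0.95 | Percentile of historical CPU usage |
| mem-percentile | 0.95 | Percentile of historical memory usage |
| cpu-target-utilization | 0.5 | Target peak CPU utilization |
| mem-target-utilization | 0.5 | Target peak memory utilization |

For how to update the recommendation configuration, see: [**Recommendation Framework**](/zh-cn/docs/tutorials/recommendation/recommendation-framework)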