Skip to content

Commit

Permalink
Merge pull request #60 from damemi/1.21-bump
Browse files Browse the repository at this point in the history
Bug 1949050: Pull upstream changes for 1.21
  • Loading branch information
openshift-merge-robot authored Apr 22, 2021
2 parents 1443aae + b41fdee commit 6ca1099
Show file tree
Hide file tree
Showing 1,670 changed files with 125,538 additions and 13,202 deletions.
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,4 @@ _tmp/
vendordiff.patch
.idea/
*.code-workspace
.vscode/
4 changes: 2 additions & 2 deletions Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
FROM golang:1.15.8
FROM golang:1.16.1

WORKDIR /go/src/sigs.k8s.io/descheduler
COPY . .
Expand All @@ -21,7 +21,7 @@ RUN VERSION=${VERSION} make build.$ARCH

FROM scratch

MAINTAINER Avesh Agarwal <avesh.ncsu@gmail.com>
MAINTAINER Kubernetes SIG Scheduling <kubernetes-sig-scheduling@googlegroups.com>

USER 1000

Expand Down
2 changes: 1 addition & 1 deletion Dockerfile.dev
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@
# limitations under the License.
FROM scratch

MAINTAINER Avesh Agarwal <avesh.ncsu@gmail.com>
MAINTAINER Kubernetes SIG Scheduling <kubernetes-sig-scheduling@googlegroups.com>

USER 1000

Expand Down
5 changes: 4 additions & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ verify-commits:
# make test-unit
# make test-unit WHAT=pkg/build TESTFLAGS=-v

verify: verify-gofmt verify-vendor lint lint-chart verify-spelling
verify: verify-gofmt verify-vendor lint lint-chart verify-spelling verify-toc verify-gen

verify-spelling:
./hack/verify-spelling.sh
Expand All @@ -94,6 +94,9 @@ verify-gofmt:
verify-vendor:
./hack/verify-vendor.sh

verify-toc:
./hack/verify-toc.sh

test-unit:
GOTEST_FLAGS="$(TESTFLAGS)" hack/test-go.sh $(WHAT) $(TESTS)
.PHONY: test-unit
Expand Down
106 changes: 59 additions & 47 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -24,34 +24,35 @@ but relies on the default scheduler for that.

Table of Contents
=================

* [Quick Start](#quick-start)
* [Run As A Job](#run-as-a-job)
* [Run As A CronJob](#run-as-a-cronjob)
* [Install Using Helm](#install-using-helm)
* [Install Using Kustomize](#install-using-kustomize)
* [User Guide](#user-guide)
* [Policy and Strategies](#policy-and-strategies)
* [RemoveDuplicates](#removeduplicates)
* [LowNodeUtilization](#lownodeutilization)
* [RemovePodsViolatingInterPodAntiAffinity](#removepodsviolatinginterpodantiaffinity)
* [RemovePodsViolatingNodeAffinity](#removepodsviolatingnodeaffinity)
* [RemovePodsViolatingNodeTaints](#removepodsviolatingnodetaints)
* [RemovePodsViolatingTopologySpreadConstraint](#removepodsviolatingtopologyspreadconstraint)
* [RemovePodsHavingTooManyRestarts](#removepodshavingtoomanyrestarts)
* [PodLifeTime](#podlifetime)
* [Filter Pods](#filter-pods)
* [Namespace filtering](#namespace-filtering)
* [Priority filtering](#priority-filtering)
* [Label filtering](#label-filtering)
* [Pod Evictions](#pod-evictions)
* [Pod Disruption Budget (PDB)](#pod-disruption-budget-pdb)
* [Metrics](#metrics)
* [Compatibility Matrix](#compatibility-matrix)
* [Getting Involved and Contributing](#getting-involved-and-contributing)
* [Communicating With Contributors](#communicating-with-contributors)
* [Roadmap](#roadmap)
* [Code of conduct](#code-of-conduct)
<!-- toc -->
- [Quick Start](#quick-start)
- [Run As A Job](#run-as-a-job)
- [Run As A CronJob](#run-as-a-cronjob)
- [Install Using Helm](#install-using-helm)
- [Install Using Kustomize](#install-using-kustomize)
- [User Guide](#user-guide)
- [Policy and Strategies](#policy-and-strategies)
- [RemoveDuplicates](#removeduplicates)
- [LowNodeUtilization](#lownodeutilization)
- [RemovePodsViolatingInterPodAntiAffinity](#removepodsviolatinginterpodantiaffinity)
- [RemovePodsViolatingNodeAffinity](#removepodsviolatingnodeaffinity)
- [RemovePodsViolatingNodeTaints](#removepodsviolatingnodetaints)
- [RemovePodsViolatingTopologySpreadConstraint](#removepodsviolatingtopologyspreadconstraint)
- [RemovePodsHavingTooManyRestarts](#removepodshavingtoomanyrestarts)
- [PodLifeTime](#podlifetime)
- [Filter Pods](#filter-pods)
- [Namespace filtering](#namespace-filtering)
- [Priority filtering](#priority-filtering)
- [Label filtering](#label-filtering)
- [Pod Evictions](#pod-evictions)
- [Pod Disruption Budget (PDB)](#pod-disruption-budget-pdb)
- [Metrics](#metrics)
- [Compatibility Matrix](#compatibility-matrix)
- [Getting Involved and Contributing](#getting-involved-and-contributing)
- [Communicating With Contributors](#communicating-with-contributors)
- [Roadmap](#roadmap)
- [Code of conduct](#code-of-conduct)
<!-- /toc -->

## Quick Start

Expand Down Expand Up @@ -90,12 +91,12 @@ See the [resources | Kustomize](https://kubectl.docs.kubernetes.io/references/ku

Run As A Job
```
kustomize build 'github.com/kubernetes-sigs/descheduler/kubernetes/job?ref=master' | kubectl apply -f -
kustomize build 'github.com/kubernetes-sigs/descheduler/kubernetes/job?ref=v0.21.0' | kubectl apply -f -
```

Run As A CronJob
```
kustomize build 'github.com/kubernetes-sigs/descheduler/kubernetes/cronjob?ref=master' | kubectl apply -f -
kustomize build 'github.com/kubernetes-sigs/descheduler/kubernetes/cronjob?ref=v0.21.0' | kubectl apply -f -
```

## User Guide
Expand All @@ -110,9 +111,15 @@ Eight strategies `RemoveDuplicates`, `LowNodeUtilization`, `RemovePodsViolatingI
`RemovePodsHavingTooManyRestarts`, and `PodLifeTime` are currently implemented. As part of the policy, the
parameters associated with the strategies can be configured too. By default, all strategies are enabled.

The following diagram provides a visualization of most of the strategies to help
categorize how strategies fit together.

![Strategies diagram](strategies_diagram.png)

The policy also includes common configuration for all the strategies:
- `nodeSelector` - limiting the nodes which are processed
- `evictLocalStoragePods` - allowing to evict pods with local storage
- `evictLocalStoragePods` - allows eviction of pods with local storage
- `evictSystemCriticalPods` - [Warning: Will evict Kubernetes system pods] allows eviction of pods with any priority, including system pods like kube-dns
- `ignorePvcPods` - set whether PVC pods should be evicted or ignored (defaults to `false`)
- `maxNoOfPodsToEvictPerNode` - maximum number of pods evicted from each node (summed through all strategies)

Expand All @@ -121,6 +128,7 @@ apiVersion: "descheduler/v1alpha1"
kind: "DeschedulerPolicy"
nodeSelector: prod=dev
evictLocalStoragePods: true
evictSystemCriticalPods: true
maxNoOfPodsToEvictPerNode: 40
ignorePvcPods: false
strategies:
Expand All @@ -129,15 +137,17 @@ strategies:
### RemoveDuplicates
This strategy makes sure that there is only one pod associated with a Replica Set (RS),
Replication Controller (RC), Deployment, or Job running on the same node. If there are more,
This strategy makes sure that there is only one pod associated with a ReplicaSet (RS),
ReplicationController (RC), StatefulSet, or Job running on the same node. If there are more,
those duplicate pods are evicted for better spreading of pods in a cluster. This issue could happen
if some nodes went down due to whatever reasons, and pods on them were moved to other nodes leading to
more than one pod associated with a RS or RC, for example, running on the same node. Once the failed nodes
are ready again, this strategy could be enabled to evict those duplicate pods.
It provides one optional parameter, `ExcludeOwnerKinds`, which is a list of OwnerRef `Kind`s. If a pod
has any of these `Kind`s listed as an `OwnerRef`, that pod will not be considered for eviction.
It provides one optional parameter, `excludeOwnerKinds`, which is a list of OwnerRef `Kind`s. If a pod
has any of these `Kind`s listed as an `OwnerRef`, that pod will not be considered for eviction. Note that
pods created by Deployments are considered for eviction by this strategy. The `excludeOwnerKinds` parameter
should include `ReplicaSet` to have pods created by Deployments excluded.

**Parameters:**

Expand Down Expand Up @@ -168,15 +178,15 @@ in the hope that recreation of evicted pods will be scheduled on these underutil
parameters of this strategy are configured under `nodeResourceUtilizationThresholds`.

The under utilization of nodes is determined by a configurable threshold `thresholds`. The threshold
`thresholds` can be configured for cpu, memory, and number of pods in terms of percentage (the percentage is
`thresholds` can be configured for cpu, memory, number of pods, and extended resources in terms of percentage (the percentage is
calculated as the current resources requested on the node vs [total allocatable](https://kubernetes.io/docs/concepts/architecture/nodes/#capacity).
For pods, this means the number of pods on the node as a fraction of the pod capacity set for that node).

If a node's usage is below threshold for all (cpu, memory, and number of pods), the node is considered underutilized.
If a node's usage is below threshold for all (cpu, memory, number of pods and extended resources), the node is considered underutilized.
Currently, pods request resource requirements are considered for computing node resource utilization.

There is another configurable threshold, `targetThresholds`, that is used to compute those potential nodes
from where pods could be evicted. If a node's usage is above targetThreshold for any (cpu, memory, or number of pods),
from where pods could be evicted. If a node's usage is above targetThreshold for any (cpu, memory, number of pods, or extended resources),
the node is considered over utilized. Any node between the thresholds, `thresholds` and `targetThresholds` is
considered appropriately utilized and is not considered for eviction. The threshold, `targetThresholds`,
can be configured for cpu, memory, and number of pods too in terms of percentage.
Expand Down Expand Up @@ -216,14 +226,12 @@ strategies:
```

Policy should pass the following validation checks:
* Only three types of resources are supported: `cpu`, `memory` and `pods`.
* Three basic native types of resources are supported: `cpu`, `memory` and `pods`. If any of these resource types is not specified, all its thresholds default to 100% to avoid nodes going from underutilized to overutilized.
* Extended resources are supported. For example, resource type `nvidia.com/gpu` is specified for GPU node utilization. Extended resources are optional, and will not be used to compute node's usage if it's not specified in `thresholds` and `targetThresholds` explicitly.
* `thresholds` or `targetThresholds` can not be nil and they must configure exactly the same types of resources.
* The valid range of the resource's percentage value is \[0, 100\]
* Percentage value of `thresholds` can not be greater than `targetThresholds` for the same resource.

If any of the resource types is not specified, all its thresholds default to 100% to avoid nodes going
from underutilized to overutilized.

There is another parameter associated with the `LowNodeUtilization` strategy, called `numberOfNodes`.
This parameter can be configured to activate the strategy only when the number of under utilized nodes
are above the configured value. This could be helpful in large clusters where a few nodes could go
Expand Down Expand Up @@ -477,6 +485,9 @@ All strategies are able to configure a priority threshold, only pods under the t
specify this threshold by setting `thresholdPriorityClassName`(setting the threshold to the value of the given
priority class) or `thresholdPriority`(directly setting the threshold) parameters. By default, this threshold
is set to the value of `system-cluster-critical` priority class.

Note: Setting `evictSystemCriticalPods` to true disables priority filtering entirely.

E.g.

Setting `thresholdPriority`
Expand Down Expand Up @@ -510,7 +521,7 @@ does not exist, descheduler won't create it and will throw an error.

### Label filtering

The following strategies can configure a [standard kubernetes labelSelector](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.20/#labelselector-v1-meta)
The following strategies can configure a [standard kubernetes labelSelector](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.21/#labelselector-v1-meta)
to filter pods by their labels:

* `PodLifeTime`
Expand Down Expand Up @@ -544,12 +555,12 @@ strategies:

When the descheduler decides to evict pods from a node, it employs the following general mechanism:

* [Critical pods](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/) (with priorityClassName set to system-cluster-critical or system-node-critical) are never evicted.
* Pods (static or mirrored pods or stand alone pods) not part of an RC, RS, Deployment or Job are
* [Critical pods](https://kubernetes.io/docs/tasks/administer-cluster/guaranteed-scheduling-critical-addon-pods/) (with priorityClassName set to system-cluster-critical or system-node-critical) are never evicted (unless `evictSystemCriticalPods: true` is set).
* Pods (static or mirrored pods or stand alone pods) not part of an ReplicationController, ReplicaSet(Deployment), StatefulSet, or Job are
never evicted because these pods won't be recreated.
* Pods associated with DaemonSets are never evicted.
* Pods with local storage are never evicted (unless `evictLocalStoragePods: true` is set)
* Pods with PVCs are evicted unless `ignorePvcPods: true` is set.
* Pods with local storage are never evicted (unless `evictLocalStoragePods: true` is set).
* Pods with PVCs are evicted (unless `ignorePvcPods: true` is set).
* In `LowNodeUtilization` and `RemovePodsViolatingInterPodAntiAffinity`, pods are evicted by their priority from low to high, and if they have same priority,
best effort pods are evicted before burstable and guaranteed pods.
* All types of pods with the annotation `descheduler.alpha.kubernetes.io/evict` are eligible for eviction. This
Expand Down Expand Up @@ -584,6 +595,7 @@ packages that it is compiled with.

Descheduler | Supported Kubernetes Version
-------------|-----------------------------
v0.21 | v1.21
v0.20 | v1.20
v0.19 | v1.19
v0.18 | v1.18
Expand Down
5 changes: 2 additions & 3 deletions charts/descheduler/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -59,8 +59,7 @@ The following table lists the configurable parameters of the _descheduler_ chart
| `priorityClassName` | The name of the priority class to add to pods | `system-cluster-critical` |
| `rbac.create` | If `true`, create & use RBAC resources | `true` |
| `podSecurityPolicy.create` | If `true`, create PodSecurityPolicy | `true` |
| `resources.cpuRequest` | Descheduler container CPU request | `500m` |
| `resources.memoryRequest` | Descheduler container memory request | `256Mi` |
| `resources` | Descheduler container CPU and memory requests/limits | _see values.yaml_ |
| `serviceAccount.create` | If `true`, create a service account for the cron job | `true` |
| `serviceAccount.name` | The name of the service account to use, if not set and create is true a name is generated using the fullname template | `nil` |
| `nodeSelector` | Node selectors to run the descheduler cronjob on specific nodes | `nil` |
| `nodeSelector` | Node selectors to run the descheduler cronjob on specific nodes | `nil` |
2 changes: 1 addition & 1 deletion docs/contributor-guide.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
## Required Tools

- [Git](https://git-scm.com/downloads)
- [Go 1.15+](https://golang.org/dl/)
- [Go 1.16+](https://golang.org/dl/)
- [Docker](https://docs.docker.com/install/)
- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl)
- [kind v0.10.0+](https://kind.sigs.k8s.io/)
Expand Down
14 changes: 10 additions & 4 deletions docs/user-guide.md
Original file line number Diff line number Diff line change
@@ -1,11 +1,17 @@
# User Guide

Starting with descheduler release v0.10.0 container images are available in the official k8s container registry.
* `k8s.gcr.io/descheduler/descheduler`

Also, starting with descheduler release v0.20.0 multi-arch container images are provided. Currently AMD64 and ARM64
container images are provided. Multi-arch container images cannot be pulled by [kind](https://kind.sigs.k8s.io) from
a registry. Therefore starting with descheduler release v0.20.0 use the below process to download the official descheduler
Descheduler Version | Container Image | Architectures |
------------------- |-----------------------------------------------------|-------------------------|
v0.21.0 | k8s.gcr.io/descheduler/descheduler:v0.21.0 | AMD64<br>ARM64<br>ARMv7 |
v0.20.0 | k8s.gcr.io/descheduler/descheduler:v0.20.0 | AMD64<br>ARM64 |
v0.19.0 | k8s.gcr.io/descheduler/descheduler:v0.19.0 | AMD64 |
v0.18.0 | k8s.gcr.io/descheduler/descheduler:v0.18.0 | AMD64 |
v0.10.0 | k8s.gcr.io/descheduler/descheduler:v0.10.0 | AMD64 |

Note that multi-arch container images cannot be pulled by [kind](https://kind.sigs.k8s.io) from a registry. Therefore
starting with descheduler release v0.20.0 use the below process to download the official descheduler
image into a kind cluster.
```
kind create cluster
Expand Down
20 changes: 10 additions & 10 deletions go.mod
Original file line number Diff line number Diff line change
@@ -1,18 +1,18 @@
module sigs.k8s.io/descheduler

go 1.15
go 1.16

require (
github.com/client9/misspell v0.3.4
github.com/gogo/protobuf v1.3.2 // indirect
github.com/spf13/cobra v0.0.5
github.com/spf13/pflag v1.0.5
k8s.io/api v0.20.0
k8s.io/apimachinery v0.20.0
k8s.io/apiserver v0.20.0
k8s.io/client-go v0.20.0
k8s.io/code-generator v0.20.0
k8s.io/component-base v0.20.0
k8s.io/component-helpers v0.20.0
k8s.io/klog/v2 v2.4.0
k8s.io/api v0.21.0
k8s.io/apimachinery v0.21.0
k8s.io/apiserver v0.21.0
k8s.io/client-go v0.21.0
k8s.io/code-generator v0.21.0
k8s.io/component-base v0.21.0
k8s.io/component-helpers v0.21.0
k8s.io/klog/v2 v2.8.0
sigs.k8s.io/mdtoc v1.0.1
)
Loading

0 comments on commit 6ca1099

Please sign in to comment.