storage: CSIStorageCapacity
This is the initial documentation for one new feature:
- kubernetes/enhancements#1472
pohly committed Jul 13, 2020
1 parent 38a5d01 commit 247d9ac
Showing 3 changed files with 120 additions and 2 deletions.
113 changes: 113 additions & 0 deletions content/en/docs/concepts/storage/storage-capacity.md
@@ -0,0 +1,113 @@
---
reviewers:
- jsafrane
- saad-ali
- msau42
- xing-yang
- pohly
title: Storage Capacity
content_type: concept
weight: 45
---

<!-- overview -->

Storage capacity is limited and may vary depending on the node on
which a pod runs: network-attached storage might not be accessible by
all nodes, or storage is local to a node to begin with.

This page describes how Kubernetes keeps track of storage capacity and
how the scheduler uses that information to schedule pods.

<!-- body -->


## Enabling the feature

Storage capacity tracking is an *alpha feature* and is only available when
the `CSIStorageCapacity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled. A quick way to check
whether a Kubernetes cluster supports the feature is to list
`CSIStorageCapacity` objects with:
```shell
kubectl get csistoragecapacities --all-namespaces
```

If supported, the response is either a list of objects or:
```
No resources found
```

If not supported, this error is printed instead:
```
error: the server doesn't have a resource type "csistoragecapacities"
```
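
The check above can be wrapped in a small helper for scripts. This is
only a sketch: the function name `supports_storage_capacity` is
hypothetical, and it relies on `kubectl` exiting with a non-zero status
when the resource type is unknown. Other failures, such as an
unreachable API server or missing permissions, are also reported as
"not supported" by this sketch.

```shell
# Hypothetical helper: reports whether the API server knows the
# CSIStorageCapacity resource type, based on kubectl's exit status.
supports_storage_capacity() {
  if kubectl get csistoragecapacities --all-namespaces >/dev/null 2>&1; then
    echo "supported"
  else
    # Assumption: any failure is treated as "not supported".
    echo "not supported"
  fi
}
```

Calling `supports_storage_capacity` prints a single word that a
deployment script can branch on.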

In addition to enabling the feature in the cluster, a [CSI
driver](/docs/concepts/storage/volumes/#csi) deployment also has to
support it. Please refer to the driver's documentation for
details. The feature is not supported for non-CSI storage systems.

Without driver support, no storage capacity information is available
for that driver, and the scheduler schedules Pods with volumes
provided by the driver without checking capacity.

## API

There are two API extensions for this feature:
- [`CSIStorageCapacity` objects](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#csistoragecapacity-v1alpha1-storage-k8s-io):
these get produced by a CSI driver in the namespace
where the driver is installed. Each object contains capacity
information for one storage class and defines which nodes have
access to that storage.
- [The `CSIDriverSpec.StorageCapacity` field](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#csidriverspec-v1-storage-k8s-io):
when set to `true`, the Kubernetes scheduler will consider storage
capacity for volumes that use the CSI driver.
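
To make the two API extensions more concrete, the following sketch
writes one example manifest for each. All names
(`hostpath.csi.example.org`, `example-csi-driver`,
`example-fast-local`, the topology label and the capacity value) are
made up for illustration. `CSIStorageCapacity` objects are normally
produced by the CSI driver deployment rather than written by hand, so
the second manifest only shows roughly what such an object might look
like.

```shell
# Hypothetical CSIDriver object that opts in to capacity-aware scheduling.
cat > csidriver-example.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: hostpath.csi.example.org   # assumed driver name
spec:
  storageCapacity: true
EOF

# Illustrative CSIStorageCapacity object: capacity for one storage class,
# reachable from the nodes matched by the (assumed) topology label.
cat > csistoragecapacity-example.yaml <<'EOF'
apiVersion: storage.k8s.io/v1alpha1
kind: CSIStorageCapacity
metadata:
  name: example-capacity
  namespace: example-csi-driver    # namespace where the driver is installed
storageClassName: example-fast-local
nodeTopology:
  matchLabels:
    topology.example.org/node: node-1
capacity: 100Gi
EOF
```

The `CSIDriver` manifest could then be applied with
`kubectl apply -f csidriver-example.yaml`.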

## Scheduling

Storage capacity information is used by the Kubernetes scheduler if:
- the `CSIStorageCapacity` feature gate is enabled,
- a Pod uses a volume that has not been created yet,
- that volume uses a storage class which references a CSI driver and
uses [`WaitForFirstConsumer` volume binding
mode](/docs/concepts/storage/storage-classes/#volume-binding-mode),
and
- the `CSIDriver` object for the driver has `StorageCapacity` set to
true.
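
As an illustration of the storage class condition, the sketch below
writes a manifest for a storage class that references a CSI driver and
uses `WaitForFirstConsumer` volume binding mode. The class name and the
provisioner name are hypothetical.

```shell
# Hypothetical StorageClass that delays volume binding until a Pod
# using the claim is scheduled.
cat > storageclass-example.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-fast-local               # assumed class name
provisioner: hostpath.csi.example.org    # assumed CSI driver name
volumeBindingMode: WaitForFirstConsumer
EOF
```

A PersistentVolumeClaim that names this class stays unbound until a
Pod that uses it is scheduled, which is what gives the scheduler the
chance to take capacity into account.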

In that case, the scheduler only considers nodes for the Pod which
have enough storage available to them. This check is very
simplistic and only compares the size of the volume against the
capacity listed in `CSIStorageCapacity` objects with a topology that
includes the node. Without storage capacity tracking, nodes are picked
without this check.
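
The comparison described above can be sketched in a few lines of shell.
This is not the scheduler's implementation: it assumes the topology
matching has already been done and that the input is a flattened list
of node names with the capacity, in bytes, that is available to each
node. The function name `nodes_with_enough_capacity` is hypothetical.

```shell
# Prints the nodes whose listed capacity is at least the requested size.
# Input on stdin: one "<node-name> <capacity-in-bytes>" pair per line.
# Argument 1: requested volume size in bytes.
nodes_with_enough_capacity() {
  awk -v requested="$1" '$2 + 0 >= requested + 0 { print $1 }'
}

# Example: a 10 GiB volume, node-a has 100 GiB, node-b has 1 GiB.
printf '%s\n' 'node-a 107374182400' 'node-b 1073741824' |
  nodes_with_enough_capacity 10737418240
# prints "node-a"
```

Only `node-a` passes the filter, so only that node would remain a
candidate for the Pod in this simplified model.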

For volumes with `Immediate` volume binding mode, the storage driver
decides where to create the volume, independently of Pods that will
use the volume. The scheduler then schedules Pods onto nodes where the
volume is available after the volume has been created.

For [CSI ephemeral volumes](/docs/concepts/storage/volumes/#csi),
scheduling always happens without considering storage capacity. This
is based on the assumption that this volume type is only used by
special CSI drivers which are local to a node and do not need
significant resources there.

## Rescheduling

When a node has been selected for a Pod with `WaitForFirstConsumer`
volumes, that decision is still tentative. Next, the CSI storage
driver is asked to create the volume, with a hint that the volume is
supposed to be available on the selected node.

Because Kubernetes might have chosen a node based on outdated
capacity information, it is possible that the volume cannot actually be
created. The node selection is then reset and the Kubernetes scheduler
tries again to find a node for the Pod.
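
One way to observe the tentative node selection is to look at the
PersistentVolumeClaim. The sketch below assumes that the selected node
is recorded in the `volume.kubernetes.io/selected-node` annotation of
the claim; that detail is not described on this page and may differ
between Kubernetes versions. The function name
`selected_node_for_claim` is hypothetical.

```shell
# Hypothetical helper: prints the node currently selected for a claim,
# or nothing if no node is selected (for example after a reset).
# Assumption: the selection is stored in the
# volume.kubernetes.io/selected-node annotation.
selected_node_for_claim() {
  kubectl get pvc "$1" \
    -o jsonpath='{.metadata.annotations.volume\.kubernetes\.io/selected-node}'
}
```

For example, `selected_node_for_claim my-claim` prints the node name
for a claim called `my-claim` in the current namespace.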

## {{% heading "whatsnext" %}}

- For more information on the design, see the
[Storage Capacity Constraints for Pod Scheduling KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/1472-storage-capacity-tracking/README.md).
- For more information on further development of this feature, see the [enhancement tracking issue #1472](https://github.com/kubernetes/enhancements/issues/1472).
7 changes: 5 additions & 2 deletions content/en/docs/concepts/storage/volumes.md
@@ -1291,8 +1291,11 @@ Once a CSI compatible volume driver is deployed on a Kubernetes cluster, users
may use the `csi` volume type to attach, mount, etc. the volumes exposed by the
CSI driver.

The `csi` volume type does not support direct reference from Pod and may only be
referenced in a Pod via a `PersistentVolumeClaim` object.
A `csi` volume can be used in a pod in three different ways:
- through a reference to a [`persistentVolumeClaim`](#persistentvolumeclaim)
- with a [generic ephemeral volume](/docs/concepts/storage/ephemeral-volumes/)
- with a [CSI ephemeral volume](#csi-ephemeral-volume) if the driver
supports that

The following fields are available to storage administrators to configure a CSI
persistent volume:
@@ -76,6 +76,7 @@ different Kubernetes components.
| `CSIMigrationGCEComplete` | `false` | Alpha | 1.17 | |
| `CSIMigrationOpenStack` | `false` | Alpha | 1.14 | |
| `CSIMigrationOpenStackComplete` | `false` | Alpha | 1.17 | |
| `CSIStorageCapacity` | `false` | Alpha | 1.19 | |
| `ConfigurableFSGroupPolicy` | `false` | Alpha | 1.18 | |
| `CustomCPUCFSQuotaPeriod` | `false` | Alpha | 1.12 | |
| `CustomResourceDefaulting` | `false` | Alpha| 1.15 | 1.15 |
@@ -388,6 +389,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
- `CSIPersistentVolume`: Enable discovering and mounting volumes provisioned through a
[CSI (Container Storage Interface)](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/container-storage-interface.md)
compatible volume plugin.
Check the [`csi` volume type](/docs/concepts/storage/volumes/#csi) documentation for more details.
- `CSIStorageCapacity`: Enables CSI drivers to publish storage capacity information and the Kubernetes scheduler to use that information when scheduling pods. See [Storage Capacity](/docs/concepts/storage/storage-capacity/).
- `CustomCPUCFSQuotaPeriod`: Enable nodes to change CPUCFSQuotaPeriod.
- `CustomPodDNS`: Enable customizing the DNS settings for a Pod using its `dnsConfig` property.