Commit 8f8592b

storage capacity: more review feedback
1 parent b039a48 commit 8f8592b

1 file changed: +16 -12 lines changed

keps/sig-storage/1472-storage-capacity-tracking/README.md

Lines changed: 16 additions & 12 deletions
@@ -844,7 +844,8 @@ enhancement:
 the enablement)?**
 Yes.

-When disabling support in the API server, the new object type
+Registration of the `CSIStorageCapacity` type is controlled by the feature
+gate, so when disabling support in the API server, the new object type
 will disappear together with the new flag in
 `CSIDriver`, which will then cause kube-scheduler to revert to
 provisioning without capacity information. However, the new objects
@@ -902,7 +903,8 @@ events that quote `node(s) did not have enough free storage` as reason
 when the cluster is not really running out of storage capacity.

 Another is a degradation in apiserver metrics (increased CPU or memory
-consumption, increased latency).
+consumption, increased latency), specifically
+[`apiserver_request_duration_seconds`](https://github.com/kubernetes/kubernetes/blob/645c40fcf6f1fca133a00c8186674bcbcecc4b8e/staging/src/k8s.io/apiserver/pkg/endpoints/metrics/metrics.go#L98).

 * **Were upgrade and rollback tested? Was upgrade->downgrade->upgrade path tested?**

@@ -917,10 +919,18 @@ No.

 * **How can an operator determine if the feature is in use by workloads?**

-The feature is used if all of the following points apply:
-- `CSIDriver.spec.storageCapacity` is true for a CSI driver.
-- There are storage classes for that driver.
-- Volumes are using those storage classes.
+The feature itself is not used by workloads. It is used when
+scheduling workloads onto nodes, but not while those run.
+
+That a CSI driver provides storage capacity information can be seen in the
+following metric data that will be provided by external-provisioner instances:
+- total number of `CSIStorageCapacity` objects that the external-provisioner
+  is currently meant to manage for the driver
+- actual number of such objects that currently exist
+- work queue length for creating, updating or deleting objects
+
+The CSI driver name will be used as label. When using distributed
+provisioning, the node name will be used as additional label.

 * **What are the SLIs (Service Level Indicators) an operator can use to
 determine the health of the service?**
@@ -937,12 +947,6 @@ calls will be recorded with their non-OK status code as value.

 * **What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**

-This depends on the workload and CSI driver. A very low rate of
-provisioned volumes per second and few total number of volumes overall
-may be fine for some workloads (long-running applications which use
-specialize storage like PMEM) while others may need a much higher rate
-(adding a local LVM scratch volume to every pod in the system).
-
 The goal is to achieve the same provisioning rates with the feature
 enabled as those that currently can be achieved without it.

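The third hunk describes three pieces of metric data (desired object count, actual object count, work queue length) labelled by CSI driver name and, for distributed provisioning, by node name. The following is a minimal Python sketch of how an operator-side tool might aggregate such values to see which drivers publish capacity information. It uses only the standard library. The metric names and the `CapacityMetrics` class are hypothetical: the commit does not name the actual metrics, and this is not the external-provisioner implementation.

```python
from collections import defaultdict
from typing import Dict, Optional, Set, Tuple

# Hypothetical metric names, chosen for illustration only.
DESIRED = "capacity_objects_desired"  # objects the provisioner is meant to manage
ACTUAL = "capacity_objects_actual"    # objects that currently exist
QUEUE = "capacity_workqueue_length"   # pending create/update/delete work items

# Label set: (driver name, node name). The node is "" for central provisioning.
LabelKey = Tuple[str, str]


class CapacityMetrics:
    """In-memory stand-in for gauge values scraped from external-provisioner."""

    def __init__(self) -> None:
        self._gauges: Dict[str, Dict[LabelKey, int]] = defaultdict(dict)

    def set(self, metric: str, value: int, driver: str,
            node: Optional[str] = None) -> None:
        """Record one gauge sample for a driver (and optionally a node)."""
        self._gauges[metric][(driver, node or "")] = value

    def total(self, metric: str, driver: str) -> int:
        """Sum a gauge over all nodes reporting for one driver."""
        return sum(v for (d, _), v in self._gauges[metric].items() if d == driver)

    def drivers_with_capacity(self) -> Set[str]:
        """Drivers for which at least one capacity object currently exists."""
        return {d for (d, _), v in self._gauges[ACTUAL].items() if v > 0}

    def missing(self, driver: str) -> int:
        """Objects that should exist for the driver but do not yet."""
        return max(0, self.total(DESIRED, driver) - self.total(ACTUAL, driver))


if __name__ == "__main__":
    metrics = CapacityMetrics()
    # Central provisioning: only the driver label is set.
    metrics.set(DESIRED, 3, "hostpath.csi.k8s.io")
    metrics.set(ACTUAL, 2, "hostpath.csi.k8s.io")
    # Distributed provisioning: the node name is an additional label.
    metrics.set(DESIRED, 1, "lvm.example.com", node="node-1")
    metrics.set(ACTUAL, 1, "lvm.example.com", node="node-1")
    print(sorted(metrics.drivers_with_capacity()))
    print(metrics.missing("hostpath.csi.k8s.io"))
```

A non-zero `missing` value that persists, together with a growing queue length, would suggest the provisioner is falling behind for that driver.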