Conversation

@pohly
Contributor

@pohly pohly commented May 29, 2020

This marks items as completed which are (or soon will be) done.
PRR review will happen as part of merging this PR.

/assign @wojtek-t

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 29, 2020
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/architecture Categorizes an issue or PR as relevant to SIG Architecture. sig/storage Categorizes an issue or PR as relevant to SIG Storage. labels May 29, 2020
@pohly pohly force-pushed the storage-capacity-constraints-for-pod-scheduling branch 2 times, most recently from 1073bcb to f111fcf on May 29, 2020 14:42
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [X] (R) Graduation criteria is in place
- [ ] (R) Production readiness review completed
- [X] (R) Production readiness review completed
Member

Sorry - this is super inconvenient for both you and me, but given I can't comment on lines you didn't change I can only do that here:

<unrelated to PRR itself, though I wanted to understand some stuff>

L583: you're saying that for DaemonSet deployment the owner will be a pod. That seems really suspicious. What if I need to update the spec of DS? The pods will get recreated, the objects will get deleted and the new ones will get created? That doesn't sound like the best option to me...
I think it should be the node (and potentially some controller that will be removing that OwnerRef if the node is no longer targeted by the driver...)

L758: if I disable the feature and never reenable again, then those objects will keep staying in etcd. It's probably fine, because we expect to graduate it anyway, but probably worth pointing that out explicitly.

L781: Can you add explicit "Will be added before the transition to beta." (as you added later).

L824: While scalability is not required for Alpha, I really recommend that you think about this question now: "*Will enabling / using this feature result in any new API calls?"
If we implement it badly, it may not scale in large clusters (overload apiserver/etcd), which would block the transition to GA. It's fine to take shortcuts or use inefficient implementations for Alpha, but if you do that unconsciously it may require significant unneeded changes earlier.
If you say "no", I can't block it, but I highly encourage you to spend a bit of time thinking about it and fill this in.

Contributor Author

What if I need to update the spec of DS? The pods will get recreated, the objects will get deleted and the new ones will get created?

Correct.

That doesn't sound like the best option to me... I think it should be the node (and potentially some controller that will be removing that OwnerRef if the node is no longer targeted by the driver...)

Such a controller would have to be part of the driver deployment because Kubernetes itself has no knowledge about how the driver is generating these objects. This then brings us back to the "what if the driver has no chance to clean up" part: simply removing the driver and its operator may leave stale objects behind. I'm not sure how common this mode of operation will be (it was primarily meant for local ephemeral volumes, but now those are going to be handled via "generic ephemeral inline volumes"), so I'd rather keep it very simple and refine later if needed.

L758: if I disable the feature and never reenable again, then those objects will keep staying in etcd. It's probably fine, because we expect to graduate it anyway, but probably worth pointing that out explicitly.

So they won't be garbage-collected because the type is disabled? Hmm, I already had such a warning in the paragraph: "However, the new objects will still remain in the etcd database." I've added a "..., they just won't be visible until the feature is enabled again."

L781: Can you add explicit "Will be added before the transition to beta." (as you added later).

Done.

L824: While scalability is not required for Alpha, I really recommend that you think about this question now: "*Will enabling / using this feature result in any new API calls?"

This has been discussed extensively. I could try to summarize all of those discussions now, but the result would probably be incomplete and/or out-dated because we intend to do performance evaluations of the alpha implementation and then change the implementation before beta. So I'd rather skip this part for alpha.

Member

What if I need to update the spec of DS? The pods will get recreated, the objects will get deleted and the new ones will get created?

Correct.

I don't think this is a behavior we really want. I don't want to block the whole effort on that, but I would like the item of figuring out the behavior of disappearing capacity objects on DS updates to really be solved for Beta.

So please add that to the Beta graduation criteria.

L610/L596: why do those objects need to be periodically updated? It seems that we need to fetch data from drivers periodically, but if nothing has changed, the updates shouldn't even be sent (we should be watching those objects on the provisioner side and, if the value is the same in the cache, just skip updates)

This has been discussed extensively. I could try to summarize all of those discussions now, but the result would probably be incomplete and/or out-dated because we intend to do performance evaluations of the alpha implementation and then change the implementation before beta. So I'd rather skip this part for alpha.

I think that scalability of this feature is potentially questionable, so I would really like to avoid having something that won't work by design. I'm fine with not having all the answers there, but at least the first question in this section (any new API calls) should be answered at this point.

Contributor Author

I don't think this is a behavior we really want. I don't want to block the whole effort on that, but I would like the item of figuring out the behavior of disappearing capacity objects on DS updates to really be solved for Beta.

So please add that to the Beta graduation criteria.

Okay, but let's make it an item about deciding what the desired behavior is and then implementing that.

why do those objects need to be periodically updated? It seems that we need to fetch data from drivers periodically, but if nothing has changed, the updates shouldn't even be sent (we should be watching those objects on the provisioner side and, if the value is the same in the cache, just skip updates)

Correct, and that's how it is implemented - there is regular polling, but objects only get updated if changed. I'll update the text to make that clear.
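For illustration, the "poll, but only write on change" logic boils down to something like the following Go sketch (not the actual external-provisioner code; the helper name and the use of the v1alpha1 client are assumptions for this example):

    package capacity

    import (
        "context"

        storagev1alpha1 "k8s.io/api/storage/v1alpha1"
        "k8s.io/apimachinery/pkg/api/resource"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // syncOne compares the capacity reported by the driver for one segment/storage
    // class pair against the stored object and only calls Update when the value
    // actually changed, so periodic polling does not turn into periodic writes.
    func syncOne(ctx context.Context, client kubernetes.Interface, existing *storagev1alpha1.CSIStorageCapacity, reported resource.Quantity) error {
        if existing.Capacity != nil && existing.Capacity.Cmp(reported) == 0 {
            return nil // polled value unchanged, skip the API call
        }
        updated := existing.DeepCopy()
        updated.Capacity = &reported
        _, err := client.StorageV1alpha1().CSIStorageCapacities(updated.Namespace).Update(ctx, updated, metav1.UpdateOptions{})
        return err
    }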

I think that scalability of this feature is potentially questionable

The problem is that scalability may be fine for some use-cases and not for others. I think adding such a feature is worthwhile, it just has to be well understood and documented what the feature is good for.

I'm fine with not having all the answers there, but at least the first question in this section (any new API calls) should be answered at this point.

Okay, I'll try to add something.

Contributor Author

I thought a bit more about this and came to the conclusion that the owner reference handling needs to become optional. We can kill several birds with one stone here: support drivers that are deployed outside of the cluster, reduce watch traffic by filtering which CSIStorageCapacity objects need to be sent to each external-provisioner instance, and avoid recreating all objects during a DaemonSet update. For that we need to introduce labels.

@msau42: can you look at the "storage capacity: optional owner reference + watch filter" commit (currently 33e9f28)? Does this make sense?

Member

Yes, that sounds good to me!

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 2, 2020
@pohly
Contributor Author

pohly commented Jun 2, 2020

@wojtek-t: PR updated, please take another look.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 31, 2020
@pohly
Contributor Author

pohly commented Aug 31, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 31, 2020
@deads2k
Contributor

deads2k commented Oct 5, 2020

I came across this trying to dig through old PRR reviews. Can you confirm this is alpha in 1.19 and planned for beta in 1.21?

Also, can you add a trailing space or something to each PRR line so we can see them all? I expanded sections and did not see details around metrics, which I would really like to have.

@pohly
Contributor Author

pohly commented Oct 5, 2020

I came across this trying to dig through old PRR reviews. Can you confirm this is alpha in 1.19 and planned for beta in 1.21?

Correct. I know that we are late with this PR, and I already pinged @wojtek-t on Slack to get it moving again.

Also, can you add a trailing space or something to each PRR line so we can see them all? I expanded sections and did not see details around metrics, which I would really like to have.

So far I only filled in the feature gate section in: https://github.com/kubernetes/enhancements/blob/3148611e140dd3af05de796dabc672e356e02ca1/keps/sig-storage/1472-storage-capacity-tracking/README.md#production-readiness-review-questionnaire

With metrics, you were probably thinking of the "Monitoring requirements"? That part still needs to be added... for beta.
My goal right now with this PR is just to get the requirements for alpha fulfilled; alpha didn't need those parts yet.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 3, 2021
@pohly
Contributor Author

pohly commented Jan 3, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 3, 2021
- [X] (R) Test plan is in place, giving consideration to SIG Architecture and SIG Testing input
- [X] (R) Graduation criteria is in place
- [ ] (R) Production readiness review completed
- [X] (R) Production readiness review completed
Member

What if I need to update the spec of DS? The pods will get recreated, the objects will get deleted and the new ones will get created?

Correct.

I don't think this is a behavior we really want. I don't want to block the whole effort on that, but I would like the item of figuring out the behavior of disappearing capacity objects on DS updates to really be solved for Beta.

So please add that to the Beta graduation criteria.

L610/L596: why do those objects need to be periodically updated? It seems that we need to fetch data from drivers periodically, but if nothing has changed, the updates shouldn't even be sent (we should be watching those objects on the provisioner side and, if the value is the same in the cache, just skip updates)

This has been discussed extensively. I could try to summarize all of those discussions now, but the result would probably be incomplete and/or out-dated because we intend to do performance evaluations of the alpha implementation and then change the implementation before beta. So I'd rather skip this part for alpha.

I think that scalability of this feature is potentially questionable, so I would really like to avoid having something that won't work by design. I'm fine with not having all the answers there, but at least the first question in this section (any new API calls) should be answered at this point.


When disabling support in the API server, the new object type and
thus all objects will disappear together with the new flag in
will disappear together with the new flag in
Member

What exact behavior is the feature gate controlling on the kube-apiserver side?

It seems to me that the kube-apiserver is mostly just passing the state through and nothing really happens there...

Another question is - how we can control the behavior on the external-provisioner side.

Contributor Author

What exact behavior is the feature gate controlling on the kube-apiserver side?

There's no specific logic in kube-apiserver besides offering the API. It has to be a builtin type because there was (and IMHO still is) no way to ensure that a Kubernetes cluster has a CRD installed. Another unsolved (?) issue is generating Go bindings for a CRD that then can be used by kube-scheduler.

Another question is - how we can control the behavior on the external-provisioner side.

There are command line flags for it: https://github.com/kubernetes-csi/external-provisioner/blob/master/README.md#capacity-support

Cluster admins can influence that when deploying a CSI driver.

Member

OK - I think we need to distinguish two different things in this question:

  • when the feature is disabled itself (but no apiserver rollback - so API is still there)
  • when there is version rollback (so API is also rolled back)

Can you make those a bit more explicit?
[It's a different thing, but might be worth looking into this one: https://github.com//pull/2199/files#diff-b18c53698772d9f0dcc15e9f35d90ecd0f181923f33e40115036991aa0b5f1c5R236 ]

Contributor Author

Depending on how thoroughly the feature gets disabled, the API is not there anymore. It will be gone when any of the following happens for the apiserver:

  • the v1alpha1 API group gets disabled (not relevant anymore once it is beta)
  • the feature is turned off for the apiserver
  • the apiserver is rolled back to a version without the API

In this section, I already tried to discuss disabling the feature in the three places where it could be disabled separately (apiserver, scheduler, CSI driver). I'm not sure what to add?

`CSIDriver`, which will then cause kube-scheduler to revert to
provisioning without capacity information. However, the new objects
will still remain in the etcd database.
will still remain in the etcd database, they just won't be visible.
Member

Hmm - so what exactly is happening. Will the CRD itself be deleted?
If so, the underlying CRs will also get deleted, no?

Member

Actually - is it CRD or is it a built-in API? I think it's not explicitly mentioned anywhere...

Contributor Author

Built-in, see #1829 (comment).

Contributor Author

I've clarified that at the top of the API section.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 26, 2021
@pohly pohly changed the title storage capacity: finish 1.19 review storage capacity: PRR and beta review Jan 27, 2021
@pohly pohly force-pushed the storage-capacity-constraints-for-pod-scheduling branch from c0b1f88 to d2daf15 on January 28, 2021 10:49
CRD because kube-scheduler will depend on this API and ensuring that a
CRD gets installed and updated together with the core Kubernetes is
still an unsolved problem.

Member

nit: fix indentation for types defined below

Contributor Author

Fixed.

While external-provisioner runs, it needs to update and potentially
The owner should be the higher-level app object which defines the provisioner pods:
- the Deployment or StatefulSet for central provisioning
- the DaemonSet for distributed provisioning
Member

I'm glad to see this was changed (comparing to what it initially was).

there are multiple instances with leadership election or the driver
gets upgraded.

This ownership reference is optional. It gets enabled with
Member

What is the reason for making it optional?

Contributor Author

It can't be mandatory because drivers that aren't deployed inside Kubernetes have nothing that can serve as owner.

We could remove it completely, but I find it useful because admins don't need to worry about manually removing objects after removing the "normal" driver objects that they know about (StatefulSet, DaemonSet, etc.).

That leaves keeping it as an optional feature.

Member

Would be useful to provide this context in the KEP itself.

Contributor Author

Added.

`CSIStorageCapacity` objects after uninstalling a CSI driver. To
simplify that and for efficient access to objects, external-provisioner
sets some labels:
- `app.kubernetes.io/instance`: the CSI driver name
Member

This is the decision of API approvers, but I'm not a fan of this name.
Can we change that so that it will be clear that it's about csi-driver name?

Contributor Author

I picked app.kubernetes.io because it seemed useful to follow conventions; in this case, the CSI driver is the "app". But I don't have a strong opinion about it. Some other, external-provisioner specific label also works. How about external-provisioner.k8s.io/drivername?

Member

It will be API approver decision, but I like the external-provisioner.k8s.io/drivername much more :)

Contributor Author

Let me add that these labels aren't part of the CSIStorageCapacity API, nor was the owner reference. The scheduler doesn't know about them and doesn't use them.

Instead, this is an implementation detail of the external-provisioner. It needs to be documented as part of the external-provisioner because admins may need that information. The name of the labels could even be configurable, although I don't see a need for that.

Member

Agree I think we should scope down the label key to be more specific and matching the format of other labels we've added for csi, like "csi.storage.k8s.io"

Contributor Author

I've changed it to

csi.storage.k8s.io/drivername=<driver name>
csi.storage.k8s.io/managed-by=external-provisioner|external-provisioner-<host name>
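For illustration, this is roughly how such an object could be stamped with those labels and the optional owner reference (a Go sketch against the v1alpha1 API as shipped in 1.19; the function, the generateName prefix and the ownerRef handling are assumptions for this example, not the actual external-provisioner code):

    package capacity

    import (
        "context"

        storagev1alpha1 "k8s.io/api/storage/v1alpha1"
        "k8s.io/apimachinery/pkg/api/resource"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
    )

    // createCapacity writes one CSIStorageCapacity object for a (segment, storage class)
    // pair. ownerRef may be nil, e.g. for drivers deployed outside of the cluster.
    func createCapacity(ctx context.Context, client kubernetes.Interface, namespace, driverName, instance, scName string,
        topology *metav1.LabelSelector, capacity resource.Quantity, ownerRef *metav1.OwnerReference) error {
        obj := &storagev1alpha1.CSIStorageCapacity{
            ObjectMeta: metav1.ObjectMeta{
                GenerateName: "csisc-", // illustrative prefix
                Namespace:    namespace,
                Labels: map[string]string{
                    "csi.storage.k8s.io/drivername": driverName,
                    "csi.storage.k8s.io/managed-by": instance, // "external-provisioner" or "external-provisioner-<host name>"
                },
            },
            StorageClassName: scName,
            NodeTopology:     topology,
            Capacity:         &capacity,
        }
        if ownerRef != nil {
            obj.OwnerReferences = []metav1.OwnerReference{*ownerRef}
        }
        _, err := client.StorageV1alpha1().CSIStorageCapacities(namespace).Create(ctx, obj, metav1.CreateOptions{})
        return err
    }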

simplify that and for efficient access to objects, external-provisioner
sets some labels:
- `app.kubernetes.io/instance`: the CSI driver name
- `app.kubernetes.io/managed-by`: `external-provisioner` for central provisioning,
Member

Same here - why "app"?

Contributor Author

@pohly pohly Jan 28, 2021

This then becomes external-provisioner.k8s.io/managed-by=central or external-provisioner.k8s.io/managed-by=<hostname>.

Yes.

Enabling it in apiserver and CSI drivers will cause
`CSIStorageCapacity` objects to be created respectively updated. The
Member

nit: "created and respective updated" ?

Contributor Author

"created or updated"

centralized provisioning, the number of segments is probably low. For
distributed provisioning, each node where the driver runs
represents one segment. The rate at which objects change depends on how often
storage usage changes.
Member

It would be good to be more specific, e.g.:

  • distributed provisioning:
    #CSIStorageCapacity per each (node, storage-class) pair [is that right?]

For rate of changes, we can probably estimate it:

  • a creation whenever a node is created
  • an update whenever: some volume is created/resized/deleted (i.e. bounded by volume api calls) or storage underneath is changed (which probably requires some operator action)
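To make that estimate concrete with purely illustrative numbers (not taken from the KEP): a driver using distributed provisioning on a 1,000-node cluster with 2 storage classes would maintain roughly 1,000 × 2 = 2,000 CSIStorageCapacity objects, with one create per new (node, storage class) pair and updates only when the capacity reported for a segment actually changes.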

Contributor Author

Added.

provider?**

A CSI driver might have to query the storage backend more often to be
kept informed about available storage capacity.
Member

What are the plans? Can we ensure rate-limiting?

In distributed-mode, can we ensure that we won't overload the provider (we had problems like that in the past)?

Contributor Author

Distributed provisioning is meant for CSI drivers which manage local storage and don't have a backend cloud provider. If they had one, there would be no reason to use distributed provisioning.

Contributor Author

For central provisioning with a cloud provider we have rate limiting because objects that need to be updated or created go through a rate limiting workqueue.
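As a rough sketch of that pattern (using client-go's workqueue package; the work-item key format and the syncCapacity stub are made up for this example, this is not the actual external-provisioner code):

    package main

    import (
        "k8s.io/client-go/util/workqueue"
    )

    // syncCapacity would create or update the CSIStorageCapacity object for one
    // storage class/segment pair; stubbed out here.
    func syncCapacity(key string) error { return nil }

    func main() {
        // Work items go through a rate-limited queue, so a burst of capacity
        // changes does not turn into a burst of API calls against the apiserver.
        queue := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())
        defer queue.ShutDown()

        queue.Add("fast-storage/node-1") // illustrative work item

        for {
            key, shutdown := queue.Get()
            if shutdown {
                return
            }
            if err := syncCapacity(key.(string)); err != nil {
                queue.AddRateLimited(key) // retry with exponential backoff
            } else {
                queue.Forget(key)
            }
            queue.Done(key)
        }
    }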

Member

Can you add this here (i.e. distributed is not meant to be used by cloud-provider stuff, and for central we have rate-limiting).

Contributor Author

Added

* **Will enabling / using this feature result in increasing size or count
of the existing API objects?**

No.
Member

CSIDriver, by adding a new field (it's fine - just make it explicit)

Contributor Author

Oops, forgot about that one because it didn't matter when thinking about scalability. Fixed.


No. Scheduling of pods without volumes is not expected to be
affected. Scheduling of pods with volumes is not covered by existing
SLIs/SLOs.
Member

Contributor Author

Thanks for the link, I rewrote the section.

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Jan 29, 2021
existing CSI sidecar metrics support with the
`csi_sidecar_operations_seconds_count` metric with labels
`driver_name=<csi driver name>`, `grpc_status_code=OK`,
`method_name=CreateVolume`.
Member

There should be metrics for 'GetCapacity' rpc calls too.

Contributor Author

True, but those might also come from the code which handles immediate binding when doing distributed provisioning, so merely having a non-zero count of GetCapacity calls does not imply that this feature here is in use.

Member

To determine if the feature is working, it may be useful to see if GetCapacity calls are returning errors?

I agree it's not particularly useful on its own, but in combination with other metrics, we can use it to get a more complete picture of the various failure points.

Member

I agree with Michelle. I think we should focus on this particular feature, not the whole "end-to-end volume provisioning".

I like the idea of GetCapacity RPC metrics. Plus the CSIStorageCapacity object existence.

Contributor Author

We also have metrics for GetCapacity calls and can distinguish between successful and failing ones. I've focused on that instead of volume creation.

* the same for adding or removing storage classes (both modes)
* updates when volumes are created/resized/deleted (thus bounded by
some other API calls)
* updates when capacity in the underlying storage system is changed
Member

or when a PV gets provisioned

Contributor Author

That's covered by "when volumes are created".

Member

I don't see "when volumes are created" in this list. Can we be more explicit here?

Member

It's mentioned above:
"when volumes are created/resized/deleted"

:)


Member

sorry I need new glasses :)

* **Will enabling / using this feature result in increasing size or count
of the existing API objects?**

One new field gets added to `CSIDriver`.
Member

specify that it's a bool (so we get a sense of the size)

Contributor Author

Added.

* **What are other known failure modes?**

* **What steps should be taken if SLOs are not being met to determine the problem?**
The API server might get overloaded.
Member

by CSIStorageCapacity updates?

Contributor Author

That's clearer, added.

- "@saad-ali"
- "@xing-yang"
approvers:
- "@saad-ali"
Member

Can you also add my name here

Contributor Author

Done. I've added you as reviewer and approver.

Contributor Author

@pohly pohly left a comment

Pushed another revision.

I'm not squashing commits yet, but probably should before merging.


@pohly pohly force-pushed the storage-capacity-constraints-for-pod-scheduling branch from 4787e42 to 0f8594e on January 29, 2021 19:44
Member

@wojtek-t wojtek-t left a comment

I would like to resolve some of the already addressed comments, but it seems I lost permissions to do that....

In external-provisioner 2.0.0, this was achieved by checking the owner
reference. Now that the owner reference is optional, it will get
achieved by ignoring all objects that don't have the
`csi.storage.k8s.io/managed-by` labels described above.
Member

Can you please also explicitly write down what happens if such an object without the managed-by label exists?

I'm assuming:

  • external-provisioner does not create its own
  • it also doesn't change the manually managed ones

But I would like to see that explicitly mentioned.

Contributor Author

Added.

reference. Now that the owner reference is optional, it will get
achieved by ignoring all objects that don't have the
`app.kubernetes.io/managed-by` labels described above.

Member

I don't know what changed - but I'm no longer able to resolve comments, even my own...

* **Can the feature be disabled once it has been enabled (i.e. can we rollback
the enablement)?**
Yes.

Member

Comment for Line 821 "Does enabling the feature change any default behavior".

Actually - I think the answer should be "it doesn't".
Because technically, without any changes from the user (just cluster operator), pods will start to be scheduled on other nodes.
[This is supposed to be change for the better to reduce problematic cases, but there is a change.]

Contributor Author

"Does enabling the feature change any default behavior" currently says "No changes for existing storage driver deployments." which is the same as "it doesn't".

Or did you mean to say that the answer should be "It does change default behavior"?

This depends on the interpretation of "the feature". My "no changes" was based on enabling "the feature" in Kubernetes, without enabling it in a CSI driver. Only once it is also enabled in a CSI driver, then there will be a change.

I'll put both in so that it is clearer.

the enablement)?**
Yes.

When disabling support in the API server, the new object type and
Member

nit: remove "and "

Contributor Author

fixed.


If the second phase fails because a driver malfunctions or overloads
the apiserver, then it can be rolled back and scheduling again happens
without using storage capacity information.
Member

For this particular example, enabling it just in kube-apiserver should not fail in any way. It's just a new type that isn't getting used. Sorry for my lack of imagination

Imagine that e.g. conversion functions are broken for the type and they are crashing kube-apiserver. Then if there was a preexisting object, kube-apiserver won't even start...

I would probably mention two things in this case:

  • unexpected kube-apiserver restarts
  • increased number of 5xx errors from CSIStorageCapacity-related calls

existing CSI sidecar metrics support with the
`csi_sidecar_operations_seconds_count` metric with labels
`driver_name=<csi driver name>`, `grpc_status_code=OK`,
`method_name=CreateVolume`.
Member

I agree with Michelle. I think we should focus on this particular feature, not the whole "end-to-end volume provisioning".

I like the idea of GetCapacity RPC metrics. Plus the CSIStorageCapacity object existence.

provisioned volumes per second and few total number of volumes overall
may be fine for some workloads (long-running applications which use
specialize storage like PMEM) while others may need a much higher rate
(adding a local LVM scratch volume to every pod in the system).
Member

That sounds fine to me. But then let's remove this paragraph :)

* the same for adding or removing storage classes (both modes)
* updates when volumes are created/resized/deleted (thus bounded by
some other API calls)
* updates when capacity in the underlying storage system is changed
Member

It's mentioned above:
"when volumes are created/resized/deleted"

:)


provider?**

A CSI driver might have to query the storage backend more often to be
kept informed about available storage capacity.
Member

Can you add this here (i.e. distributed is not meant to be used by cloud-provider stuff, and for central we have rate-limiting).

Contributor Author

@pohly pohly left a comment

New commit pushed.

when the cluster is not really running out of storage capacity.

Another is a degradation in apiserver metrics (increased CPU or memory
consumption, increased latency).
Contributor Author

Do we have documentation for those that I can look up? I don't have metrics gathering enabled for my local test cluster, so I cannot look this up easily.


If the second phase fails because a driver malfunctions or overloads
the apiserver, then it can be rolled back and scheduling again happens
without using storage capacity information.
Contributor Author

Added

The feature is used if all of the following points apply:
- `CSIDriver.spec.storageCapacity` is true for a CSI driver.
- There are storage classes for that driver.
- Volumes are using those storage classes.
Contributor Author

That would have to go into kube-scheduler. What should that metric count? Number of times that a storage capacity was the deciding factor between using or not using a node?

There are no metrics at all that expose why nodes are chosen as far as I know - or at least there are none in the part that is involved with volumes.

IMHO this needs more thought; a new "enhance metrics for volume scheduling" KEP might be a better place for it.

provider?**

A CSI driver might have to query the storage backend more often to be
kept informed about available storage capacity.
Contributor Author

Added

Member

@wojtek-t wojtek-t left a comment

Few more comments.


When disabling support in the API server, the new object type and
thus all objects will disappear together with the new flag in
When disabling support in the API server, the new object type
Member

We need to distinguish two things:

  • version rollback - in which case the type will no longer be served (but please be explicit it's about version rollback)
  • feature disablement (via feature gate) - in this case the type will still be visible

Contributor Author

Perhaps it got lost in the GitHub noise, but I replied to that already here: https://github.com/kubernetes/enhancements/pull/1829/files/b039a48bffcab54f98975f4dfdb2df7e0f9944ad#r566714065

I said:

Depending on how thoroughly the feature gets disabled, the API is not there anymore. It will be gone when any of the following happens for the apiserver:

  • the v1alpha1 API group gets disabled (not relevant anymore once it is beta)
  • the feature is turned off for the apiserver
  • the apiserver is rolled back to a version without the API

In this section, I already tried to discuss disabling the feature in the three places where it could be disabled separately (apiserver, scheduler, CSI driver). I'm not sure what to add?

Just to be very clear, in the second case that you mentioned ("feature disablement (via feature gate)") the type will also be gone and not visible. So there is really no difference between that case and version rollback.

Member

Just to be very clear, in the second case that you mentioned ("feature disablement (via feature gate)") the type will also be gone and not visible.

It shouldn't - we generally don't hide type registration behind feature gate.

Contributor Author

Somehow this is common for storage types. For example: https://github.com/kubernetes/kubernetes/blob/73dd5c840662bb066a146d0871216333181f4b64/pkg/registry/storage/rest/storage_storage.go#L98-L105

@wojtek-t and I were discussing this on chat and weren't sure whether this is a good or bad thing. He wanted to clarify with Jordan.

In the meantime I have changed the first sentence to make it clear that registration is currently depending on the feature gate.

Member

I just briefly chatted with Jordan about this offline.

Given that APIs are controlled via --runtime-config flag anyway, then having that controlled via feature gate doesn't bring any value. So the consensus we got to is that you basically should remove that feature gate from there.

[And then it will evolve to situation I was suggesting above.]

Please also add to the KEP somewhere that feature-gate will be removed from controlling API registration.

Contributor Author

Given that APIs are controlled via --runtime-config flag anyway, then having that controlled via feature gate doesn't bring any value. So the consensus we got to is that you basically should remove that feature gate from there.

That argument currently applies because it is in the v1alpha1 group. But once we move it to v1beta1 in 1.20, it will become impossible to remove just that one type.

I can see pros and cons for both approaches: when the apiserver can disable the type, the cluster can be recovered from a malfunctioning CSI driver quickly by disabling the type and later fixing the CSI driver. The downside of disabling the type is that it is impossible to delete stale objects.

Overall I prefer to not have the feature gate check.

Please also add to the KEP somewhere that feature-gate will be removed from controlling API registration.

Done.

when the cluster is not really running out of storage capacity.

Another is a degradation in apiserver metrics (increased CPU or memory
consumption, increased latency).
Member

The feature is used if all of the following points apply:
- `CSIDriver.spec.storageCapacity` is true for a CSI driver.
- There are storage classes for that driver.
- Volumes are using those storage classes.
Member

OK - let me take a step back here.

Basically - this isn't a feature of a "running workload" per se. It only affects which node (if any) it will get scheduled on.

So let's maybe add something like that at the beginning.

As for the second point, I think the scheduler metric is exactly what we want.
I don't care that much if it was a deciding factor or not. But I would like to see if we were attempting to use that information.

As a first step in this KEP, I was thinking about a metric like: "I (scheduler) wanted to check CSIStorageCapacity object and it didn't exist for that Node". But now I realize that it doesn't really have to be an error, because there may not be that particular type of storage on that node, right?

So let's maybe make one more step back and think about producing CSIStorageCapacity object. In L662 you propose a central-provisioner that will be GC-ing orphaned objects. Can we extend it a bit so that:

  • it will be exposing a metric for the number of desired CSIStorageCapacity objects split by type
  • ?

provisioned volumes per second and few total number of volumes overall
may be fine for some workloads (long-running applications which use
specialize storage like PMEM) while others may need a much higher rate
(adding a local LVM scratch volume to every pod in the system).
Member

ping - let's remove this paragraph

Contributor Author

@pohly pohly left a comment

New commit pushed.

provisioned volumes per second and few total number of volumes overall
may be fine for some workloads (long-running applications which use
specialize storage like PMEM) while others may need a much higher rate
(adding a local LVM scratch volume to every pod in the system).
Contributor Author

Done.

The feature is used if all of the following points apply:
- `CSIDriver.spec.storageCapacity` is true for a CSI driver.
- There are storage classes for that driver.
- Volumes are using those storage classes.
Contributor Author

Done. I left out more advanced metrics for scheduling because this needs more overall planning.

when the cluster is not really running out of storage capacity.

Another is a degradation in apiserver metrics (increased CPU or memory
consumption, increased latency).
Contributor Author

Added that name as a link.



achieved by ignoring all objects that don't have the
`csi.storage.k8s.io/managed-by` labels described above.

If such objects are created manually, then it is the responsibility of
Member

We made CSIStorageCapacity objects namespaced to handle the ownerref, right? Does that mean anyone with namespace permissions could interfere with the scheduling of other pods? We should double check with @kubernetes/sig-auth-pr-reviews on this security model.

Contributor Author

This is a valid concern, but we've discussed this already. Permission to create CSIStorageCapacity objects is not normally granted to users in a namespace, so they cannot interfere with pod scheduling - neither of their own nor of pods in other namespaces.

Someone who can create CSIStorageCapacity objects has cluster admin permissions and while they might make mistakes, this is not a privilege escalation or denial of service attack.

Member

Thanks, confirmed with Jordan that objects are opt-in to the default namespace editor policy


For efficiency reasons, external-provisioner instances watch
`CSIStorageCapacity` objects in their namespace and filter already in
the apiserver by the `app.kubernetes.io/instance` and
Member

update these labels

Contributor Author

Done
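For illustration, the server-side filtering described in that section can be set up roughly like this (a Go sketch, not the actual external-provisioner code; the resync interval is arbitrary). Informers created from such a factory only ever receive the CSIStorageCapacity objects that this instance manages.

    package capacity

    import (
        "time"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/informers"
        "k8s.io/client-go/kubernetes"
    )

    // newCapacityInformerFactory builds an informer factory whose informers only
    // see objects in this instance's namespace that carry its own drivername and
    // managed-by labels, so the apiserver never sends it objects belonging to
    // other drivers or instances.
    func newCapacityInformerFactory(client kubernetes.Interface, namespace, driverName, instance string) informers.SharedInformerFactory {
        selector := "csi.storage.k8s.io/drivername=" + driverName +
            ",csi.storage.k8s.io/managed-by=" + instance
        return informers.NewSharedInformerFactoryWithOptions(client, 10*time.Minute,
            informers.WithNamespace(namespace),
            informers.WithTweakListOptions(func(options *metav1.ListOptions) {
                options.LabelSelector = selector
            }),
        )
    }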

following metric data that will be provided by external-provisioner instances:
- total number of `CSIStorageCapacity` objects that the external-provisioner
is currently meant to manage for the driver
- actual number of such objects that currently exist
Member

do we need to determine what count of these objects are stale so that it doesn't look like we are 100% covered?

Contributor Author

With "stale" you mean the topology/storage class pair, not the "freshness" of the capacity value, right?

That's doable. Instead of that one item with "that currently exist" I now have:

  • number of such objects that currently exist and can be kept because they have a topology/storage class pair that is still valid
  • number of such objects that currently exist and need to be deleted because they have an outdated topology/storage class pair
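A minimal sketch of how those counts could be exposed as gauges (the metric names here are made up for illustration, not the ones external-provisioner ended up using):

    package capacitymetrics

    import "github.com/prometheus/client_golang/prometheus"

    var (
        desiredObjects = prometheus.NewGauge(prometheus.GaugeOpts{
            Name: "csistoragecapacities_desired_total",
            Help: "Number of CSIStorageCapacity objects this instance is meant to manage.",
        })
        currentObjects = prometheus.NewGauge(prometheus.GaugeOpts{
            Name: "csistoragecapacities_current_total",
            Help: "Existing objects with a still-valid topology/storage class pair.",
        })
        obsoleteObjects = prometheus.NewGauge(prometheus.GaugeOpts{
            Name: "csistoragecapacities_obsolete_total",
            Help: "Existing objects with an outdated topology/storage class pair, pending deletion.",
        })
    )

    func init() {
        prometheus.MustRegister(desiredObjects, currentObjects, obsoleteObjects)
    }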

* the same for adding or removing storage classes (both modes)
* updates when volumes are created/resized/deleted (thus bounded by
some other API calls)
* updates when capacity in the underlying storage system is changed
Member

sorry I need new glasses :)

@k8s-ci-robot k8s-ci-robot added the sig/auth Categorizes an issue or PR as relevant to SIG Auth. label Feb 2, 2021
This marks items as completed which are (or soon will be) done for
beta. PRR review will happen as part of merging this.
@pohly pohly force-pushed the storage-capacity-constraints-for-pod-scheduling branch from 09e8c12 to f70ed9d on February 3, 2021 09:51
@pohly
Contributor Author

pohly commented Feb 3, 2021

I've addressed @msau42's latest comments and squashed everything into a single commit.

be treated with exponential backoff just like other communication
issues with the API server.
The new flag in `CSIDriver` will always be removed when disabling the
feature gate or when rolling back to a release older than 1.19. In that
Member

re feature-gate disablement:
If the field is set and the FG was disabled, this shouldn't automatically delete the field - we should only drop the field if someone is trying to set it, but not if that already exists (unless someone explicitly removes it).

It's basically this pattern of "if FG disabled and field not in use", e.g.:
https://github.com/kubernetes/kubernetes/blob/074a5177209c3605daf11cfe6f477fddce0d55c0/pkg/api/pod/util.go#L450

Contributor Author

Again, this is not how any of the other fields in storage objects work. That's what was agreed upon with the API reviewers. I am not sure why it is being done, but I am reluctant to change it without a wider discussion.

Can we merge the KEP as-is and I bring it up when doing the API change?

Member

One of the reasons for having PRR is also to achieve predictability and consistency across the system, especially for Beta and GA features.
If we're doing something differently than anywhere else, I would like to understand the clear reasons for that and have them documented. If you write "I am not sure why it is being done", I would really like to hear the API approvers' voice now.

@liggitt - can you please chime in?

Member

Allowing existing data to be preserved on update is required. This ensures that an n-1 server where the feature gate is not yet enabled by default will not cause data loss in a skewed multi-server cluster during upgrade. Please update that detail here to avoid confusion before merge.

Member

this is not how any of the other fields in storage objects work

Do you have pointers to those? If we are dropping field values on update without regard for existing persisted data, we'll need to open issues to resolve that.

For reference, the ephemeral volume field is following the expected pattern correctly:

https://github.com/kubernetes/kubernetes/blob/074a5177209c3605daf11cfe6f477fddce0d55c0/pkg/api/pod/util.go#L568-L576

Member

I disagree - the scheduler should revert by just switching off the feature gate.

And this creates a significant issue:

  • I create an object with a field set
  • FG gets disabled (and that silently drops the field)
  • FG gets reenabled, and the object I created with the field set no longer has it set.

I don't know why this wasn't caught in the review, but that sounds like an omission. I don't think it justifies leaving things as they are for longer. We should fix that (and do the same thing we're doing everywhere else).

Contributor Author

Do you have pointers to those? If we are dropping field values on update without regard for existing persisted data, we'll need to open issues to resolve that.

This is the commit which added the additional field: kubernetes/kubernetes@1089954#diff-1918b63c53bdbabb45108a29e45667799fd5674bc5e04b1dd01c24131a79308f

I was doing the same as for VolumeLifecycleModes. If I read the code in pkg/registry/storage/csidriver/strategy.go right, updating a CSIDriver object while the feature is disabled preserves the field. I think that addresses the concern about data loss.

But should that field be sent to clients when they ask for a v1 CSIDriver object and the apiserver has the feature disabled? My understanding is that @wojtek-t wants it to be sent whereas the current implementation of that and other fields doesn't do it.

Contributor Author

... or at least I think it doesn't. Let me double-check that.

Contributor Author

I was wrong, sorry for the confusion. "kubectl get" doesn't show the field after disabling the feature in the apiserver, but it's still there in the actual object. "kubectl get -o yaml" shows it.

That means I need to update the KEP to reflect the implementation, but the implementation itself should be okay.

Contributor Author

I've pushed a commit with the KEP change. Let's keep it as a separate commit to document for eternity how confused I was about this... 😅

The implementation in the apiserver already preserves the field and
delivers it to clients, which is the exact opposite of how it was
described in the KEP.
- "@wojtek-t"
see-also:
- "https://docs.google.com/document/d/1WtX2lRJjZ03RBdzQIZY3IOvmoYiF5JxDX35-SsCIAfg"
latest-milestone: "v1.19"
Contributor

Hi there, 1.21 enhancements lead here.
Could you also make a change to the latest milestone?

Suggested change:
- latest-milestone: "v1.19"
+ latest-milestone: "v1.21"

Contributor Author

Done.

@msau42
Member

msau42 commented Feb 3, 2021

/lgtm
/approve

Thanks for working through everything and making this very comprehensive!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2021
Member

@wojtek-t wojtek-t left a comment

Two last very minor comments - LGTM once fixed.

be treated with exponential backoff just like other communication
issues with the API server.
The new flag in `CSIDriver` will be preserved when disabling the
feature gate or when rolling back to a release older than 1.19. kube-scheduler
Member

on rollback it won't be preserved, because the CSIDriver type will not have that field in the older release.

So it's only preserved on feature gate disablement.

Contributor Author

Fixed.
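
As a rough sketch of what this means on the consumer side: switching the feature gate off makes a preserved `StorageCapacity` field inert. The function below is only an illustration under that assumption; the actual check in the kube-scheduler volumebinding plugin is structured differently.

```go
package volumebinding

import (
	storagev1 "k8s.io/api/storage/v1"
	utilfeature "k8s.io/apiserver/pkg/util/feature"
	"k8s.io/kubernetes/pkg/features"
)

// capacityCheckEnabled reports whether storage capacity should be taken
// into account for volumes of this driver. With the feature gate disabled,
// a StorageCapacity value that is still stored in the CSIDriver object is
// simply ignored, so reverting the feature is just a gate flip.
func capacityCheckEnabled(driver *storagev1.CSIDriver) bool {
	if !utilfeature.DefaultFeatureGate.Enabled(features.CSIStorageCapacity) {
		return false
	}
	return driver.Spec.StorageCapacity != nil && *driver.Spec.StorageCapacity
}
```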

they have a topology/storage class pair that is still valid
- number of such objects that currently exist and need to be deleted
because they have an outdated topology/storage class pair
- work queue length for creating, updating or deleting objects
Member

Please add a TODO to fill in the explicit metric names here once they get implemented.
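
To make that TODO concrete, here is a hedged sketch of what the registration in the external-provisioner could look like. All metric names below are invented placeholders, not the names that were eventually implemented.

```go
package metrics

import "github.com/prometheus/client_golang/prometheus"

// Hypothetical metric names; the KEP deliberately leaves the final names
// to the implementation, so these are only placeholders.
var (
	desiredObjects = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "csistoragecapacities_desired_current",
		Help: "Number of CSIStorageCapacity objects the external-provisioner is meant to manage for the driver.",
	})
	currentObjects = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "csistoragecapacities_managed_current",
		Help: "Number of existing CSIStorageCapacity objects with a still-valid topology/storage class pair.",
	})
	obsoleteObjects = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "csistoragecapacities_obsolete_current",
		Help: "Number of existing CSIStorageCapacity objects that need to be deleted because their topology/storage class pair is outdated.",
	})
	workQueueDepth = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "csistoragecapacities_workqueue_depth",
		Help: "Work queue length for creating, updating or deleting CSIStorageCapacity objects.",
	})
)

func init() {
	prometheus.MustRegister(desiredObjects, currentObjects, obsoleteObjects, workQueueDepth)
}
```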

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 4, 2021
@wojtek-t
Member

wojtek-t commented Feb 4, 2021

/lgtm
/approve

Thanks!

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 4, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: msau42, pohly, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 4, 2021
@k8s-ci-robot k8s-ci-robot merged commit 0a495c8 into kubernetes:master Feb 4, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.21 milestone Feb 4, 2021
@pohly pohly mentioned this pull request Feb 9, 2021
4 tasks