Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add per version field section to custom resource webhook conversion KEP #2737

Merged
merged 1 commit into from
Oct 24, 2018

Conversation

mbohlool
Copy link
Contributor

@mbohlool mbohlool commented Oct 2, 2018

The defaulting section of customresource webhook conversion KEP was backward incompatible. This PR suggest a new approach. Different approaches as well as the problem description is detailed in this document.

@liggitt @lavalamp @deads2k @sttts @erictune @roycaihw

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Oct 2, 2018
@@ -128,7 +128,25 @@ type CustomResourceConversionWebhook {
### Defaulting

In case that there is no versions list, a single version with values defaulted to top level version will be created. That means a single version with a name set to spec.version.
All newly added per version fields (schema, additionalPrinterColumns or subresources) will be defaulted to the coresponding top level field except for the first version in the list that will remain empty.
None of the added per version fields (schema, additionalPrinterColumns or subresources) will be defaulted to the coresponding top level field except because that is a backward incompatible change.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except

* *Validation*: Either top level X or per version X can be set not both.
* *Mutation*: On update, if the top level field is set, clear all per version X.

With this suggestion, there can be a single list in CRD-V1 (apiextensions.k8s.io/v1). If top level fields are set, they will be copied to all items in the list otherwise we use `spec.versions` list.
Copy link
Member

@roycaihw roycaihw Oct 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How are we going to distinguish round-tripping v1beta1->v1->v1beta1 between

  • v1beta1 CRD that sets top level X
  • v1beta1 CRD that sets same value to all per version X

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a section in the linked doc: https://docs.google.com/document/d/1A6-q7l0M9L1HIqTzbB2Y-9YQ-ggL6HGO7CJSGCxJZF8/edit#bookmark=id.vm10z07du3o7

Let's discuss it there first and I will move it to this PR after.

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 567ac61 to 794d7b9 Compare October 8, 2018 19:58
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 8, 2018
@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 794d7b9 to cafe4c1 Compare October 8, 2018 20:01
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 8, 2018

### Support Level

The feature will be alpha in the first implementation and will have a feature gate that is defaulted to false. The roll-back story with a feature gate is much more clear. if we have the features as alpha in kubernetes release Y (>X where the feature is missing) and we make it beta in kubernetes release Z, it is not safe to use the feature and downgrade from Y to X but the feature is alpha in Y which is fine. It is safe to downgrade from Z to Y (given that we enable the feature gate in Y) and that is desirable as the feature is beta in Z.

On downgrading from a Z to Y, stored CRDs can have per version fields set. While the feature gate can be off on Y (alpha cluster), it is dangerous to disable per-version Schema Validation or Status subresources as it makes the status field mutable and validation on CRs will be disabled. Thus the feature gate in Y only protects adding per version fields not the actual behaviour. This way even if we downgrade from Z to Y, per version validations and subresources will be honored. This will not be the case for webhook conversion itself. The feature gate will also protect the implementation of webhook conversion and alpha cluster with disabled feature gate will return error for CRDs with webhook conversion (that are created with a future version of the cluster).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add how the feature gate is checked to this section or the section above?

I'd suggest that we document kubernetes/kubernetes#67521 (comment) somewhere as a recommended pattern for how to use feature gates, but I'm not sure if it applies to all features (probably not, e.g. for webhook conversion itself).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this cover what you are asking? ... Thus the feature gate in Y only protects adding per version fields not the actual behaviour. ....

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was concerned if it's clear that we allows updating per version X when gate is off, if the CRD object already uses the per version X; also different from the other validation rules, I suppose this feature gate validation rule applies to the whole set instead of individual X’s?

E.g. when gate is off, for a CRD that already uses per version Schema, users can update the CRD to use per version Subresources from using top-level Subresources previously.

Maybe it's just me :) I'm willing to LGTM after the typo reported by CI is fixed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more detail, does it look good now?


Hence the solution is the two sets of fields cannot be set at the same time. These validation rules need to be applied:

* Either top level X or per version X can be set not both. This rule applies to individual X’s not the whole set. E.g. top level Schema can be set while per version subresources are set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E.g. top level Schema can be set while per version subresources are set.

+1

@justaugustus
Copy link
Member

/kind kep

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from cafe4c1 to 7ccdea7 Compare October 15, 2018 23:18
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 15, 2018
@mbohlool
Copy link
Contributor Author

@deads2k @liggitt can you take another look please.

@mbohlool
Copy link
Contributor Author

@roycaihw comment addressed. PTAL.

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 7ccdea7 to 44ca73f Compare October 15, 2018 23:23
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 15, 2018

### Support Level

The feature will be alpha in the first implementation and will have a feature gate that is defaulted to false. The roll-back story with a feature gate is much more clear. if we have the features as alpha in kubernetes release Y (>X where the feature is missing) and we make it beta in kubernetes release Z, it is not safe to use the feature and downgrade from Y to X but the feature is alpha in Y which is fine. It is safe to downgrade from Z to Y (given that we enable the feature gate in Y) and that is desirable as the feature is beta in Z.
On downgrading from a Z to Y, stored CRDs can have per version fields set. While the feature gate can be off on Y (alpha cluster), it is dangerous to disable per-version Schema Validation or Status subresources as it makes the status field mutable and validation on CRs will be disabled. Thus the feature gate in Y only protects adding per version fields not the actual behaviour. Thus if the fearure gate is off in Y:

* None of the per-version X cannot be set on create.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of or s/cannot/can/


* None of the per-version X cannot be set on create.
* None of the per-version X cannot be changed on update unless the update command remove them all.
* Any update that does not change per-version X (if they exists) are allowed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding (of kubernetes/kubernetes#67521 (comment); please correct me if we are pursuing a different strategy) is that we allow updates that do change per-version X's if per-version X's exist.

Any update that does not change per-version X (if they exists) sounds like "patching top-level X (when per-version X exists)" to me, which would fail the mutual exclusivity validation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these three rules can be simplified to this when the feature gate is disabled:

  • Per-version X cannot be set on CRD create (per-version fields are auto-cleared).
  • Per-version X can only be set/changed on CRD update unless the existing CRD object already has per-version X set.

"cannot be changed on update unless the update command remove them all" seems fiddly... can you describe a scenario that protects against? if it were my CRD, I'd rather be able to fix up one per-version field if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think when the feature gate is disabled, there should be something wrong with the feature and we should not less user guess what they can do to make it work. Any change to the feature may make it worse. But I don't feel strong about it and can go with your way too (except the auto-clear part).

on the auto-clear part, while we are doing that for other feature gates, I object to doing it here. End user of the cluster does not have access to feature gate and if admin disable a feature gate, Operators that installed before that act differently than operator installed after that and both installation goes successfully. This will just create confusion for people.

I agree to continue with auto-clear for now as we do it elsewhere.

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 44ca73f to cc77ce7 Compare October 17, 2018 18:44
@roycaihw
Copy link
Member

Talked with @mbohlool offline, the purposed strategy for how to use feature gate is more strict compared with kubernetes/kubernetes#67521 (comment).

kubernetes/kubernetes#67521 (comment) will be discussed in sig-arch meeting. If we want to go with that strategy in future, it's safe (backward compatible) to relax the validation.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 17, 2018
@roycaihw
Copy link
Member

nit: please update the comments for top-level and per-version X's in API changes section.

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch 2 times, most recently from 072001f to da16453 Compare October 22, 2018 21:17
@mbohlool
Copy link
Contributor Author

@roycaihw @liggitt addressed your comments, PTAL.

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from da16453 to 924ffb1 Compare October 22, 2018 21:20
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 22, 2018
@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 924ffb1 to 0f9b999 Compare October 22, 2018 21:52
@roycaihw
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 22, 2018
Copy link
Member

@liggitt liggitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few nits/clarifications. one substantive comment around behavior when the feature gate is disabled. lgtm otherwise.


* Either top level X or per version X can be set not both. This rule applies to individual X’s not the whole set. E.g. top level Schema can be set while per version subresources are set.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits:
"...can be set, but not both."
lowercase "Schema"
s/per version/per-version/g


#### Alternative approaches considered

First a defaulting approach is considered which per version fields would be defaulted to top level fields. but that will breaks backward incompatible change; Quoting from API [guidelines](https://github.com/kubernetes/community/blob/master/contributors/devel/api_changes.md#backward-compatibility-gotchas):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/will breaks/breaks/


There are other solution considered that does not involved defaulting:

* Field Discriminator: Use `Spec.Conversion.Strategy` as discriminator to decide which set of fields to use. This approach would work but the proposed solution is keeping the mutual excusivity in a broader sense and is preferred.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach would work but the proposed solution is keeping the mutual excusivity in a broader sense and is preferred.

I think the main benefit of the mutual exclusivity over the strategy-based discriminator is that it makes the API less surprising (accepting schema/subresource data and then not making use of it because the conversion strategy was set in a particular way is surprising).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well the point was validating here too. we won't accept anything that we don't use. we validate based on strategy field and use what we validated. e.g. if it is none we only let top level X to be set and we use top level X. if it is not None, then we only allow per version fields to be set and we use those.


### Support Level

The feature will be alpha in the first implementation and will have a feature gate that is defaulted to false. The roll-back story with a feature gate is much more clear. if we have the features as alpha in kubernetes release Y (>X where the feature is missing) and we make it beta in kubernetes release Z, it is not safe to use the feature and downgrade from Y to X but the feature is alpha in Y which is fine. It is safe to downgrade from Z to Y (given that we enable the feature gate in Y) and that is desirable as the feature is beta in Z.
On downgrading from a Z to Y, stored CRDs can have per version fields set. While the feature gate can be off on Y (alpha cluster), it is dangerous to disable per-version Schema Validation or Status subresources as it makes the status field mutable and validation on CRs will be disabled. Thus the feature gate in Y only protects adding per version fields not the actual behaviour. Thus if the fearure gate is off in Y:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/fearure/feature/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

feature is a valid word :) cool.


* None of the per-version X cannot be set on create.
* None of the per-version X cannot be changed on update unless the update command remove them all.
* Any update that does not change per-version X (if they exists) are allowed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these three rules can be simplified to this when the feature gate is disabled:

  • Per-version X cannot be set on CRD create (per-version fields are auto-cleared).
  • Per-version X can only be set/changed on CRD update unless the existing CRD object already has per-version X set.

"cannot be changed on update unless the update command remove them all" seems fiddly... can you describe a scenario that protects against? if it were my CRD, I'd rather be able to fix up one per-version field if needed.

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 0f9b999 to 570a4fa Compare October 23, 2018 20:46
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 23, 2018
@mbohlool
Copy link
Contributor Author

@liggitt PTAL

@mbohlool mbohlool force-pushed the crd_conversion_defaulting branch from 570a4fa to 25ae80b Compare October 23, 2018 21:33
@liggitt
Copy link
Member

liggitt commented Oct 23, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 23, 2018
@liggitt
Copy link
Member

liggitt commented Oct 23, 2018

/approve

1 similar comment
@lavalamp
Copy link
Member

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 24, 2018
@k8s-ci-robot k8s-ci-robot merged commit 04ff155 into kubernetes:master Oct 24, 2018
@erictune
Copy link
Member

🎉 thank you everyone who contributed to working out the design around defaulting!

MadhavJivrajani pushed a commit to MadhavJivrajani/community that referenced this pull request Nov 30, 2021
…lting

Add per version field section to custom resource webhook conversion KEP
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants