diff --git a/keps/sig-api-machinery/20180731-crd-pruning-decoding.png b/keps/sig-api-machinery/20180731-crd-pruning-decoding.png new file mode 100644 index 000000000000..39788f021467 Binary files /dev/null and b/keps/sig-api-machinery/20180731-crd-pruning-decoding.png differ diff --git a/keps/sig-api-machinery/20180731-crd-pruning.md b/keps/sig-api-machinery/20180731-crd-pruning.md index 96f9e8d3dc2e..5719591fb6a5 100644 --- a/keps/sig-api-machinery/20180731-crd-pruning.md +++ b/keps/sig-api-machinery/20180731-crd-pruning.md @@ -1,5 +1,4 @@ --- -kep-number: 24 title: Pruning for Custom Resources status: provisional authors: @@ -7,7 +6,6 @@ authors: owning-sig: sig-api-machinery participating-sigs: - sig-api-machinery - - sig-architecture reviewers: - "@deads2k" - "@lavalamp" @@ -21,367 +19,356 @@ approvers: editor: name: "@sttts" creation-date: 2018-07-31 -last-updated: 2018-07-31 +last-updated: 2019-04-26 +status: implementable +see-also: + - "https://github.com/kubernetes/enhancements/pull/1002" --- # Pruning for Custom Resources - + ## Table of Contents -* [Pruning for Custom Resources](#pruning-for-custom-resources) - * [Table of Contents](#table-of-contents) - * [Overview](#overview) - * [Goals](#goals) - * [Non-Goals](#non-goals) - * [Motivation](#motivation) - * [Pruning](#pruning) - * [Mixing of Schema and Value Validation](#mixing-schema-and-value-validation) - * [Formal Proposal – following pruning option 1]() - * [Types and Formats](#types-and-formats) - * [Polymorphic Fields](#polymorphic-fields) - * [Excluding values from Pruning](#excluding-values-from-pruning) - * [References](#references) - * [Alternatives Considered](#alternatives-considered) -## Overview + * [Pruning for Custom Resources](#pruning-for-custom-resources) + * [Table of Contents](#table-of-contents) + * [Summary](#summary) + * [Motivation](#motivation) + * [Goals](#goals) + * [Non-Goals](#non-goals) + * [Proposal](#proposal) + * [Excluding values from 
Pruning](#excluding-values-from-pruning) + * [Examples](#examples) + * [Opt-in and Opt-out of Pruning on CRD Level](#opt-in-and-opt-out-of-pruning-on-crd-level) + * [References](#references) + * [Alternatives Considered](#alternatives-considered) + * [Test Plan](#test-plan) + * [Graduation Criteria](#graduation-criteria) + * [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) + * [Version Skew Strategy](#version-skew-strategy) + * [Implementation History](#implementation-history) + +## Summary + +CustomResources store arbitrary JSON data without following the typical Kubernetes API behaviour of pruning unknown fields. This makes CRDs different, but also leads to security and general data consistency concerns because it is unclear what is actually stored in etcd. + +This KEP proposes to add pruning of all fields which are not specified in the OpenAPI validation schemas given in the CRD. It requires _structural schemas_ (as described in [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002)) for all defined versions. Validation will reject the CRD otherwise. + +Pruning will be opt-in in `v1beta1` of `apiextensions.k8s.io` via + +```yaml +apiVersion: apiextensions.k8s.io/v1beta1 +kind: CustomResourceDefinition +spec: + pruneUnknownFields: true + ... +``` + +i.e. CRDs created in `v1beta1` default to disabled pruning. Pruning will be enabled by default for CRDs created in `v1`. -Native Golang based resources do not persist JSON fields which are not part of the Golang structs which are backing them in the API server memory. This artifact of the use of typed Golang structs inside of the REST implementation has turned into an API convention. Major parts of the Kubernetes depend on this to ensure consistency of data persisted to etcd and returned from the REST API. +Pruning can be locally disabled for subtrees of CustomResources by setting the `x-kubernetes-preserve-unknown-fields: true` vendor extension. 
This allows storing arbitrary JSON or RawExtensions. -Without consistency of data in etcd, objects can suddenly render unaccessible on version upgrade because unexpected data may break decoding (e.g. in generated, typed clients which are not based on forgiving `unstructured.Unstructured`). Even if persisted data has correct format and decodes correctly, having it not gone through validation and admission when it was stored can break cluster-wide invariants. For example assume the `privileged: true/false` field is added to a type in Kubernetes version X+1. In version X, there is no security check around this. So every user could set that flag if we didn’t drop unknown fields. When the field is added in X+1, that user suddenly has escalated access (note: on read we do not run admission). This is a serious security risk. +## Motivation -CustomResources are persisted as JSON blobs today (with the exception of `ObjectMeta` which is pruned since Kubernetes 1.11), i.e. we do not drop unknown fields, with the described consequences. This proposal is about adding a decoding step named "pruning" into the decoder from JSON to `unstructured.Unstructured` inside the apiextensions-apiserver. This pruning step will drop fields not specified in the OpenAPI validation spec, leading to the same persistence semantics as for native types. +* Native Golang based resources do pruning as a consequence of the JSON unmarshalling algorithm. This has become a fundamental behaviour of Kubernetes API semantics that CustomResources break. +* Pruning enforces consistency of data stored in etcd. Objects cannot suddenly become inaccessible because unexpected data breaks decoding. +* Even if unexpected data is of the right type and does not break decoding, it has not gone through admission or validation. Pruning ensures that only fields which have passed validation and admission are persisted. +* Pruning is a counter-measure to security attacks which make use of knowledge of future versions of APIs with new security relevant fields. 
Without pruning an attacker can prepare CustomResources with privileged fields set. On version upgrade of the cluster, these fields can suddenly take effect and lead to disallowed behaviour. -## Goals +### Goals * Prune unknown fields from CustomResources silently. Unknown means not specified in the OpenAPI validation spec. -* Allow to opt-out of pruning via the OpenAPI validation spec for a whole subtree of the JSON objects. -* Have simple semantics for pruning. -* Be extensible to defaulting at a later point. +* Allow opting out of pruning via the OpenAPI validation spec for a whole subtree of JSON objects. -## Non-Goals +### Non-Goals * Add a strict mode to the REST API which rejects objects with unknown fields. -* Propose or decide anything about defaulting. -## Motivation -### Pruning -Pruning of a JSON value means to remove "unknown fields". A field (given as a JSON path) is unknown if it is not specified in the OpenAPI validation schema. - -A JSON path `.x` is specified in an OpenAPI validation schema if `"x"` is a key in a corresponding `properties` field in the schema. - -#### Example 1 - ->Assume the OpenAPI schema -> ->```json ->{"properties": {"a": {}, "b": {"type": "string"}, "c": {"not": {}}}} ->``` -> ->Then a JSON object `{"a":1, "c": 3, "d": 4}` is pruned to `{"a":1, "c":3}`. -> -> Note that the pruned object does not validate. +## Proposal -#### Example 2 +We assume the CRD has _structural schemas_ (as defined in [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002)). -> Assume the OpenAPI schema -> ->```json ->{"anyOf": [{"properties": {"a": {}}}, {"properties": {"b": {}}}]} ->``` -> ->Then a JSON object `{"a":1, "b": 2, "c": 3}` is pruned to `{"a":1, "b":2}` with the given semantics. Note that also `{"a":1}`, `{"b":2}` would be natural pruning results if we defined "specified" in a different way. 
- -#### Example 3 - -> Assume the OpenAPI schema -> ->```json ->{ -> "properties": { -> "a": {}, -> "b": {"type": "string", "properties": {"x": {}}} -> }, -> "anyOf": [ -> { "properties": {"c":{}} }, -> { "properties": {"d":{"not":{}}}} -> ] ->} ->``` -> -> Then `a`, `b`, `b.x`, `c`, `d` are specified, `b.y` and `e` are not specified. - -**Question:** is it natural semantics to assume `d` to be specified and not to prune it? Note that there is no object with `d` that validates against `"d":{not:{}}`. But due to the `anyOf` there are objects which will validate against the complete schema. - -#### Pruning Options - -Motivated by the examples, we have different options to define the pruning semantics: -1. Use the described semantics of `specified`. -2. Only consider `properties` fields in the schema which actually successfully validate a given object (then `d` would be pruned in example 3). -3. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning. -4. Only consider `properties` fields outside of `anyOf`, `allOf`, `oneOf`, `not` during pruning, but enforce that every `properties` key inside `anyOf`, `allOf`, `oneOf`, `not` also appears outside all of those. - -From these options: -1. leads to fields being kept from pruning although they only appear in branches of the OpenAPI validation schema which do not contribute to validation of an object. -2. is ambiguous if you have multiple branches of `anyOf` validating. Should we drop the fields of the first or the second branch? -3. leads to surprising pruning of fields that the user forgot to specify outside of propositional logic of `anyOf`, `allOf`, `oneOf`, `not`. -4. forces the user to re-specify all those properties outside of `anyOf`, `allOf`, `oneOf`, `not` which appear inside of them. The outcome will match that of 1, with the difference that it makes the skeleton explicit. - -**Here we propose not to follow 2 and 3 due to ambiguity of 2 and the danger of user mistakes of 3. 
Both 1 and 4 lead to the same pruning behaviour and only differ in whether the user has to re-specify property keys or whether they are automatically derived.** - -### Mixing of Schema and Value Validation - -The OpenAPI validation schema mixes the actual structural schema validation (which is usually done by the Golang struct JSON decoding for native types) -and the value validation (usually done in the validation step of the API server handler pipeline for native types). - -For CRDs we cannot distinguish both. This was first noticed by @lavalamp in https://github.com/kubernetes/kubernetes/pull/64907#issuecomment-397015030. - -#### Example 4 - -> Assume -> ->```json ->{ -> "properties": { -> "a": {"type": "string", "pattern": "", "format": "ip"}, -> "b": {"properties": {"x": {}, "y": {}}} -> } ->} ->``` -> -> The type and properties are usually called schema validation, while regex patterns and formats are about values. The latter would be tested in the validation phase, the former would be checked during decoding for native types in the JSON decoder. - -**Remark:** the line between type and format is blurry. Format is optional to be processed by tools, and the value is open ended. In Kube we have some formats (like `date`) which correspond to custom JSON unmarshallers in the native types. That format would be verified during decoding, although technically a date is just a string. So we might want to replicate that behaviour at some point. - -For pruning (and possibly later defaulting) we have to apply the OpenAPI validation schema inside of the decoder step (the left box inside apiextensions-apiserver in the figure above). This is considerably earlier than for native types. This has a number of implications: -* We have to apply full OpenAPI value validation (e.g. regular expressions, propositional evaluation) during decoding. -* Our generic registry applies validation after defaulting. We need the other way around for CRDs. 
-* Our generic registry expects defaulting to always succeed (no error result type). CRD validation needed by OpenAPI defaulting would be able to fail. - -To avoid these and to get sane semantics, **we propose to split the CRD validation into two top-level steps:** - -1. During decoding we validate using a skeleton schema which lacks value validations and propositional logic (`anyOf`, `allOf`, `oneOf`, `not`). -2. During standard generic registry validation phase we validate using the full validation schema. - -Step 1 is extensible to defaulting based on the skeleton validation result. Step 2 would then catch wrong types of the used defaults. - -## Formal Proposal – following pruning option 1 - -The examples in the motivational section show that the semantics of full OpenAPI validation schemata are not trivial in respect to pruning. To simplify the algorithms and to enforce "sane" schemata which allow to split schema and value validation, we propose to derive a skeleton schema from the full user-given OpenAPI validation schema -* which does not contain value validations -* which is complete enough for pruning (and possibly later defaulting). - -**Remark:** we have two options to define and implement pruning: -1. directly on the full OpenAPI validation schema with two custom algorithms doing parallel recursion over the schema and the input object. -2. via an intermediate representation (the skeleton schema) and using go-openapi pruning. Both algorithms are ten-liners based on the go-openapi/validate output. With the skeleton schema the go-openapi pruning algorithm coincides with our pruning option 1. - -Both routes lead to the same algorithm: the intermediate representation of the skeleton schema makes merging of OpenAPI validation schema constraints explicit, while the custom algorithms would hide that in its recursion code. 
- -Moreover, the main reason for the intermediate schema: the custom algorithm would have to replicate a lot of the validation logic of go-openapi, e.g. the semantics of `properties`, `additionalPropoerties`, `patternProperties` and the same for items of an array. With route 2 we get all of this for free. - -### Definition: skeleton schema - -> For a given OpenAPI validation schema `s` the skeleton schema `skel(s)` is derived by -> 1. applying `skel` to all elements of `.allOf`, `.anyOf`, `.oneOf` and `.not` giving `s_1, …, s_n`, -> then dropping all fields from `s` other than -> * `type`, -> * `items`, -> * `additionalItems`, -> * `properties`, -> * `patternProperties`, -> * `additionalProperties` -> giving `drop(s)`, -> 2. then merging `s_i` into it’s containing object. -> -> I.e.: `skel(s) := merge(drop(s), s_1, …, s_n)` with -> ->``` ->merge(x_1, …, x_n) := { -> "type": t if all x_i agree on t as type, undefined otherwise -> "items": [ merge(x_i1, …, x_ik) ] where s_ij = s_j.items[i] if defined, -> "additionaItems": merge(p_1, …, p_i), -> for all x_i with defined additionalItems p_i -> "properties": { k_i: merge(v_i1, …, v_ik) for keys appearing in x_ij with values x_ij }, -> "patternProperties": { k_i: merge(v_i1, …, v_ik) for keys appearing in x_ij with values x_ij }, -> "additionalProperties": merge(p_1, …, p_i), -> for all x_i with defined additionalProperties p_i ->} ->``` - -The skeleton schema especially lacks: -* all value validations like `pattern`, `format`, `minValue`, `maxValue`, ... -* all propositional operators like `anyOf`, `allOf`, `oneOf`, `not`. - -The skeleton schema `skel(s)` puts -* less or equal constraints on `type` than `s`. -* every field specified by a `properties` key in `s` is specified in `skel(s)`. - -The computation of `skel(s)` is `O(size(s))` and `size(skel(s)) <= size(s)`. - -Note that a field might be constrained by the `type` construct in the full OpenAPI validation schema, but not in its skeleton. 
This is fine because we have to support polymorphic fields like `IntOrString`, but avoid `allOf`, `anyOf`, `oneOf`, `not` in the skeleton. - -**Property:** if the OpenAPI validation schema applies to an object, so does its skeleton schema. - -Pruning will be implemented based on the skeleton schema of the specified OpenAPI validation schema. - -Optionally for debugging, the skeleton schema derived from the validation schema by the apiextensions-apiserver can be stored in the CRD status. - -#### Example 5 - ->Assume -> ->``` ->s := { -> "anyOf": [ -> {"properties": {"a": {"type": "string"}}}, -> {"properties": {"a": {"type": "integer"}, "b": {"type": "string"}}} -> ], -> "properties": {"c": {}} ->} ->``` -> ->Then -> ->``` ->skel(s) = { -> "properties": { -> "a": {}, -> "b": {"type": "string"}, -> "c": {} -> } ->} ->``` -> -> The `type` of `a` does not match on all paths. Hence, it is omitted in the skeleton. In contrast, `b` has a unique `type` of `"string"`, so it stays in the skeleton. - -### Types and Formats - -Note, that having less `type` constraints in the skeleton than in the whole OpenAPI validation schema means that admission and conversion will see possibly wrong types. The final validation in the registry validation phase will check for the complete OpenAPI validation schema and catch those type errors. - -**Question:** we could extent the `skel` function to keep a list of `type` values. As long as all branches define the type, we can add `anyOf: [{type: "type1", type: "type2, ….}]` to the skeleton. If one branch does not define the type though, this is still not possible. - -In native types, we have custom unmarshallers for date / timestamps. In OpenAPI these would be rendered as `{type: "string", format: "date"}`. With the proposed skeleton algorithm, we would not verify these fields before full OpenAPI validation in the registry. We could move non-contradicting `format` constraints into the skeleton as well, like we do for `type` already. 
Then admission would be protected from invalid format (if admission uses Golang decoding, this might be relevant to get proper error messages). - -### Polymorphic Fields - -#### Example 6 - -> Assume -> ->``` ->s := { -> "anyOf": [ -> {"properties": {"a": {"type": "string"}}}, -> {"additionalProperties": {"type": "integer"}} -> ], ->} ->``` -> ->Then -> ->``` -> skel(s) = {"properties": {}, "additionalProperties": {}} ->``` -> -> In the CRD validation, we reject OpenAPI validation schemata like `skel(s)` with `properties` and one of `additionalProperties` or `patternProperties` being defined at the same time. Having `additionalProperties` or `patternProperties` defined there means to have a `map[string]T` like field where we don’t want to prune the unknown "keys". -> -> The schema `s` above means to either have a `map[string]int64` or a `struct` with `a` of type string. This is a special case of polymorphism because for the same JSON path is typed with two different types in the Golang sense (struct and `map[string]T`), but same types in the JSON sense (JSON object). - -Hence, we have to add the following restriction to OpenAPI validation schemata: - -**Restriction 1 on Object Polymorphism:** reject CRD OpenAPI validation schemata `s` with `skel(s)` having `properties` and one of `additionalProperties` or `patternProperties` being defined for the same JSON path. +We propose to -### Excluding values from Pruning - -There are cases where parts of an object are verbatim JSON, i.e. without any applied schema and especially without a complete specification which allows to apply pruning. +1. derive the value-validation-less variant of the structural schema (trivial by definition of _structural schema_) and +2. recursively follow the given CustomResource instance and the structural schema, removing fields from the former if they are not specified in the properties of the latter +3. 
return a deserialization error if the CustomResource instance JSON value and the type in the structural schema do not match +4. fields of `metav1.TypeMeta` (`apiVersion` and `kind`) and `metav1.ObjectMeta` at the object root are implicitly specified. -Hence, we need a mechanism to express that in the OpenAPI validation schema (compare https://github.com/kubernetes/kubernetes/pull/64558#issuecomment-403564033). +We do this in the serializer just after the binary payload has been unmarshalled into a `map[string]interface{}`; compare the yellow boxes in the following figure: -**Raw JSON Option 1:** add a format `json`, e.g.: - -```json -{"properties": {"x": {"format": "json"}}} -``` +![Decoding steps which must prune](20180731-crd-pruning-decoding.png) -In this example "x" is excluded from pruning. Note that you can still use any kind of OpenAPI validation schema constructs to restrict `"x"` further. +### Excluding values from Pruning -Note that we lose expressiveness of the existing `format` strings: we either apply the format to `"json"` or any other pre-defined format. I.e. we cannot express that pruning should be disabled, but if it is an integer, it should be an int32. +There are cases where parts of an object are verbatim JSON, i.e. without any applied schema and especially without a complete specification which would allow pruning to be applied. -**Question:** this feels like a reasonable loss of expressivity. Do we accept that? +The vendor extension `x-kubernetes-preserve-unknown-fields: true` (as defined in the [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002)) serves exactly this purpose, with the following semantics: + +1. the whole JSON subtree at the level of `x-kubernetes-preserve-unknown-fields: true` and below is excluded from pruning +2. 
if `x-kubernetes-embedded-resource: true` is in the subtree of (1), pruning starts again in the `metadata` property below the level of `x-kubernetes-embedded-resource: true`. + + +If `x-kubernetes-preserve-unknown-fields` is not specified, the parent's pruning behaviour is followed (with the exception of rule (2)). + +### Examples + +1. arbitrary JSON + + ```yaml + type: object + properties: + json: + type: object + x-kubernetes-preserve-unknown-fields: true + nullable: true + ``` + + Inside of `.json` nothing is pruned, i.e. + + ```json + { + "foo": 42, + "json": {"bar": 43} + } + ``` + + is pruned to + + ```json + { + "json": {"bar": 43} + } + ``` + +2. partially restricted JSON + + ```yaml + type: object + properties: + json: + type: object + x-kubernetes-preserve-unknown-fields: true + nullable: true + properties: + bar: + type: integer + ``` + + Inside of `.json` nothing is pruned, i.e. + + ```json + { + "foo": 42, + "json": { + "bar": 43, + "abc": 44 + } + } + ``` + + is pruned to + + ```json + { + "json": {"bar": 43, "abc": 44} + } + ``` + +3. embedded resource + + ```yaml + type: object + properties: + object: + type: object + nullable: true + x-kubernetes-embedded-resource: true + x-kubernetes-preserve-unknown-fields: true + ``` + + Here, inside of `.object` nothing is pruned with the exception of unknown fields under `.object.metadata`, i.e. + + ```json + { + "foo": 42, + "object": { + "bar": 43, + "abc": 44, + "metadata": { + "name": "example", + "garbage": 45 + } + } + } + ``` + + is pruned to + + ```json + { + "object": { + "bar": 43, + "abc": 44, + "metadata": { + "name": "example" + } + } + } + ``` + +4. implicit `metav1.TypeMeta` and `metav1.ObjectMeta` + + ```yaml + type: object + ``` + + Pruning takes place, but `apiVersion`, `kind`, `metadata` and known fields under `metadata` are preserved, i.e. 
+ + ```json + { + "apiVersion": "example/v1", + "kind": "Foo", + "metadata": { + "name": "example", + "garbage": 43 + }, + "foo": 42 + } + ``` + + is pruned to + + ```json + { + "apiVersion": "example/v1", + "kind": "Foo", + "metadata": { + "name": "example" + } + } + ``` + + + +### Opt-in and Opt-out of Pruning on CRD Level + +We will add a `PruneUnknownFields` flag to `CustomResourceDefinitionSpec` of `apiextensions.k8s.io/v1beta1`: + +```go +type CustomResourceDefinitionSpec struct { + ... + + // pruneUnknownFields enables pruning of object fields which are not + // specified in the OpenAPI schema. apiVersion, kind, metadata and known + // fields inside metadata are excluded from pruning. + // Defaults to false. + // Setting this field to true is considered an alpha API. + // Note: this will default to true in version v1. + PruneUnknownFields bool +} +``` -**Raw JSON Option 2:** add an extension property, e.g. +I.e. for `apiextensions.k8s.io/v1beta1` this will default to `false` for backwards compatibility. -```json -{"properties": {"x": {"x-kubernetes-no-pruning": true}}} -``` +For `apiextensions.k8s.io/v1` we will change the default to `true` and forbid `false` during creation and updates if it has been `true` before. In `v1` the only way to opt out of pruning is via setting `x-kubernetes-preserve-unknown-fields: true` in the schema. An empty or unspecified schema -This would not lead to a loss in expressivity. It is formulated negatively intentionally to have `false` as the default with `omitempty`. 
+## References -**We propose to follow the second option with `x-kubernetes-no-prune`, because it is more explicit, does not reduce expressivity and does not mangle with the already very vague `format` field definition.** +* Old pruning implementation PR https://github.com/kubernetes/kubernetes/pull/64558, to be adapted +* [OpenAPI v3 specification](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md) +* [JSON Schema](http://json-schema.org/) -### Nested x-no-pruning +## Alternatives Considered -**Question:** should we support nested `x-kubernetes-no-prune`, i.e. disabling pruning for a sub-object, but re-enable it something deep inside of it? E.g. +* in [GDoc which preceded this KEP](https://docs.google.com/document/d/1rBn6SZM7NsWxzBN41J2kO2Odf07PeGPygatM_1RwofY/edit#heading=h.4qdisqud6z3t) we considered a number of alternatives, including using a skeleton schema approach. We decided against that because of its complex semantics. In contrast, the _structural schema_ of the [KEP Vanilla OpenAPI Subset: Structural Schema](https://github.com/kubernetes/enhancements/pull/1002) is the natural output of schema generators deriving a schema from Golang structs. This matches the behaviour of pruning through JSON unmarshalling, independently of any value validation the developer adds on top. +* we could allow nested `x-kubernetes-preserve-unknown-fields`, i.e. to switch on pruning again for a subtree. This might encourage non-Kubernetes-like API types. It is unclear whether there are use-cases we want to support which need this. We can add this in the future. +* we could allow per-version opt-in/out of pruning via `PruneUnknownFields` in `CustomResourceDefinitionVersion`. For the goal of data consistency and security a CRD with semi-enabled pruning does not make much sense. The main reason to not enable pruning will probably be the lack of a complete structural schema. 
If this is added for one version, it should be possible for all other versions as well, as this is less a technical question than a CRD development life-cycle question. -```json -{ - "properties": { - "x": { - "x-no-pruning": true, - "properties": { - "y": { - "x-kubernetes-no-pruning": false, - "properties": { "z": {} } - } - } - } - } -} -``` +### Test Plan -If we do, the object +**blockers for alpha:** -```json -{ - "a": 1, - "x": { - "b": 2, - "y": { - "c": 3, - "z": 42 - } - } -} -``` +We default `PruneUnknownFields` to false and hence switch off the whole code path doing pruning. This reduces risk for everybody not using this alpha feature. -would be pruned to `{"x":{"b": 2, "y":{"z": 42}}}`. +* we add apiextensions-apiserver integration tests to + * verify that the pruning feature is actually off if `PruneUnknownFields` is false. + * verify that `PruneUnknownFields` is defaulted to false. -**We propose to disallow nesting of `x-kubernetes-no-prune` and to disallow setting it to false, i.e. `x-kubernetes-no-prune: false`.** We can add nesting later if necessary. +**blockers for beta:** -## Opt-in and Opt-out of Pruning on CRD Level +* we add unit tests for the general pruning algorithm +* we verify that `x-kubernetes-embedded-resource` and `x-kubernetes-preserve-unknown-fields` work as expected. +* we add apiextensions-apiserver integration tests to + * verify that pruning happens if `PruneUnknownFields` is true, for all versions in the CRD according to the schema of the respective version. + * verify that pruning happens on incoming request payloads, on read from storage and after calling mutating admission webhooks. + * verify that `metadata`, `apiVersion`, `kind` are preserved if `PruneUnknownFields` is true and there is no schema given in the CRD. -We will add a pruning flag to `CustomResourceDefinitionSpec` of `apiextensions.k8s.io/v1beta1`: +**blockers for GA:** -```golang -type CustomResourceDefinitionSpec struct { - ... 
- - // Prune enables pruning of unspecified fields. Defaults to false. - // Note: this will default to true in version v1. - Prune *bool -} -``` +* we verified that the performance of pruning is adequate and does not considerably reduce throughput. -I.e. for `apiextensions.k8s.io/v1beta1` this will default to `false`. +### Graduation Criteria -For `apiextensions.k8s.io/v1` we will change the default to `true` and forbid `false` during creation and updates. In `v1` the only way to opt-out from pruning is via setting `x-kubernetes-no-prune: true` in the schema. +* the test plan is fully implemented for the respective quality level -When [CRD conversion](https://github.com/mbohlool/community/blob/master/contributors/design-proposals/api-machinery/customresource-conversion-webhook.md) is implemented before this KEP, we will add the pruning field to `type CustomResourceDefinitionVersion`, in analogy to subresources and `additionalPrinterColumns`. +### Upgrade / Downgrade Strategy -## References +* setting `PruneUnknownFields` to true is considered alpha quality in 1.15. +* downgrading to 1.14 will lose `pruneUnknownFields: true`, but that's acceptable. +* downgrading from 1.16 (where pruning might be beta) to 1.15 will keep the same behaviour as we don't feature gate `pruneUnknownFields: true`. +* upgrading from 1.14 will default to `pruneUnknownFields: false` and hence change no behaviour. +* upgrading from 1.15 will keep the value and hence change no behaviour. +* when `v1` of `apiextensions.k8s.io` is added, we will keep the old pruning behaviour for CRDs created in `v1beta1` with `pruneUnknownFields: false`, but enforce `pruneUnknownFields: true` for every newly created `v1` CRD. Hence, we keep backwards compatibility. 
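The defaulting and validation rules for `pruneUnknownFields` across `v1beta1` and `v1` can be sketched roughly as follows. This is an illustrative sketch only, not the apiextensions-apiserver implementation: the function names `effectivePrune` and `validatePrune` are hypothetical, the use of a pointer to model "field not specified" is an assumption (the KEP declares a plain `bool`), and the reading that an existing `false` value may be kept on update in `v1` is one interpretation of the rules stated above.

```go
package main

import (
	"errors"
	"fmt"
)

// effectivePrune returns the pruning setting after defaulting.
// A nil pointer models "pruneUnknownFields not specified" (an assumption
// of this sketch; the KEP's field is a plain bool).
func effectivePrune(apiVersion string, set *bool) bool {
	if set != nil {
		return *set
	}
	// v1beta1 defaults to false, v1 defaults to true.
	return apiVersion == "v1"
}

// validatePrune checks the requested value against the v1 rules.
// old == nil means the CRD is being created; otherwise old points to the
// previously stored value.
func validatePrune(apiVersion string, old *bool, updated bool) error {
	if apiVersion != "v1" || updated {
		// v1beta1 accepts both values; true is always allowed.
		return nil
	}
	if old == nil {
		return errors.New("pruneUnknownFields: false is forbidden for newly created v1 CRDs")
	}
	if *old {
		return errors.New("pruneUnknownFields must not change from true to false in v1")
	}
	// The value was already false (e.g. CRD created in v1beta1): keep it.
	return nil
}

func main() {
	fmt.Println("v1beta1 default:", effectivePrune("v1beta1", nil))
	fmt.Println("v1 default:", effectivePrune("v1", nil))
	fmt.Println("create v1 with false:", validatePrune("v1", nil, false))
}
```

Keeping defaulting and validation as two separate steps mirrors how the upgrade rules are phrased: upgrades only rely on defaulting, while the `v1` restriction is a validation concern on create and update.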
-* Pruning implementation PR https://github.com/kubernetes/kubernetes/pull/64558 -* [OpenAPI v3 specification](https://github.com/OAI/OpenAPI-Specification/blob/master/versions/3.0.0.md) -* [JSON Schema](http://json-schema.org/) -* [pruning algorithm in go-openapi](https://github.com/go-openapi/validate/blob/master/post/prune.go) +### Version Skew Strategy -## Alternatives Considered +* kubectl is not aware of pruning in any relevant way +* posting an alpha-quality CRD with `pruneUnknownFields: true` to an old server will disable pruning. But that's acceptable. -* we have explored pruning option 4 in the [GDoc which preceded this KEP](https://docs.google.com/document/d/1rBn6SZM7NsWxzBN41J2kO2Odf07PeGPygatM_1RwofY/edit#heading=h.4qdisqud6z3t), but decided against it as it put a lot of burden on the CRD author. The approach shown in this KEP leads to the same final outcome, but derives the skeleton automatically. +## Implementation History \ No newline at end of file