-
Notifications
You must be signed in to change notification settings - Fork 1.6k
KEP-5504: Comparable Resource Version #5505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
michaelasp
commented
Aug 28, 2025
- One-line PR description: Adding new KEP for comparable resource version
- Issue link: Comparable Resource Version #5504
- Other comments:
99edbd3
to
3a110b6
Compare
|
||
## Alternatives | ||
|
||
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we have alternatives, and I'm about to suggest one. There may be more.
We have a rich discovery API. I think we can use that, drawing inspiration from Rust and its concept of traits.
Rust traits are like interfaces, but can also be kind of zero sized. For example, the PartialEq trait from Rust's standard library means that you can use =
to check for equality. The Eq trait (which requires also implementing PartialEq
) makes a promise that all values can be compared against other values. Some nullable types only implement PartialEq
. There are also traits to say if some, or all, values can be compared against each other: PartialOrd and Ord.
(Eq
and Ord
are just marker traits; they don't add any API surface, they are just making a stronger promise than their supertrait).
We can mark our APIs with something similar to traits and declare, per API, whether resourceVersion
comparisons are meaningful for sorting
Doing it like this means we can have easy soft adoption; well, relatively easy.
Every in-tree API would implement resourceVersion
ordering and can mark itself as such. A metrics adapter that doesn't implement it would not need to worry about clients making the wrong assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we add something like traits, I think we could mark our stable APIs as stable, too. It wouldn't be much extra code.
There might be some other traits-like things we can think up, such as marking if an API kind uses the .metadata.generation
/ .status.observedGeneration
pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listing that as an alternative is fine, but I think it's reasonable to proceed without needing opt-in comparability indicators in discovery.
There are three categories of APIs
1. Servers where resourceVersion is a comparable integer
- built-in kube types served by kube-apiserver
- built-in kube types served by something other than kube-apiserver (ensured via the conformance test proposed in this KEP)
- CRD-backed types served by kube-apiserver
- CRD-backed types served by something other than kube-apiserver (ensured via the conformance test proposed in this KEP)
- extension API servers built with k8s.io/apiserver using normal CRUD storage
- extension API servers built with other technologies but following kube-apiserver behavior
2. Servers where resourceVersion is not an integer
For servers like this, a local well-formedness check when comparing two resourceVersion strings triggering a comparison error (as proposed in the function signature) would be enough to guard against misinterpretation, without needing to augment discovery and make the client check discovery.
All the examples of unusual aggregated servers I'm aware of fall in this category, or have other server-side correctness issues that already mishandle resource versions in ways inconsistent with client-expected behavior:
-
metrics-server
- “pod/node metrics” are virtual APIs that only support get/list and never set RV on the list or the individual items in the list
-
- “pod/node metrics” are virtual APIs that only support get/list and never set RV on the list or the individual items in the list
-
calico
- https://github.com/projectcalico/calico/blob/09c0b753c91474e72157818a480165028f620999/libcalico-go/lib/backend/k8s/resources/profile.go#L138
- https://archive-os-3-26.netlify.app/calico/3.26/reference/resources/profile
- “profiles” is a virtual REST API built on top of namespaces and service accounts that supports get/list/watch with RV set to “nsRV/saRV”
-
Porch
- appears to set invalid resourceVersion values on some objects (static non-integer strings that pass equality checks incorrectly) - https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/cache/dbcache/dbpackage.go#L97C22-L97C31
- looks like it tries to support watch, but ignores starting resourceVersion in a non-compliant way https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/registry/porch/watch.go#L100
- uses a git hash as the RV for some objects - https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/externalrepo/git/package.go#L62 (1/10,000,000 chance of an all-numeric hash on a single object, 1/100,000,000,000,000 chance of two non-identical all-numeric hashes to try to compare)
3. Servers where resourceVersion is an integer, but is not comparable
Obviously this is possible, but do we have any actual known examples of real servers that do this?
I would really want to push extension API server authors to live in category 1 or 2. Either make your integer resourceVersions comparable like everyone expects (preferred), or make them clearly not integers if that's not possible.
Unless there's evidence that there are actually servers like this in use that won't adjust to be in category 1 or 2, I'd push back on the complexity of adding something like "integerResourceVersionIsComparable: false" to discovery per-type and making clients check it.
|
||
## Summary | ||
|
||
Resource version is currently defined as an opaque string from the view of a client, with the only operation that is supported being equality comparisons. This differs from the internal apiserver implementation, where it is clearly defined as a monotonically incrementing integer. There are increasing requirements being required from clients consuming object metadata, where stronger comparisons than just equality are required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"The apiserver" is not the only implementation of .metadata
, though: you can also integrate your control plane with an extension API server via the aggregation layer.
|
||
### Goals | ||
|
||
The goals for this KEP are fairly straightforward, firstly we will expose a utility function that clients can use on the resource version to check comparisons between resource versions. This will take the opaque resource version string and return a boolean and an error if it occurs. Along with that we will update the documentation to specify that a ResourceVersion must be a monotonically increasing integer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This risks implying that all APIs support this comparison. I'd prefer to make that promise on an API-by-API basis. See #5505 (comment) for the specifics of what I have in mind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say that not-monotonic RV is ok for an API without Watch like Metrics API, however the advantage of having monotonic RV for informers is huge. Being able to do consistent reads, monitor staleness is deal breaker that will make non-compliant mostly API useless and force them to implement it.
I would also point out that it would force us to maintain two different informer implementations based on whether API has monotonic RV, because fully utilizing monotonic RV will motivate non trivial code changes.
keps/sig-api-machinery/5504-comparable-resource-version/kep.yaml
Outdated
Show resolved
Hide resolved
|
||
## Alternatives | ||
|
||
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Listing that as an alternative is fine, but I think it's reasonable to proceed without needing opt-in comparability indicators in discovery.
There are three categories of APIs
1. Servers where resourceVersion is a comparable integer
- built-in kube types served by kube-apiserver
- built-in kube types served by something other than kube-apiserver (ensured via the conformance test proposed in this KEP)
- CRD-backed types served by kube-apiserver
- CRD-backed types served by something other than kube-apiserver (ensured via the conformance test proposed in this KEP)
- extension API servers built with k8s.io/apiserver using normal CRUD storage
- extension API servers built with other technologies but following kube-apiserver behavior
2. Servers where resourceVersion is not an integer
For servers like this, a local well-formedness check when comparing two resourceVersion strings triggering a comparison error (as proposed in the function signature) would be enough to guard against misinterpretation, without needing to augment discovery and make the client check discovery.
All the examples of unusual aggregated servers I'm aware of fall in this category, or have other server-side correctness issues that already mishandle resource versions in ways inconsistent with client-expected behavior:
-
metrics-server
- “pod/node metrics” are virtual APIs that only support get/list and never set RV on the list or the individual items in the list
-
- “pod/node metrics” are virtual APIs that only support get/list and never set RV on the list or the individual items in the list
-
calico
- https://github.com/projectcalico/calico/blob/09c0b753c91474e72157818a480165028f620999/libcalico-go/lib/backend/k8s/resources/profile.go#L138
- https://archive-os-3-26.netlify.app/calico/3.26/reference/resources/profile
- “profiles” is a virtual REST API built on top of namespaces and service accounts that supports get/list/watch with RV set to “nsRV/saRV”
-
Porch
- appears to set invalid resourceVersion values on some objects (static non-integer strings that pass equality checks incorrectly) - https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/cache/dbcache/dbpackage.go#L97C22-L97C31
- looks like it tries to support watch, but ignores starting resourceVersion in a non-compliant way https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/registry/porch/watch.go#L100
- uses a git hash as the RV for some objects - https://github.com/nephio-project/porch/blob/4c066b6986533445fb15143507e7ce6470b66c72/pkg/externalrepo/git/package.go#L62 (1/10,000,000 chance of an all-numeric hash on a single object, 1/100,000,000,000,000 chance of two non-identical all-numeric hashes to try to compare)
3. Servers where resourceVersion is an integer, but is not comparable
Obviously this is possible, but do we have any actual known examples of real servers that do this?
I would really want to push extension API server authors to live in category 1 or 2. Either make your integer resourceVersions comparable like everyone expects (preferred), or make them clearly not integers if that's not possible.
Unless there's evidence that there are actually servers like this in use that won't adjust to be in category 1 or 2, I'd push back on the complexity of adding something like "integerResourceVersionIsComparable: false" to discovery per-type and making clients check it.
keps/sig-api-machinery/5504-comparable-resource-version/kep.yaml
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did another sweep, this is looking good
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does merging this KEP effectively turn the InformerResourceVersion feature gate GA?
The feature gate is not very well documented but as far as I understand, it was in place just to wait until ResourceVersions are comparable to make sure the invocation of cache.Controller{}.LastSyncResourceVersion() returns a sensible result.
cc @nilekhc
keps/sig-api-machinery/5504-comparable-resource-version/README.md
Outdated
Show resolved
Hide resolved
Um.... not quite? (I didn't actually know about that gate). This KEP defines how client-side comparisons should happen, but those comparisons can return errors... I don't see any comparison or error-handling path in the spots where the InformerResourceVersion gate is used (or where the underlying LastSyncResourceVersion() is set or retrieved). InformerResourceVersion would have to be updated to use the defined comparison approach, and handle encountered errors properly, I think. |
I think these are two separate things, although I may be misunderstanding the feature gate. All I think that |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: michaelasp The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |