How to handle unset `apiVersion` and `kind` in `nodeClassRef` #909

Comments
Not quite following this point. Is this to say that CPs would vend an annotation on their NodeClass CRD like …?
Can't we just make the `kind` and `apiVersion` fields non-hashed? I'm not sure if it wires through properly, being a referenced struct, but we should check the internal implementation of the hashing function here to see if it recursively hashes.
Yep, you could imagine that a Cloud Provider adds annotations to NodeClasses to tell the neutral controller which NodeClass to treat as the default.
Changing from using it in the hash to not using it in the hash is a breaking change. The hashing mechanism does currently recursively hash everything. You can imagine that it basically does a JSON serialization and then hashes that, which means that it includes the current `nodeClassRef` values.
Right, so if it recursively respects hash settings per-field, then when we add the Kind and APIVersion to the spec, we can also add the per-field settings that exclude them from the hash. The hashing compatibility is only an issue when we want to remove fields that were previously hashed, as going from hash(n fields) -> hash(n+1 fields) is injective, and hash(n fields) -> hash(n-1 fields) is surjective.
Doesn't this assume that there's only one NodeClass CRD for a given CP? We'd need to explicitly add a …
Someone may one day want to allow people to migrate from:

```yaml
nodeClassRef:
  apiVersion: karpenter.k8s.aws/v1
  kind: EC2NodeClass
  name: default
```

to:

```yaml
nodeClassRef:
  apiVersion: karpenter.k8s.aws/v2alpha1 # new API version
  kind: EC2NodeClass
  name: default
```

(or a variant on that scheme, such as dual-running two NodePools as part of the rollout). How about … and …?

Helm charts etc could have the controller's mutation of existing objects on by default and have the webhook on by default. Later down the line, eg v1beta3, switch the controller to not mutate existing objects.

BTW: Kubernetes API versioning aims to support round-trip compatibility between stable versions. We don't promise it works fine between a stable version and a newer alpha, though for core Kubernetes we do try hard. This is relevant because we probably do want …
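As a rough illustration of the webhook path this comment mentions (not something the issue prescribes), round-tripping objects between served API versions is normally wired up with a CRD conversion webhook. The service name, namespace, and path below are hypothetical:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: nodepools.karpenter.sh
spec:
  group: karpenter.sh
  names:
    kind: NodePool
    plural: nodepools
  scope: Cluster
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1"]
      clientConfig:
        service:
          name: karpenter        # hypothetical service exposing the conversion endpoint
          namespace: kube-system # hypothetical namespace
          path: /conversion      # hypothetical path
  versions:
  - name: v1beta2
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        x-kubernetes-preserve-unknown-fields: true
```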
Unfortunately, this isn't correct. We are hashing the entirety of the `nodeClassRef`. In either case, I think we want to hash on these values, so I think our current behavior is correct. The main concern here is just avoiding node rolling if we start defaulting it.
It doesn't necessarily imply that. You could imagine that you could create an annotation solution that had support for more than one NodeClass kind.
Personally, this is why I think we should use …
If we bumped to a new apiVersion and started requiring the value to be set in v1beta2, I'm imagining that the code would have to start automatically setting the `kind` and `apiVersion` on existing resources anyway. This is part of the reason that I was thinking that it would be reasonable to default in the OpenAPI spec for the CRDs that the CloudProviders release (I see this as an equivalent solution to the webhook solution, since it's basically a defaulting webhook that just uses the OpenAPI schema to set the default). Setting the default in this way seems reasonable to me, but we still have to overcome the hash value used for drift changing here.
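A minimal sketch of what OpenAPI-schema defaulting could look like for the `nodeClassRef` field, assuming AWS's `EC2NodeClass` as the default; the field path and the group/version value are illustrative, not taken from the issue:

```yaml
# Fragment of a NodePool CRD's structural schema (illustrative path and defaults)
nodeClassRef:
  type: object
  required: ["name"]
  properties:
    apiVersion:
      type: string
      default: karpenter.k8s.aws/v1beta1   # cloud-provider-specific default
    kind:
      type: string
      default: EC2NodeClass                # cloud-provider-specific default
    name:
      type: string
```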
Why do we want to include the reference itself in the hash? There is a separate hash on the NodeClass itself, so there is already a mechanism for detecting material change there. It seems it is a detected change in a combination of NodePool and NodeClass that indicates drift; the reference between them should not matter? (Except for the purpose of actually locating the corresponding NodeClass ...) Or am I missing something? (I understand the unfortunate implications of possibly changing the hashing function, just wondering about "our current behavior is correct" ...)
Changing the hash when the reference changes is the least surprising of the two options IMO. For instance, the AWS provider tags instances with the NodeClass name.
I see. Maybe another way to think about this is that, since the NodeClass name itself happens to be part of / affects the desired state of resources (at least in the AWS provider), it should be included in the computed hash of the NodeClass? Would that achieve the same effect? (I do realize that most changes to hash logic are breaking, just trying to understand the tradeoffs.) It also occurs to me that this tag (and maybe resource tagging in general) may be yet another good candidate for in-place drift?
Yeah, I could definitely see an argument for that. Today, we don't in-place drift anything, but I could see us configuring to do this for certain resources. Out of curiosity, what's the benefit for you of not drifting on changes to the `nodeClassRef`?
This is cited as one of the reasons for having to do drift hash versioning, so it got me wondering why we do it to begin with. Other than that, no particular benefit; just trying to get clarity on where a change in reference properly fits in the drift framework, and to understand the forces and tradeoffs between the options.
Description
What problem are you trying to solve?
As part of #337, we need to know the `apiVersion` and the `kind` of the `nodeClassRef`. This is also true when it comes to supporting #493, where we would want to roll up NodeClass readiness into NodePool readiness so that we can determine whether a NodePool can be provisioned based on both whether the NodePool is in a "Ready" state and whether the NodeClass is in a "Ready" state. You can imagine something like the sketch below when it comes to reasoning about NodeClass readiness and NodePool readiness.
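A rough sketch of what the referenced "NodeClass Readiness" and "NodePool Readiness" examples could look like, using conventional status conditions; the group/versions, condition types, and reasons below are illustrative rather than taken from the issue:

```yaml
# NodeClass Readiness (sketch)
apiVersion: karpenter.k8s.aws/v1beta1   # illustrative group/version
kind: EC2NodeClass
metadata:
  name: default
status:
  conditions:
  - type: Ready
    status: "False"
    reason: SubnetsNotFound             # hypothetical reason
---
# NodePool Readiness (sketch) — NodeClass readiness rolled up into the NodePool
apiVersion: karpenter.sh/v1beta1        # illustrative group/version
kind: NodePool
metadata:
  name: default
status:
  conditions:
  - type: NodeClassReady                # hypothetical condition type
    status: "False"
  - type: Ready
    status: "False"
    reason: NodeClassNotReady           # hypothetical reason
```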
Right now, there is no hard requirement to set `apiVersion` and `kind` in your `nodeClassRef` (only `name`), since the CloudProviders who are actually grabbing the `nodeClassRef` to pull the true type for `GetInstanceTypes` and `Create` calls already know the type under the hood. For AWS, this type is always `EC2NodeClass`; for Azure, this type is always `AKSNodeClass` today. This works fine for now; however, it becomes problematic when we want to grab the underlying resource dynamically from the neutral code using the dynamic client, like we would need to do if we were trying to grab the readiness of the NodeClass to bubble up that readiness into the NodePool.

We need some way to infer the actual type, and the `name` field isn't enough. Ideally, we would require that the `apiVersion` and `kind` always be set in the NodePool's `nodeClassRef`; however, doing so out of the gate would be a breaking change and would break existing resources on the cluster that don't already have these fields set. There are a couple of thoughts that I had here around how to achieve this:

Solution 1: Runtime Defaults through `cloudprovider.go`
Codify the concept of a "default" NodeClass. This means that a CloudProvider would set some global variables `var DefaultNodeClassKind string` and `var DefaultNodeClassApiVersion string` in the `cloudprovider.go` file, which Karpenter would cast an object into if the type wasn't explicitly set on the `nodeClassRef`. This works fine, but probably doesn't have good, long-term extensibility. It's also less observable to set this at runtime vs. having it statically resolve into the NodePool. You could also imagine a way to do this through something like a CRD annotation that would set the default NodeClass through the discovery client (in a way very similar to the default storage class); a sketch of that annotation idea follows. This is really a fancier version of the in-code methodology, so for the purposes of thinking about the best decision here, I'm throwing it into the same bucket.
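A minimal sketch of the CRD-annotation variant. The annotation key `karpenter.sh/is-default-nodeclass` is made up for illustration, by analogy with the default StorageClass annotation:

```yaml
# Fragment of a cloud provider's NodeClass CRD (spec omitted)
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: ec2nodeclasses.karpenter.k8s.aws
  annotations:
    # Hypothetical marker, analogous to storageclass.kubernetes.io/is-default-class,
    # that the neutral code could look up through the discovery client.
    karpenter.sh/is-default-nodeclass: "true"
```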
Solution 2: Static Defaults through OpenAPI

The second option is to have the CloudProviders default the `apiVersion` and the `kind` directly in the OpenAPI spec so that the string values are automatically resolved into the NodePool statically. That basically means that if someone submitted the following `nodeClassRef` through their manifest, it would resolve as shown below.
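The "Submitted Manifest" and "Resolved Manifest (for AWS)" examples below are a sketch of the defaulting behavior described above; the group/version value is illustrative.

Submitted Manifest

```yaml
nodeClassRef:
  name: default
```

Resolved Manifest (for AWS)

```yaml
nodeClassRef:
  apiVersion: karpenter.k8s.aws/v1beta1   # illustrative; defaulted from the CRD's OpenAPI schema
  kind: EC2NodeClass                      # defaulted from the CRD's OpenAPI schema
  name: default
```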
We can allow these values to be defaulted for some time and then require them to be set explicitly at `v1`. This gives existing resources that don't have these fields set a migration path before the fields become fully required down the line. The kicker of this solution: we can't update the `nodeClassRef` defaulting without causing a breaking change to the hash mechanism that we are using for drift. If we updated the defaulting mechanism outright, we would cause all nodes on the cluster to drift and get rolled on upgrade, which we obviously want to avoid. One solution to the drift problem is (a sketch of the resulting annotations follows this list):

1. Add a `karpenter.sh/nodepool-hash-version` annotation set to `v2` on NodePools and re-hash the NodePool when we upgrade to the new version.
2. Re-compute each NodeClaim's `karpenter.sh/nodepool-hash` based on the current hash value of the owning NodePool: update the `karpenter.sh/nodepool-hash` and set `karpenter.sh/nodepool-hash-version` to `v2` on startup.
3. Only consider a NodeClaim for drift if its `karpenter.sh/nodepool-hash-version` annotation is set equal to the `karpenter.sh/nodepool-hash-version` of the owning NodePool.
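A sketch of what the annotations could look like after such an upgrade; the hash values and the NodeClaim name are made up, and the group/versions are illustrative:

```yaml
# NodePool after re-hashing on upgrade
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
  annotations:
    karpenter.sh/nodepool-hash: "1234567890"       # hypothetical re-computed hash
    karpenter.sh/nodepool-hash-version: "v2"
---
# NodeClaim owned by that NodePool; drift is only evaluated once the hash versions match
apiVersion: karpenter.sh/v1beta1
kind: NodeClaim
metadata:
  name: default-abc12                              # hypothetical name
  annotations:
    karpenter.sh/nodepool-hash: "1234567890"       # back-filled from the owning NodePool on startup
    karpenter.sh/nodepool-hash-version: "v2"
```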
The upshot of Solution 2 is that we would need to coordinate setting the OpenAPI defaults with the other cloud providers that we support on the same minor version. That way, each cloud provider gets the benefit of us bumping the hash version alongside us setting static defaults for `nodeClassRef`s.

How important is this feature to you?
This unblocks us in different ways where we would like to do dynamic watches based on an unknown `apiVersion`/`group` and `kind` but have some of these values unset in NodePools today.