🌱 clusterctl: always run crd migration if possible to reduce conversion webhook usage #10513
Nevermind /hold cancel |
… webhook usage If objects are still stored in the old version, the kube-apiserver calls the conversion webhook every time an object is requested. Instead of lots of conversion webhook calls, we do the migration once if possible. Also increases the timeout used for the migrations: before running CRD migration, clusterctl usually upgrades cert-manager, which may lead to updating Certificates and rolling out new certificates to our controllers. There is a chance that this can take up to 90s, during which conversion, validating, or mutating webhooks may be unavailable. Because the migration gets run more aggressively during upgrades, this also increases the relevant timeouts.
baa5b6d
to
d754ad5
/hold For getting a good review |
/test pull-cluster-api-e2e-main |
@killianmuldoon PTAL. I think the logic around having to go through CR migration if an apiVersion becomes unserved was flawed. A version doesn't have to be served to migrate it: a) the get & update calls can be done against the current storage version, and b) served is not required for conversion (the apiserver will just call the conversion webhook and that's it). I think the new logic is significantly easier to reason through: basically, trigger the migration ASAP to avoid unnecessary conversion webhook calls later at runtime. |
d3879cf
to
4a7acb0
Looks good - pending fixups - this is definitely easier to reason about than the previous migrations
/lgtm |
LGTM label has been added. Git tree hash: 3079234d0ebd5b679a4a30871d4fb0d2d29ebf2d
Thx! /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sbueringer The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/hold cancel Assuming a good review (or two or three) has been provided 😂 |
/cherry-pick release-1.7 |
@cahillsf: new pull request created: #11513 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
What this PR does / why we need it:
If objects are still stored in the old version, every time an object is requested
the kube-apiserver will call the conversion webhook. Instead of lots of conversion
webhook calls, we do the migration once if possible.
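To make the trade-off concrete, here is a toy model of it in Go. It is only a sketch: `fakeAPIServer`, `storedObject`, and `migrateAll` are hypothetical names invented for illustration, not clusterctl or Kubernetes APIs. It assumes that a read of an object stored in an old version costs one conversion webhook call, and that a write always persists in the current storage version.

```go
package main

// storedObject models a custom resource as persisted in etcd.
// Hypothetical type, for illustration only.
type storedObject struct {
	Name          string
	StoredVersion string
}

// fakeAPIServer is a stand-in for kube-apiserver plus etcd.
// ConversionCalls counts how often the conversion webhook would be called.
type fakeAPIServer struct {
	StorageVersion  string
	Objects         []storedObject
	ConversionCalls int
}

// Get returns the object in the storage version. If it is persisted in a
// different version, one conversion webhook call is counted.
func (s *fakeAPIServer) Get(i int) storedObject {
	obj := s.Objects[i]
	if obj.StoredVersion != s.StorageVersion {
		s.ConversionCalls++
		obj.StoredVersion = s.StorageVersion
	}
	return obj
}

// Update persists the object; writes always use the current storage version.
func (s *fakeAPIServer) Update(i int, obj storedObject) {
	obj.StoredVersion = s.StorageVersion
	s.Objects[i] = obj
}

// migrateAll reads every object once and writes it back unchanged, so that
// it ends up persisted in the storage version.
func migrateAll(s *fakeAPIServer) {
	for i := range s.Objects {
		s.Update(i, s.Get(i))
	}
}
```

In this model, each object stored in an old version costs one conversion during `migrateAll`, and later calls to `Get` no longer add to `ConversionCalls`.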
Also increases the timeout used for the migrations: before running CRD migration,
clusterctl usually upgrades cert-manager, which may lead to updating Certificates and
rolling out new certificates to our controllers. There is a chance that this can take
up to 90s, during which conversion, validating, or mutating webhooks may be unavailable.
Because the migration gets run more aggressively during upgrades, this also increases
the relevant timeouts.
This also fixes the CRD version migration so that it does not run if the stored version is equal to the storageVersion (previously this would run the migration, but it would be a no-op because no conversion happens).
This could e.g. be the case when upgrading from v0.4 to v1.6 or v1.7. What would happen with the old code:
storageVersion = v1alpha4 and storedVersions = [v1alpha4]
clusterctl upgrade apply
storedVersions = [v1alpha4, v1beta1] (as soon as the first object got applied)
It is totally fine that v1alpha4 is not served anymore with the new CRD. What matters is that it is still in the versions of the new CRD, so the kube-apiserver is still able to run the conversion webhook. The next upgrade would take care of removing v1alpha4 from storedVersions, because the migration would be done then.
We have a safeguard above the changed code so that we do not run into the error case:
cluster-api/cmd/clusterctl/client/cluster/crd_migration.go
Line 114 in 0388967
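The skip condition described above (no migration when every stored version already equals the storage version) can be sketched as follows. This is a simplified illustration, not the code in `crd_migration.go`: `crdInfo` and `versionsToMigrate` are hypothetical names, and the struct only mirrors the two pieces of CRD information the description mentions.

```go
package main

// crdInfo is a simplified view of a CRD. Hypothetical type, for illustration only.
type crdInfo struct {
	StorageVersion string   // the version marked as storage version
	StoredVersions []string // versions in which objects may still be persisted
}

// versionsToMigrate returns the stored versions that differ from the storage
// version. An empty result means a migration would be a no-op and can be skipped.
func versionsToMigrate(c crdInfo) []string {
	var out []string
	for _, v := range c.StoredVersions {
		if v != c.StorageVersion {
			out = append(out, v)
		}
	}
	return out
}
```

With storageVersion v1beta1 and storedVersions [v1alpha4], the function returns [v1alpha4] and a migration is warranted. With storedVersions [v1beta1], it returns nothing and the migration can be skipped.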
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #
/area clusterctl