Add upgrade steps, instructions for 2023.9.1 #2029
Conversation
Tested upgrade on AWS - looks good. Going from 2023.5.1 to 2023.9.1, the cluster was destroyed and rebuilt (@iameskild addressed this with more warnings in the upgrade step to 2023.7.1, where the cluster destroy occurs).
```diff
@@ -8,7 +8,7 @@
 # 04-kubernetes-ingress
 DEFAULT_TRAEFIK_IMAGE_TAG = "2.9.1"

-HIGHEST_SUPPORTED_K8S_VERSION = ("1", "26", "7")
+HIGHEST_SUPPORTED_K8S_VERSION = ("1", "26", "9")
```
Just a question. How is this determined? Have we tested newer versions?
I have tested this new version on all of the cloud providers, and I bumped it so we could support DOKS version 1.26.9-do.0, since this appears to be the only version of Kubernetes 1.26 available on DO.
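As a rough illustration of what bumping this tuple permits, here is a minimal sketch (the `is_supported` helper is hypothetical, not taken from the codebase) of comparing a candidate version against `HIGHEST_SUPPORTED_K8S_VERSION`:

```python
# Hypothetical helper for illustration only; Nebari's actual check may differ.
HIGHEST_SUPPORTED_K8S_VERSION = ("1", "26", "9")


def is_supported(k8s_version: str) -> bool:
    """Return True if a 'major.minor.patch' version is at or below the highest supported one."""
    candidate = tuple(int(part) for part in k8s_version.split("."))
    highest = tuple(int(part) for part in HIGHEST_SUPPORTED_K8S_VERSION)
    return candidate <= highest


print(is_supported("1.26.9"))  # True -- matches the bumped constant (DOKS 1.26.9-do.0)
print(is_supported("1.27.6"))  # False -- above the highest supported minor
```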
"""Return the major.minor version of the k8s version string.""" | ||
|
||
k8s_version = str(k8s_version) | ||
# Split the input string by the first decimal point | ||
parts = k8s_version.split(".", 1) | ||
|
||
if len(parts) == 2: | ||
# Extract the part before the second decimal point | ||
before_second_decimal = parts[0] + "." + parts[1].split(".")[0] | ||
try: | ||
# Convert the extracted part to a float | ||
result = float(before_second_decimal) | ||
return result | ||
except ValueError: | ||
# Handle the case where the conversion to float fails | ||
return None | ||
else: | ||
# Handle the case where there is no second decimal point | ||
return None |
You can use packaging for this:

```python
>>> from packaging import version
>>> version.parse('2.3.4')
<Version('2.3.4')>
>>> version.parse('2.3.4') > version.parse('2.3.1')
True
```
The DO versions aren't just numeric (for example, 1.18.19-do.0), and I get back packaging.version.InvalidVersion when trying to use this library for them.
That's a slug, not the actual k8s version; you can get the explicit Kubernetes versions too, e.g.:
```
$ doctl kubernetes options versions
Slug            Kubernetes Version    Supported Features
1.28.2-do.0     1.28.2                cluster-autoscaler, docr-integration, ha-control-plane, token-authentication
1.27.6-do.0     1.27.6                cluster-autoscaler, docr-integration, ha-control-plane, token-authentication
1.26.9-do.0     1.26.9                cluster-autoscaler, docr-integration, ha-control-plane, token-authentication
1.25.14-do.0    1.25.14               cluster-autoscaler, docr-integration, ha-control-plane, token-authentication
```
Alternatively, you can split by `-` and use the first part.
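A minimal sketch of that suggestion, assuming packaging is installed; the `parse_doks_slug` helper name is made up for illustration:

```python
# Sketch: drop the "-do.N" suffix from a DOKS slug before handing it to
# packaging, so comparisons work on the bare Kubernetes version.
from packaging import version


def parse_doks_slug(slug: str) -> version.Version:
    """Hypothetical helper: '1.26.9-do.0' -> Version('1.26.9')."""
    bare = slug.split("-", 1)[0]
    return version.parse(bare)


print(parse_doks_slug("1.26.9-do.0"))                                   # 1.26.9
print(parse_doks_slug("1.28.2-do.0") > parse_doks_slug("1.27.6-do.0"))  # True
```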
```diff
@@ -2,7 +2,7 @@ terraform {
   required_providers {
     google = {
       source  = "hashicorp/google"
-      version = "4.83.0"
+      version = "4.8.0"
```
I needed to revert to what we had in our previous release because, when I tried to redeploy with a newer Kubernetes version, it complained about the following even though many of those fields are indeed set:
Error: googleapi: Error 400: At least one of ['node_version', 'image_type', 'updated_node_pool', 'locations', 'workload_metadata_config', 'upgrade_settings'] must be specified., badRequest
Otherwise, there doesn't seem to be a way around this unless you delete the node groups and then remove them from the Terraform state, which is an accident-prone task...
```diff
@@ -57,10 +57,6 @@ resource "google_container_cluster" "main" {
     }
   }

-  cost_management_config {
```
Removed because it's not supported with GCP Terraform provider version 4.8.0 (see the other comment for the reason why).
```diff
@@ -22,6 +27,8 @@
 )
 ARGO_JUPYTER_SCHEDULER_REPO = "https://github.com/nebari-dev/argo-jupyter-scheduler"

+UPGRADE_KUBERNETES_MESSAGE = "Please see the [green][link=https://www.nebari.dev/docs/how-tos/kubernetes-version-upgrade]Kubernetes upgrade docs[/link][/green] for more information."
```
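For context, a quick sketch (not from the PR itself) of how a message with this rich console markup renders when printed:

```python
# Sketch only: shows how rich renders the console markup used in the message above.
import rich

UPGRADE_KUBERNETES_MESSAGE = (
    "Please see the [green][link=https://www.nebari.dev/docs/how-tos/"
    "kubernetes-version-upgrade]Kubernetes upgrade docs[/link][/green] "
    "for more information."
)

# rich interprets [green]...[/green] as a color and [link=...]...[/link] as a
# clickable hyperlink in terminals that support it.
rich.print(UPGRADE_KUBERNETES_MESSAGE)
```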
I created these docs to walk folks through the Kubernetes upgrade process: nebari-dev/nebari-docs#367
I would like to test this for Digital Ocean; I just haven't had the time yet...
I've tested this on Digital Ocean.
```python
@pytest.mark.skipif(
    _nebari.upgrade.__version__ < "2023.9.1",
    reason="This test is only valid for versions <= 2023.9.1",
)
```
To get these tests to run, we need to create a new tag 2023.9.1. You can do this locally to confirm they pass.
"-> The Kubernetes version is multiple minor versions behind the minimum required version. You will need to perform the upgrade one minor version at a time. For example, if your current version is 1.24, you will need to upgrade to 1.25, and then 1.26." | ||
) | ||
rich.print( | ||
f"-> Update the value of [green]{provider_config_block}.kubernetes_version[/green] in your config file to a newer version of Kubernetes and redeploy." |
I can confirm that redeploying with a Kubernetes version one minor higher than the one the user is running will work for GCP and Azure. @kenafoster can you confirm that this will also work as advertised on AWS?
Yes, it works on AWS. However, if they upgrade across multiple minor versions, they'll need to upgrade the node pools manually, and that part must be done outside Nebari.
I don't have any strong objections that should stop this PR from getting merged, apart from some minor nitpicks. I am happy to get this in as long as the tests pass.
Thanks for all the work @kenafoster @iameskild!
Reference Issues or PRs
What does this implement/fix?
Supersedes #2021 (@kenafoster, I couldn't push to your remote branch or open a PR against that branch).
Put an x in the boxes that apply.
Testing
I tested these upgrade commands (and redeployment) on GCP and Azure. @kenafoster did you happen to test the upgrade command / redeployment on AWS?
Any other comments?