KOTS: stop running workspaces prior to upgrading existing workspace for single cluster ref arch #13147
Comments
@lucasvaltl @corneliusludmann may we ask for your help in treating this as a priority for the September release?
I'm afraid we are quite limited regarding the KOTS UX and cannot ask the user. @mrsimonemms any ideas?
We cannot add a "this is the impact" type message, but there is always a confirmation before the deployment is made (unless they have auto-deployments configured). Documenting the impact in the Gitpod docs is the only option. Am I right in thinking that the reason for stopping the workspaces is to force the workspaces to back up to storage?

Suggestions

Questions
My idea here for an absolute skateboard would be to add a preflight check (it should be at the top of the list of preflight checks in the UI) that checks for running workspaces. If workspaces are running, the check should fail and point to the (new) documentation page around stopping workspaces in this PR.
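Roughly, the logic such a preflight would encode amounts to the following (a minimal sketch, assuming workspace pods carry the component=workspace label used elsewhere in this issue and that Gitpod runs in a gitpod namespace; the real check would live in a KOTS preflight spec):

```sh
# Sketch: fail the preflight if any workspace pods (regular, prebuild, or
# imagebuild) are still running. The gitpod namespace is an assumption.
running=$(kubectl get pods -n gitpod -l component=workspace \
  --field-selector=status.phase=Running --no-headers 2>/dev/null | wc -l)

if [ "$running" -gt 0 ]; then
  echo "Preflight failed: $running workspace pod(s) still running."
  echo "Stop all workspaces before upgrading (see the self-hosted docs)."
  exit 1
fi
```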
Yes. Otherwise, the workspaces will continue to run, and KOTS will delete the Gitpod installation (including ws-daemon), which means the workspaces cannot back up.
I'm working on a test for this, @mrsimonemms, where basically we want to prevent users from starting workspaces during outage windows for updates. Options:
For awareness, I've created #13150, because we cannot easily test in our preview environments due to the cluster name showing up as an empty string.
@lucasvaltl That will help for workspaces that are running before the upgrade is attempted; however, we also need to put the Gitpod installation into a state where it doesn't allow users to try starting workspaces...otherwise they'll have a poor experience during the upgrade.
Thanks for the clarification @kylos101. I agree with @lucasvaltl's earlier comment of having a 🛹 and then bringing this additional stuff into it. From experience, upgrades tend to only take a couple of minutes to run - if it's done immediately before the
@mrsimonemms do we prompt the user to see which ref arch they're using? If they're using the single cluster ref arch and there are running workspaces, it would be great if the deploy process could hard fail, sharing that workspaces are currently running. In other words, my understanding is that the pre-flight checks are soft and can be ignored. I'd hate for an administrator to shoot themselves in the foot and cause users to lose data.
@kylos101 No, the only prompt is a big "deploy" button - they can choose to skip the pre-flight checks, where there's another "we don't recommend this - it may break things" alert. Again, we don't have any control over this content or whether they can skip it. The idea is that the pre-flight checks are idempotent and that a change only happens when they click "deploy".
@kylos101 Fair! What I proposed at least lessens the pain. If we can also get the installation into a state where new workloads cannot be started - all the better. Was just not sure if we can get something done for this in a reasonable timeframe :)
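One rough way to put the installation into that state for the upgrade window (a sketch, assuming the workspace scheduling component is the ws-manager deployment in a gitpod namespace; the exact component name and replica count are assumptions) would be:

```sh
# Sketch: block new workspace starts during the upgrade by scaling the
# workspace manager to zero, then restore it once the deploy has finished.
kubectl scale deployment ws-manager -n gitpod --replicas=0

# ... run the KOTS deployment here ...

kubectl scale deployment ws-manager -n gitpod --replicas=1
```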
@kylos101 this command will also stop any running image builds - I presume that is a desired effect of this?
@kylos101 I've had a play and created a draft PR at #13125. Unfortunately, on the app I'm testing, the workspace pod seems to be stuck on terminating. I presume that if I were to put a
@mrsimonemms Yes sir, that is the desired effect.
What type of workspace were you testing with? Regular, prebuild, imagebuild?
@kylos101 thanks for clarifying.
Regular workspace this time, but I've seen that behaviour on all types of workspace. It's one of those funny things that I've found over the years that if you run
I've done some more investigation on this and can confirm that … The problem is that … The refactored Installer has an authClusterOrKubeconfig function, which (as the name implies) allows authentication via the supplied kubeconfig. Once that's in, we can stop workspaces using … And it would be very helpful if the …

EDIT: I may have found a workaround, which I'm testing.
I figured out how to use the
@mrsimonemms
Is your feature request related to a problem? Please describe
We do not support live upgrades for the single cluster ref arch while workspaces are running.
Describe the behaviour you'd like
Before KOTS begins a deployment:

- Stop running workspaces; kubectl delete pods -l component=workspace may suffice (see the sketch below).

Additionally, as part of the monthly release cycle, a self-hosted test should be added so that the upgrade flow with running workspaces is included as part of the testing.
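As a rough sketch of that pre-deploy step (assuming the component=workspace label covers regular, prebuild, and imagebuild pods, a gitpod namespace, and an arbitrary timeout):

```sh
# Sketch: stop all workspace pods, then block until they are fully deleted
# (backups finished, finalizers removed) before KOTS touches the installation.
kubectl delete pods -n gitpod -l component=workspace --wait=false

# Fail the deploy if the workspaces have not disappeared within the timeout.
kubectl wait --for=delete pod -n gitpod -l component=workspace --timeout=10m
```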
Describe alternatives you've considered
N/A, this removes friction from the upgrade experience.
Additional context
The deploy process should not start in a live cluster while workspaces are running.
As of the August KOTS release, when a deploy is done to an existing cluster, resources are deleted. However, because ws-daemon was deleted, the workspaces could not back up, and thus could not be deleted. Therefore, it is imperative that we wait for workspace pods to be deleted (including imagebuild and prebuild) before deleting the Gitpod installation.

Customers that experience this issue will incur data loss, and to clean up the stuck pods, they must remove the related finalizer from the regular and prebuild workspaces.
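For reference, that clean-up for pods stuck in Terminating typically looks something like the following (a sketch; the namespace and the idea that clearing all finalizers on the pod is acceptable here are assumptions):

```sh
# Sketch: clear the finalizers on a workspace pod that can never complete its
# backup because ws-daemon is already gone. Kubernetes can then delete the
# pod; any data that was not backed up is lost.
kubectl patch pod <workspace-pod-name> -n gitpod --type=merge \
  -p '{"metadata":{"finalizers":null}}'
```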
Dependent Tasks