[kots]: delete workspace pods before installing Gitpod #13215

Merged
roboquat merged 2 commits into main from sje/installer-kill-workspaces
Sep 26, 2022

Conversation

mrsimonemms
Contributor

@mrsimonemms mrsimonemms commented Sep 22, 2022

Description

Adds a preflight check that detects running workspaces/image-builds. The check fails if any are found: the textAnalyzer doesn't support a warn outcome, which is very annoying because this is really a warning rather than a failure.

It also stops any running workspaces/image-builds before the helm upgrade function is executed.
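
For illustration only, here is a minimal shell sketch of an "are any workspaces running?" check. It is not the KOTS preflight added in this PR. It assumes workspace pods carry the `component=workspace` label that appears in the diff under review, and `count_workspace_pods` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# count_workspace_pods NAMESPACE
# Prints how many pods in NAMESPACE carry the component=workspace label.
# Assumption: the label selector matches the one used in the diff.
count_workspace_pods() {
  local namespace="$1"
  # "-o name" prints one line per pod; wc -l counts them,
  # tr strips the padding/newline some wc implementations emit.
  kubectl get pods -n "${namespace}" -l component=workspace -o name \
    | wc -l | tr -d '[:space:]'
}

# Hypothetical usage:
#   if [ "$(count_workspace_pods "${NAMESPACE}")" -gt 0 ]; then
#     echo "Gitpod: workspaces are still running"
#   fi
```

A count like this could be reported as a warning, leaving the decision to stop workspaces to the install step.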

Related Issue(s)

Fixes #13147

How to test

Try installing via KOTS both with and without a workspace running.

Release Notes

[kots]: delete workspace pods before installing Gitpod

Documentation

Werft options:

  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-integration-tests=all
    Valid options are all, workspace, webapp, ide

@mrsimonemms
Contributor Author

/hold as this requires input from @gitpod-io/engineering-workspace too

@github-actions github-actions bot added team: delivery Issue belongs to the self-hosted team and removed do-not-merge/hold labels Sep 23, 2022
@mrsimonemms mrsimonemms requested a review from a team September 23, 2022 07:50
@github-actions github-actions bot added the team: workspace Issue belongs to the Workspace team label Sep 23, 2022
@@ -156,6 +156,9 @@ EOF
HELM_TIMEOUT="1h"
fi

echo "Gitpod: shut down any running workspaces/image-builders"
kubectl delete pods -n "${NAMESPACE}" -l component=workspace --wait
Contributor

So this would mean that during upgrade the workspaces that are currently in use will get interrupted? I thought it was a nice feature that we had, allowing uninterrupted upgrades. And how would this affect potential backup jobs?

Contributor Author

As I understand it, this has been requested by @gitpod-io/engineering-workspace to guarantee that the workspaces are properly backed up. There's a whole discussion on this in #13147.

Contributor

totally missed the context 🙈 reading!

@mrsimonemms
Contributor Author

/hold this seems to have been removed incorrectly

@mrsimonemms mrsimonemms force-pushed the sje/installer-kill-workspaces branch 2 times, most recently from edd2fcd to 91c570e Compare September 26, 2022 13:47
@mrsimonemms mrsimonemms force-pushed the sje/installer-kill-workspaces branch from 91c570e to c988742 Compare September 26, 2022 15:16
@roboquat roboquat added size/M and removed size/S labels Sep 26, 2022
@mrsimonemms
Contributor Author

mrsimonemms commented Sep 26, 2022

/werft run publish-to-kots

👍 started the job as gitpod-build-sje-installer-kill-workspaces.6
(with .werft/ from main)

@mrsimonemms mrsimonemms removed the request for review from a team September 26, 2022 15:20
@mrsimonemms mrsimonemms force-pushed the sje/installer-kill-workspaces branch 6 times, most recently from 5e9a4ad to 3048413 Compare September 26, 2022 17:27
# Get list of workspace instances from gpctl
for instance in $(/app/gpctl workspaces list -o json | jq -r 'select(. != null) | .[] | .Instance'); do
echo "Gitpod: shutting down workspace ${instance}"
/app/gpctl workspaces stop "${instance}"
Contributor Author

This command can be a bit flaky, presumably when communication with the gRPC server fails. I've seen logs like:

time="2022-09-26T17:20:18Z" level=fatal msg="cannot connect" error="error upgrading connection: error dialing backend: EOF"

Three options as I see it:

  1. Do nothing and let it error. A redeploy will almost always be fine.
  2. Add a || true and never let it error. This risks missing backups.
  3. Add a || sleep 10 && /app/gpctl workspaces stop "${instance}" to allow it to retry after 10 seconds.
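
One caveat with option 3 as literally written: in shell, || and && have equal precedence and associate left to right, so `a || sleep 10 && b` is parsed as `(a || sleep 10) && b`, and the second stop would also run after a successful first attempt. A minimal sketch of a single retry with explicit grouping is shown here; `retry_once` is a hypothetical helper name, not something from this PR.

```shell
#!/usr/bin/env bash
# retry_once DELAY CMD [ARGS...]
# Runs CMD; only if it fails, waits DELAY seconds and runs it once more.
# The exit status is that of the last attempt made.
retry_once() {
  local delay="$1"
  shift
  # Braces group the fallback so it is skipped when the first attempt succeeds.
  "$@" || { sleep "${delay}"; "$@"; }
}

# Hypothetical usage inside the loop from the diff:
#   retry_once 10 /app/gpctl workspaces stop "${instance}"
```

If the second attempt also fails, the error still propagates, so a failed stop is not silently ignored.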

Contributor

Option 3 sounds like the best balance of reliability and safety and should catch most of the spurious gRPC server errors; I'd favor that.

Contributor Author

Ok, I'll make that change now so you can get the RC out

Contributor

@adrienthebo adrienthebo left a comment

/hold for adding a retry on gpctl.

Looks good to me; I haven't functionally verified this but we can do so during the RC release testing.

@mrsimonemms mrsimonemms force-pushed the sje/installer-kill-workspaces branch from 3048413 to 42a660d Compare September 26, 2022 20:55
@mrsimonemms mrsimonemms force-pushed the sje/installer-kill-workspaces branch from 42a660d to 2e331aa Compare September 26, 2022 21:06
@mrsimonemms
Contributor Author

Retry added as described

@mrsimonemms
Contributor Author

/unhold

@roboquat roboquat merged commit 74429dd into main Sep 26, 2022
@roboquat roboquat deleted the sje/installer-kill-workspaces branch September 26, 2022 21:17
@roboquat roboquat added the deployed: workspace Workspace team change is running in production label Sep 28, 2022
Labels
  • deployed: workspace (Workspace team change is running in production)
  • release-note
  • size/M
  • team: delivery (Issue belongs to the self-hosted team)
  • team: workspace (Issue belongs to the Workspace team)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

KOTS: stop running workspaces prior to upgrading existing workspace for single cluster ref arch
4 participants