Wipe deleted Workspace (+ their WorkspaceInstances) at some point #11378

Closed
geropl opened this issue Jul 14, 2022 · 7 comments · Fixed by #11360

@geropl
Member

geropl commented Jul 14, 2022

Motivation

  • getting rid of old shapes/variance: our code is by now riddled with optional fields (like this: someField?: string) just because there was a period in 2018 when we did not set those fields. This is an attempt to get rid of that variance so that we at least have a chance to streamline the code
  • dead weight: This is about removing old deleted workspaces only: It seems there is no sense in keeping them around
  • reduce table size:
    • reduce migration times: running a migration on one of the big tables takes ~50mins by now, which does not go well with CD
    • reduce storage: This has become a concern, although not a problem, yet
    • query time: This is a nice benefit, although not a driver as we don't have problems here for now

Solutions

Add 3rd "purged" state

Currently our Workspace deletion model has two phases:

  • soft deleted: A workspace goes into this state either when a) a user deletes it or b) it is garbage collected after a number of days of inactivity. The effect is that it is "hidden" from our API; the deletion can easily be reversed, e.g. on user request, and the workspace can still be inspected, for instance in case of abuse.
  • content deleted: A workspace goes into this state after a certain time in the "soft deleted" state. All workspace content (backups, logs, etc.) is deleted, and the workspace record is scrubbed of all PII, but the skeleton and main data model stay intact. This is helpful for being able to trace back issues or incidents even after the actual workspace content has been deleted.

This issue proposes to introduce a 3rd state:

  • purged: A workspace moves into this state after a certain amount of time being in "content deleted" state. All workspace entries (workspace, instance, prebuild) are completely wiped from the DB.
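
The phases above can be sketched as a small classification function. This is only an illustration under stated assumptions, not the actual implementation: the record shape, the field names (`softDeletedAt`, `contentDeletedAt`), the `DeletionState` type, and the `purgeRetentionDays` parameter are all hypothetical.

```typescript
// Hypothetical record shape; the field names are assumptions for illustration.
interface WorkspaceRecord {
  id: string;
  softDeletedAt?: Date;    // set when a user deletes the workspace or GC collects it
  contentDeletedAt?: Date; // set once backups, logs and PII have been removed
}

type DeletionState = "active" | "soft-deleted" | "content-deleted" | "purgeable";

const DAY_MS = 24 * 60 * 60 * 1000;

// Classifies a record. "purgeable" means the row (and its instances and
// prebuilds) has been in "content deleted" long enough to be wiped from the DB.
function deletionState(
  ws: WorkspaceRecord,
  now: Date,
  purgeRetentionDays: number,
): DeletionState {
  if (ws.contentDeletedAt) {
    const ageMs = now.getTime() - ws.contentDeletedAt.getTime();
    return ageMs >= purgeRetentionDays * DAY_MS ? "purgeable" : "content-deleted";
  }
  if (ws.softDeletedAt) {
    return "soft-deleted";
  }
  return "active";
}
```

Note that "purged" itself is not a state a record can be observed in, because a purged workspace no longer has a row; the sketch therefore only reports whether a record is eligible for purging.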

Remarks

Beyond the purely technical aspect, this affects all downstream systems as well: we need to make sure that they can handle no longer getting a complete view of all workspaces. 🧘

@geropl geropl moved this to Scheduled in 🍎 WebApp Team Jul 14, 2022
@geropl geropl self-assigned this Jul 14, 2022
@geropl geropl moved this from Scheduled to In Progress in 🍎 WebApp Team Jul 14, 2022
@atduarte
Contributor

Is the goal to reduce storage cost or optimize query times? If the latter, did we consider partitioning the table by year?

@geropl
Member Author

geropl commented Jul 15, 2022

> Is the goal to reduce storage cost or optimize query times? If the latter, did we consider partitioning the table by year?

There are several motivations. I updated and listed them above. ☝️ 🙏

@geropl geropl changed the title Delete Workspace (+ their Instances) at some point Wipe deleted Workspace (+ their WorkspaceInstances) at some point Jul 15, 2022
@geropl
Member Author

geropl commented Jul 19, 2022

Dropping assignment as I'm not actively working on this besides talks with downstream clients on how this might affect them and how to mitigate that.

@geropl geropl removed their assignment Jul 19, 2022
@geropl geropl self-assigned this Aug 29, 2022
@geropl geropl assigned easyCZ and unassigned geropl Sep 15, 2022
@geropl
Member Author

geropl commented Sep 15, 2022

Assigned to Milan after discussion here.

Repository owner moved this from In Progress to Done in 🍎 WebApp Team Sep 15, 2022
@easyCZ
Member

easyCZ commented Sep 15, 2022

Re-opening as we'll want to follow-up with other db records, as well as lower the retention period.

@easyCZ easyCZ reopened this Sep 15, 2022
Repository owner moved this from Done to In Progress in 🍎 WebApp Team Sep 15, 2022
@easyCZ
Member

easyCZ commented Sep 21, 2022

We're currently not deleting Prebuilds for the Workspaces we're purging, which leaves zombie records.

We'll need to do 2 things:

  • Start deleting the prebuild records in the GC loop
  • Retrospectively find all prebuilds which don't have a WS and delete these
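
The retrospective cleanup is essentially an anti-join: keep the prebuilds whose workspace no longer exists. A minimal in-memory sketch follows, assuming a hypothetical `PrebuildRecord` shape that references its workspace through a `buildWorkspaceId` field; in practice this would more likely be expressed as a query against the DB.

```typescript
// Hypothetical record shape; `buildWorkspaceId` is an assumed field name.
interface PrebuildRecord {
  id: string;
  buildWorkspaceId: string;
}

// Returns the prebuilds whose workspace is not among the existing workspace ids.
function findOrphanedPrebuilds(
  prebuilds: PrebuildRecord[],
  existingWorkspaceIds: Iterable<string>,
): PrebuildRecord[] {
  const known = new Set(existingWorkspaceIds);
  return prebuilds.filter((p) => !known.has(p.buildWorkspaceId));
}
```

Building the set once keeps the lookup linear in the number of prebuilds rather than quadratic, which matters on tables of the size discussed here.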

To actually start deleting prebuild records - PR, we will need the following:

@easyCZ
Member

easyCZ commented Sep 25, 2022

We've now purged all Workspaces and Workspace Instances from 2 years ago, and we've caught up with the current rate of workspaces entering this purging window - dashboard.

#13234 switched purging to after 1 year from content deletion.

We're also now purging Prebuilds and related records. However, we'll need to identify records which did not get automatically cleaned up and delete these manually as well.

@easyCZ easyCZ moved this from In Progress to In Validation in 🍎 WebApp Team Sep 26, 2022
@easyCZ easyCZ moved this from In Validation to Done in 🍎 WebApp Team Oct 4, 2022
@easyCZ easyCZ closed this as completed Oct 6, 2022
Repository owner moved this from Done to In Validation in 🍎 WebApp Team Oct 6, 2022
@easyCZ easyCZ moved this from In Validation to Done in 🍎 WebApp Team Oct 6, 2022