Skip to content

ws-manager-mk2 - or what if we built it today? #9596

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Jan 23, 2023
Merged

Conversation

csweichel
Copy link
Contributor

@csweichel csweichel commented Apr 27, 2022

Description

This PR is the initial work for getting ws-manager-mk2 off the ground. It can start workspaces and do initialize the content. Communication between ws-manager, ws-daemon and ws-proxy is done through a workspace CRD. This is not a production grade implementation yet. For milestones see the milestone plan. Features not yet working will be added in subsequent PRs.

PoC: ws-manager-mk2 - Watch Video

When we built ws-manager more then three years ago, kubebuilder wasn't a thing and writing Kubernetes controllers was hard(er). Also, we didn't know too much about Kubernetes at the time.

This PR shows how ws-manager would look like if it were written as a Kubernetes controller, CRDs and all. The main takeway: it terrifyingly simplifies things. The code is orders of magnitudes easier to read and maintain compared to what we have today.

If we just look at the lines of code (a weak, but easily quantifiable proxy for complexity):

# ws-manager-mk2: 7906 SLOC
$ find ws-manager/pkg/manager -name '*.go' | xargs cat | wc -l
7906

# ws-manager-mk2: 2625 SLOC
$ find ws-manager-mk2/controllers/ -name '*.go' | xargs cat | wc -l
1773
$ find ws-manager-mk2/service/ -name '*.go' | xargs cat | wc -l
852

The version in this branch is nearly happy-path feature complete:

  • it is able to start and stop workspaces
  • supports headless workspaces, e.g. image builds and maybe prebuilds
  • it talks to ws-daemon to do content init and disposal
  • it can handle ports,
  • it acts as imagespec provider for registry-facade,
  • it's a drop-in replacement for ws-manager; uses the exact same config an provides the very same API.

I've also extended ws-proxy to use the workspace CRs as workspace info source.

There's a bunch of things missing in this branch:

  • tests of any kind really
  • better non-happy path handling
  • did I mention tests?
  • it's not integrated into the installer yet

Ok, but why?

Other than the considerable reduction in complexity this approach gets us in ws-manager,

  • it simplifies the workspace side as a whole, because we can build the in-cluster interactions using CRs rather than gRPC. Much like ws-proxy functions today, ws-daemon and image-builder could interact on the custom resource rather than speak gRPC. This greatly simplifies state management.
  • it lets us implement functionality that's independent of the pod's lifecycle, e.g. image builds become much easier to do.
  • it lets us implement workspaces which are not pods. Today, all workspaces must be a pod. If we ever want to move to other means of executing workspaces, we'd have a ton of work ahead (think workspaces as harvester VMs? :D - not an actual suggestion).

Do you not have better things to do?

This was primarily an evening/weekend project - less than 20h have gone into this.

How to review

  • Check for adverse side effects. This PR should not affect the function of Gitpod in any way. It's merely the foundation for things to come.
  • Workspaces don't actually start yet using mk2. This is expected. We need to get this changeset off a feature branch though.

Release Notes

NONE

Relates with #11416

/werft with-vm=true
/werft with-preview=true
/werft with-wsman-mk2=true

@csweichel
Copy link
Contributor Author

csweichel commented Apr 28, 2022

/werft run

👍 started the job as gitpod-build-cw-ws-manager-mk2.41
(with .werft/ from main)

@csweichel csweichel force-pushed the cw/ws-manager-mk2 branch 5 times, most recently from 9d728da to 068908b Compare May 5, 2022 12:39
@csweichel csweichel force-pushed the cw/ws-manager-mk2 branch 2 times, most recently from 0f56fb6 to 3c0d700 Compare May 10, 2022 18:14
@csweichel csweichel force-pushed the cw/ws-manager-mk2 branch 5 times, most recently from 853e298 to c701b07 Compare May 20, 2022 07:44
@stale
Copy link

stale bot commented May 30, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label May 30, 2022
@csweichel csweichel force-pushed the cw/ws-manager-mk2 branch from c701b07 to 72c7ed8 Compare June 1, 2022 17:32
@stale stale bot removed the meta: stale This issue/PR is stale and will be closed soon label Jun 1, 2022
@csweichel csweichel force-pushed the cw/ws-manager-mk2 branch 4 times, most recently from 5b22eb4 to 7b63891 Compare June 2, 2022 09:18
@csweichel
Copy link
Contributor Author

csweichel commented Jun 2, 2022

/werft run

👍 started the job as gitpod-build-cw-ws-manager-mk2.63
(with .werft/ from main)

@csweichel csweichel force-pushed the cw/ws-manager-mk2 branch 3 times, most recently from 6141beb to b13d9e2 Compare June 2, 2022 12:17
@stale
Copy link

stale bot commented Jun 12, 2022

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the meta: stale This issue/PR is stale and will be closed soon label Jun 12, 2022
@@ -10,4 +10,4 @@ packages:
- "go.sum"
config:
packaging: library
dontTest: false
dontTest: true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking, question

@Furisto why disable? Just curious.

Copy link
Member

@Furisto Furisto Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah was not aware of that but it makes sense. Currently the test we have (for the webhook) fails. Webhooks will be removed with the next PR which means the test will be removed as well.

Copy link
Member

@aledbf aledbf Jan 23, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Webhooks will be removed with the next PR which means the test will be removed as well.

Do we have a written reason why this is being done? What's the replacement?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a suggestion from @csweichel. Currently there is just nothing that we would want to validate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's talk about that in the next sync. One of the goals of the webhook is to avoid persistence with invalid values and wasting sync loops in the controller.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Furisto was the thought to use CEL, instead of webhooks to do validation? This article suggests we cannot use CEL until Kubernetes 1.25 (we're currently on 1.23, and cannot move to 1.24 yet).

Copy link
Contributor

@kylos101 kylos101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Furisto
Copy link
Member

Furisto commented Jan 23, 2023

/unhold

@roboquat roboquat merged commit 445b834 into main Jan 23, 2023
@roboquat roboquat deleted the cw/ws-manager-mk2 branch January 23, 2023 13:14
@roboquat roboquat added the deployed: IDE IDE change is running in production label Jan 23, 2023
@kylos101 kylos101 mentioned this pull request Jan 23, 2023
4 tasks
@roboquat roboquat added the deployed: webapp Meta team change is running in production label Jan 24, 2023
@roboquat roboquat added the deployed: workspace Workspace team change is running in production label Jan 31, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployed: IDE IDE change is running in production deployed: webapp Meta team change is running in production deployed: workspace Workspace team change is running in production feature: ws-manager-mk2 meta: never-stale This issue can never become stale release-note-none size/XXL team: delivery Issue belongs to the self-hosted team team: devx team: IDE team: SID team: staff-engineers team: webapp Issue belongs to the WebApp team team: workspace Issue belongs to the Workspace team
Projects
No open projects
Status: Done
Development

Successfully merging this pull request may close these issues.

10 participants