
[server] use owner and repo name for workspace id #7391


Merged: 1 commit, Jan 7, 2022

Conversation

svenefftinge
Member

@svenefftinge svenefftinge commented Dec 30, 2021

This change introduces optional arguments in generateWorkspaceId
for the first two segments, and makes use of them in the workspace factory,
using the repo's org/group and name.

fixes #4129

Testing

Start a workspace on GitHub and GitLab repos (try with nested groups on GitLab) and find the repo names and orgs reflected in the workspace id.

Release Notes

Use repository org and name for workspace ids.

@svenefftinge
Member Author

svenefftinge commented Dec 30, 2021

/werft run

👍 started the job as gitpod-build-se-workspaceid.2

@svenefftinge svenefftinge marked this pull request as ready for review December 30, 2021 12:29
@svenefftinge
Member Author

svenefftinge commented Dec 30, 2021

/werft run

👍 started the job as gitpod-build-se-workspaceid.3

@svenefftinge
Member Author

svenefftinge commented Dec 30, 2021

/werft run

👍 started the job as gitpod-build-se-workspaceid.4

@svenefftinge
Member Author

svenefftinge commented Dec 30, 2021

/werft run

👍 started the job as gitpod-build-se-workspaceid.5

@csweichel
Contributor

The change makes sense - it also considerably increases the chances for workspace ID collisions. What happens when such a collision occurs? I.e. how does Gitpod behave in that case?

@geropl
Member

geropl commented Jan 4, 2022

I agree with the collision argument. If we increase the likelihood of collisions in the first two segments, we should increase the overall variance.

Maybe by making the 3rd argument longer?

back-of-the-envelope calculation:

  • animal (378) x color (64) = 20412
  • if we assume this to be "1" (for special cases, e.g., big repos), we need to replace all of this variance with a longer 3rd segment
  • 20412 ÷ (36^2) = 15.75, 20412 ÷ (36^3) = 0.4375
  • => making the 3rd segment 3 characters longer would be at least equivalent
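
The estimate above can be checked with a few lines of TypeScript. This is only a sketch: it takes the figure of 20412 animal/color combinations from the list as given, assumes random characters are drawn from 36 symbols, and `extraCharsNeeded` is a hypothetical helper name.

```typescript
// Number of animal/color combinations, taken from the estimate above.
const namedCombinations = 20412;
// Assumption: each random character is drawn from 36 symbols (0-9, a-z).
const alphabetSize = 36;

// Smallest number of extra random characters whose variance
// is at least as large as `combinations`.
function extraCharsNeeded(combinations: number, base: number): number {
    let chars = 0;
    let variance = 1;
    while (variance < combinations) {
        variance *= base;
        chars += 1;
    }
    return chars;
}

// Ratio above 1: two extra characters do not cover the lost variance.
const ratioTwoChars = namedCombinations / Math.pow(alphabetSize, 2);
// Ratio below 1: three extra characters more than cover it.
const ratioThreeChars = namedCombinations / Math.pow(alphabetSize, 3);
const needed = extraCharsNeeded(namedCombinations, alphabetSize);

console.log(ratioTwoChars, ratioThreeChars, needed);
```

A ratio below 1 means the added characters supply more variance than the animal/color choice did.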

@csweichel Does this sound about right? 🤔

Another thought: Did we re-visit the "id vs. title" debate again? Changing the workspaceId is always a tradeoff (see above). The question is: Why do people look at the URL in the first place? If we made the description field more prominent (also within the workspace; ideally even before I have to switch tab!) that might become irrelevant. (found this has been mentioned by @loujaybee here)

@svenefftinge
Member Author

I think we only need to handle a collision gracefully. I.e. catch an internal error and generate a new ID.
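
A minimal sketch of what "catch an internal error and generate a new ID" could look like. Everything here is hypothetical: the in-memory store stands in for the database, the `DuplicateIdError` name tag is made up, and a real implementation would be asynchronous.

```typescript
// Hypothetical error factory; the name tag marks "ID already exists" failures.
function duplicateIdError(id: string): Error {
    const err = new Error("workspace id already in use: " + id);
    err.name = "DuplicateIdError";
    return err;
}

// Hypothetical in-memory store standing in for the workspace database.
class InMemoryWorkspaceStore {
    private readonly ids: { [id: string]: boolean } = {};

    insert(id: string): void {
        if (Object.prototype.hasOwnProperty.call(this.ids, id)) {
            throw duplicateIdError(id);
        }
        this.ids[id] = true;
    }
}

// Insert a freshly generated ID; on a collision, generate a new one and retry.
function createWithRetry(
    store: InMemoryWorkspaceStore,
    generateId: () => string,
    maxAttempts: number = 3,
): string {
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        const id = generateId();
        try {
            store.insert(id);
            return id;
        } catch (err) {
            // Only swallow collisions; anything else is a real failure.
            if (!(err instanceof Error) || err.name !== "DuplicateIdError") {
                throw err;
            }
        }
    }
    throw new Error("no free workspace id after " + maxAttempts + " attempts");
}

// Usage: the first candidate is already taken, so the second one is returned.
const demoStore = new InMemoryWorkspaceStore();
demoStore.insert("gitpodio-gitpod-aaaaaaaa");
const candidates = ["gitpodio-gitpod-aaaaaaaa", "gitpodio-gitpod-bbbbbbbb"];
let nextCandidate = 0;
const createdId = createWithRetry(demoStore, () => candidates[nextCandidate++]);
```

The retry only helps when the store can detect the duplicate at insert time.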

}

function clean(segment: string | undefined, maxChars: number = 16) {
if (segment) {
Member

Do I read this correctly as:

if (!segment) return undefined;

const result = Array.from(segment)
  .filter(c => characters.indexOf(c) !== -1)
  .join('');
if (result.length >= 2) {
   return result.substring(0, maxChars);
}
return result;

🤔

Member Author

Ok, I rewrote it as suggested.
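
For reference, a self-contained sketch of an early-exit `clean` along these lines. The `allowedCharacters` constant is an assumption. Unlike the snippet above, this variant lowercases the input and returns undefined when fewer than two characters survive; both are assumptions, chosen so the result always fits the `[0-9a-z]{2,16}` segment pattern used in the host regex.

```typescript
// Assumption: workspace ID segments may only contain these characters.
const allowedCharacters = "abcdefghijklmnopqrstuvwxyz0123456789";

// Strip disallowed characters and cap the length.
// Returns undefined when the input is empty or too little of it survives.
function clean(segment: string | undefined, maxChars: number = 16): string | undefined {
    if (!segment) {
        return undefined;
    }
    const result = segment
        .toLowerCase()
        .split("")
        .filter(c => allowedCharacters.indexOf(c) !== -1)
        .join("");
    if (result.length < 2) {
        return undefined;
    }
    return result.substring(0, maxChars);
}

const cleanedOrg = clean("gitpod-io");             // "gitpodio"
const cleanedLong = clean("some-really-long-org"); // "somereallylongor"
const cleanedEmpty = clean("_");                   // undefined
```

Returning undefined lets the caller fall back to a random segment with `||`.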

@svenefftinge
Member Author

Another thought: Did we re-visit the "id vs. title" debate again? Changing the workspaceId is always a tradeoff (see above). The question is: Why do people look at the URL in the first place? If we made the description field more prominent (also within the workspace; ideally even before I have to switch tab!) that might become irrelevant. (found this has been mentioned by @loujaybee here)

I feel we got stuck in discussion for way too long on this. It's definitely not a solution to all related problems but an improvement over generic color-animal ids. There are other things that can be done, but I don't see why this should stop this small improvement.

@csweichel
Contributor

csweichel commented Jan 4, 2022

I think we only need to handle a collision gracefully. I.e. catch an internal error and generate a new ID.

Totally agree. Unfortunately the cross-cluster/db-sync setup makes that difficult. Especially for "high-traffic" repos it's more likely that they're used in different regions where we cannot ensure that an ID isn't used twice (except if we wait for the db-sync interval during workspace ID generation - we definitely don't want that).

Gero's observation is on point: with just a tad more entropy we can achieve a collision likelihood similar to what we have now. We don't even need to extend the last segment; we could just pad/replace three characters in the second segment (if we fret about modifying the regexp yet again).

Example workspace IDs would be:

gitpodio-gitpod3js-as8sjdfs
gitpodio-gitpodabc-8asfsde
kubernetes-kubernetesksd-3fads53a
somereallylongor-somereallylon384-2asfr4bc

Looking at the collision probabilities:

  • current color/animal choices: 1 / (len(animal) * len(color) * 36**8) == 1.73e-17
  • repo-based (removes the entire animal/color choice): 1 / (36**8) == 3.54e-13. This is four orders of magnitude worse
  • padded (i.e. add three more random characters): 1 / (36**11) == 7.59e-18. This is even better than what we have today
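
These figures can be reproduced with a short sketch. It assumes 20412 animal/color combinations (the figure quoted earlier in the thread) and 36 possible values per random character; the constant names are made up for illustration.

```typescript
// Animal/color combinations, as quoted earlier in the thread.
const namedChoices = 20412;
// Assumption: random characters are drawn from 0-9 and a-z.
const symbolCount = 36;

// Chance that two independently generated IDs are identical.
const pCurrent = 1 / (namedChoices * Math.pow(symbolCount, 8)); // roughly 1.7e-17
const pRepoBased = 1 / Math.pow(symbolCount, 8);                // roughly 3.5e-13
const pPadded = 1 / Math.pow(symbolCount, 11);                  // roughly 7.6e-18

// How much worse or better each variant is compared to the current scheme.
const repoBasedPenalty = pRepoBased / pCurrent; // about 20412, i.e. four orders of magnitude
const paddedGain = pCurrent / pPadded;          // a bit over 2

console.log(pCurrent, pRepoBased, pPadded, repoBasedPenalty, paddedGain);
```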

Implementation is simple:

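export async function generateWorkspaceID(firstSegment?: string, secondSegment?: string): Promise<string> {
    // Fall back to a random color when no usable first segment is given.
    const firstSeg = clean(firstSegment) || await random(colors);
    // clean() may return undefined, so check its result before calling substring().
    const cleanedSecond = clean(secondSegment, Math.min(16, 24 - firstSeg.length));
    let secondSeg: string;
    if (cleanedSecond) {
        // Keep at most 13 characters and pad with three random ones (16 in total).
        secondSeg = cleanedSecond.substring(0, 13) + (await random(characters, 3));
    } else {
        secondSeg = await random(animals);
    }
    return firstSeg + '-' + secondSeg + '-' + (await random(characters, 8));
}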

If we're concerned about mixing random characters into the second segment, just using numbers would also work and collide with a chance of 1 / (36**8 * 10**3) == 3.544e-16, costing us one order of magnitude.

@svenefftinge
Member Author

Totally agree. Unfortunately the cross-cluster/db-sync setup makes that difficult. Especially for "high-traffic" repos it's more likely that they're used in different regions where we cannot ensure that an ID isn't used twice (except if we wait for the db-sync interval during workspace ID generation - we definitely don't want that).

Isn't it super unlikely that an id conflict happens within 10 mins between US and EU? Or am I missing something?
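
One rough way to put a number on this is the birthday approximation, sketched below. The number of workspace starts per sync window is a made-up figure, not a measurement, and the approximation only holds while the resulting probability is small.

```typescript
// Approximate probability that at least two of `n` IDs collide when each is
// drawn uniformly from `space` possibilities (birthday approximation,
// valid while the result is much smaller than 1).
function collisionProbability(n: number, space: number): number {
    return (n * (n - 1)) / (2 * space);
}

// Hypothetical load: workspaces started for one repo within one db-sync window.
const startsPerWindow = 1000;

const repoBasedSpace = Math.pow(36, 8); // random third segment of 8 characters
const paddedSpace = Math.pow(36, 11);   // random third segment of 11 characters

const pWindowRepoBased = collisionProbability(startsPerWindow, repoBasedSpace);
const pWindowPadded = collisionProbability(startsPerWindow, paddedSpace);

console.log(pWindowRepoBased, pWindowPadded);
```

Under these assumptions the three extra characters lower the per-window collision chance by a factor of 36^3.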

@svenefftinge
Member Author

I'd prefer adding additional characters to the last segment instead of changing the human-readable first two.

@roboquat roboquat added team: IDE team: workspace Issue belongs to the Workspace team labels Jan 4, 2022
@svenefftinge
Member Author

I have added the three characters to the third segment as suggested. I decided not to add a collision check: it would have to live in the generateWorkspaceId function, since we don't have an explicit insert call that could fail (though maybe that is not too hard to do with TypeORM?). Given how unlikely a collision is, I don't think an extra DB request on every workspace start is worth it.
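
A self-contained sketch of the resulting ID shape: org and repo name in the first two segments, 11 random characters in the third. The word lists, helper names, and the injectable `RandomInt` source are all assumptions for illustration, not the actual implementation, and `Math.random` stands in for whatever random source the server uses.

```typescript
const idCharacters = "abcdefghijklmnopqrstuvwxyz0123456789";
// Placeholder word lists; the real lists are much longer.
const sampleColors = ["amber", "blue", "coral"];
const sampleAnimals = ["ant", "bear", "crane"];

// Returns an integer in [0, max). Swappable so tests can be deterministic.
type RandomInt = (max: number) => number;
const defaultRandomInt: RandomInt = max => Math.floor(Math.random() * max);

function randomChars(length: number, randomInt: RandomInt): string {
    let out = "";
    for (let i = 0; i < length; i++) {
        out += idCharacters.charAt(randomInt(idCharacters.length));
    }
    return out;
}

// Lowercase, drop disallowed characters, cap at 16; undefined if fewer than 2 remain.
function cleanSegment(segment: string | undefined): string | undefined {
    if (!segment) {
        return undefined;
    }
    const cleaned = segment
        .toLowerCase()
        .split("")
        .filter(c => idCharacters.indexOf(c) !== -1)
        .join("");
    return cleaned.length >= 2 ? cleaned.substring(0, 16) : undefined;
}

function sketchWorkspaceId(org?: string, repo?: string, randomInt: RandomInt = defaultRandomInt): string {
    const first = cleanSegment(org) || sampleColors[randomInt(sampleColors.length)];
    const second = cleanSegment(repo) || sampleAnimals[randomInt(sampleAnimals.length)];
    // Third segment: the original 8 random characters plus the 3 added ones.
    return first + "-" + second + "-" + randomChars(11, randomInt);
}

// Segment pattern with the third segment widened to 8-11 characters.
const idPattern = /^[0-9a-z]{2,16}-[0-9a-z]{2,16}-[0-9a-z]{8,11}$/;

// With a fixed random source the output is deterministic.
const repoBasedId = sketchWorkspaceId("gitpod-io", "gitpod", () => 0); // "gitpodio-gitpod-aaaaaaaaaaa"
const fallbackId = sketchWorkspaceId(undefined, undefined, () => 0);   // "amber-ant-aaaaaaaaaaa"
```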

@svenefftinge
Member Author

svenefftinge commented Jan 4, 2022

/werft run with-clean-slate

👍 started the job as gitpod-build-se-workspaceid.7

@csweichel
Contributor

/lgtm

@roboquat
Contributor

roboquat commented Jan 5, 2022

LGTM label has been added.

Git tree hash: 479779e9058424fa4b1bb413af60675b6a676647

@svenefftinge
Member Author

svenefftinge commented Jan 5, 2022

/werft run with-clean-slate

👍 started the job as gitpod-build-se-workspaceid.8

@codecov

codecov bot commented Jan 5, 2022

Codecov Report

Merging #7391 (e6c234c) into main (ab20d1b) will increase coverage by 16.26%.
The diff coverage is n/a.

❗ Current head e6c234c differs from pull request most recent head 51b9866. Consider uploading reports for the commit 51b9866 to get more accurate results

@@             Coverage Diff             @@
##             main    #7391       +/-   ##
===========================================
+ Coverage   19.04%   35.31%   +16.26%     
===========================================
  Files           2       23       +21     
  Lines         168     2455     +2287     
===========================================
+ Hits           32      867      +835     
- Misses        134     1535     +1401     
- Partials        2       53       +51     
Flag Coverage Δ
components-local-app-app-darwin-amd64 ∅ <ø> (?)
components-local-app-app-darwin-arm64 ∅ <ø> (?)
components-local-app-app-linux-amd64 19.04% <ø> (ø)
components-local-app-app-linux-arm64 ∅ <ø> (∅)
components-local-app-app-windows-386 ∅ <ø> (∅)
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?
components-ws-proxy-app 68.26% <ø> (?)
installer-raw-app 5.76% <ø> (?)


Impacted Files Coverage Δ
components/ws-proxy/pkg/proxy/workspacerouter.go 81.57% <ø> (ø)
installer/pkg/components/ws-manager/deployment.go 0.00% <0.00%> (ø)
components/ws-proxy/pkg/proxy/proxy.go 23.61% <0.00%> (ø)
installer/pkg/components/ws-manager/configmap.go 29.71% <0.00%> (ø)
components/ws-proxy/pkg/proxy/routes.go 82.99% <0.00%> (ø)
installer/pkg/common/display.go 0.00% <0.00%> (ø)
installer/pkg/components/ws-manager/rolebinding.go 0.00% <0.00%> (ø)
installer/pkg/common/common.go 4.64% <0.00%> (ø)
installer/pkg/common/objects.go 0.00% <0.00%> (ø)
components/ws-proxy/pkg/proxy/cookies.go 78.57% <0.00%> (ø)
... and 12 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ab20d1b...51b9866.

@JanKoehnlein
Contributor

Corrected another instance of the regex pattern.

@svenefftinge
Member Author

Corrected another instance of the regex pattern.

Thank you! Did you also do testing and review?

@svenefftinge svenefftinge requested a review from akosyakov January 5, 2022 14:26
@svenefftinge
Member Author

@csweichel the force pushing removed your LGTM

@@ -162,7 +162,7 @@ func run(origin, sshConfig string, apiPort int, allowCORSFromPort bool, autoTunn
 return err
 }
 wsHostRegex := "(\\.[^.]+)\\." + strings.ReplaceAll(originURL.Host, ".", "\\.")
-wsHostRegex = "([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}|[0-9a-z]{2,16}-[0-9a-z]{2,16}-[0-9a-z]{8})" + wsHostRegex
+wsHostRegex = "([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}|[0-9a-z]{2,16}-[0-9a-z]{2,16}-[0-9a-z]{8,11})" + wsHostRegex
Member

Does it mean that existing users of VS Code desktop and local companion should upgrade now? Or is it backward compatible?

Member Author

It is not compatible. If you start a new workspace while an old local companion app is still running, this logic will not match.
How does the upgrade process for the local companion app handle other incompatible changes, or do we never have them?
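
The incompatibility can be illustrated by transcribing the two patterns from the diff into TypeScript. The origin host `gitpod.io`, the `ws-eu` cluster label, and the example workspace hosts are assumptions for illustration.

```typescript
// Assumed origin host; the dots are escaped as in the Go snippet.
const originHost = "gitpod.io";
const hostSuffix = "(\\.[^.]+)\\." + originHost.split(".").join("\\.");

const uuidPart = "[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}";
// Pattern before the change: third segment is exactly 8 characters.
const oldHostPattern = new RegExp(
    "(" + uuidPart + "|[0-9a-z]{2,16}-[0-9a-z]{2,16}-[0-9a-z]{8})" + hostSuffix,
);
// Pattern after the change: third segment may be 8 to 11 characters.
const newHostPattern = new RegExp(
    "(" + uuidPart + "|[0-9a-z]{2,16}-[0-9a-z]{2,16}-[0-9a-z]{8,11})" + hostSuffix,
);

// Hypothetical workspace hosts with an 8- and an 11-character third segment.
const oldStyleHost = "amber-ant-abcd1234.ws-eu.gitpod.io";
const newStyleHost = "gitpodio-gitpod-abcd1234xyz.ws-eu.gitpod.io";

const oldAppOldId = oldHostPattern.test(oldStyleHost); // true
const oldAppNewId = oldHostPattern.test(newStyleHost); // false: old app cannot match new IDs
const newAppOldId = newHostPattern.test(oldStyleHost); // true: new app still matches old IDs
const newAppNewId = newHostPattern.test(newStyleHost); // true
```

So an upgraded app keeps working with existing workspaces; only an old app paired with a new workspace fails.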

Member

We have not made incompatible changes so far. Given that it is in Beta, maybe it is fine to break. For VS Code Desktop we can easily switch away from the local companion as soon as the SSH gateway is deployed.

@JanKoehnlein
Contributor

Tested and LGTM, but @geropl had a comment on the clean function which I didn't quite understand but which got lost in the force push.

@svenefftinge
Member Author

Tested and LGTM, but @geropl had a comment on the clean function which I didn't quite understand but which got lost in the force push.

He was asking me to rewrite the function to do an early exit. I did that already.

@JanKoehnlein
Contributor

/lgtm

@roboquat
Contributor

roboquat commented Jan 6, 2022

LGTM label has been added.

Git tree hash: f183a59c825a94d23420f693d764f10876c5c67b

@akosyakov
Member

/lgtm

@gitpod-io/engineering-ide if someone reports that local port forwarding no longer works, ask them to upgrade to the latest local app

@roboquat
Contributor

roboquat commented Jan 7, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: akosyakov, csweichel, JanKoehnlein

Associated issue: #4129


@roboquat roboquat merged commit f8086b9 into main Jan 7, 2022
@roboquat roboquat deleted the se/workspaceid branch January 7, 2022 09:33
@roboquat roboquat added deployed: webapp Meta team change is running in production deployed: workspace Workspace team change is running in production labels Jan 11, 2022
Labels
approved, deployed: webapp, deployed: workspace, release-note, size/M, team: IDE, team: webapp, team: workspace
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please use a useful workspace name instead of a nonsensical one
6 participants