Support GitHub Enterprise #8574

Merged
merged 5 commits into from
Mar 9, 2022

Conversation

jankeromnes
Contributor

@jankeromnes jankeromnes commented Mar 3, 2022

Description

  • Fix raw file URL for GitHub Enterprise to allow fetching .gitpod.yml files successfully
  • Fix the Grant Access button/flow for private repositories on GitHub Enterprise
  • Allow creating Projects and running Prebuilds for GitHub Enterprise repositories
  • Drive-by: Trim clientId and clientSecret values when creating a new Integration
  • Drive-by: In dev-staging, don't truncate the callbackURL of new Integrations
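
For illustration, the raw file URL fix in the first bullet could look roughly like the sketch below. It is not the code from this Pull Request: `rawFileUrl` and `RawFileRequest` are hypothetical names, and the `https://<host>/raw/...` layout is an assumption that holds for a GitHub Enterprise server without subdomain isolation.

```typescript
// Hypothetical helper, not Gitpod's actual implementation.
interface RawFileRequest {
    host: string;  // e.g. "github.com" or a GitHub Enterprise host
    owner: string;
    repo: string;
    ref: string;   // branch, tag or commit SHA
    path: string;  // file path inside the repository, e.g. ".gitpod.yml"
}

function rawFileUrl({ host, owner, repo, ref, path }: RawFileRequest): string {
    // Encode each path segment separately so that "/" in refs and paths is preserved.
    const segments = [owner, repo, ...ref.split("/"), ...path.split("/")]
        .map(encodeURIComponent)
        .join("/");
    if (host === "github.com") {
        return `https://raw.githubusercontent.com/${segments}`;
    }
    // Assumption: GitHub Enterprise without subdomain isolation serves raw files under /raw.
    return `https://${host}/raw/${segments}`;
}
```

A server with subdomain isolation enabled would need a different host (for example a `raw.` subdomain), which this sketch does not handle.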

Related Issue(s)

Fixes #8501 #8429

How to test

  1. Have a GitHub Enterprise server account (for example, on https://ghe.gitpod-self-hosted.com )
  2. Have a public repository there (e.g. https://ghe.gitpod-self-hosted.com/roboquat/public-repo) and a private repository (e.g. https://ghe.gitpod-self-hosted.com/roboquat/top-secret)
  3. Log in to https://jx-fix-ghe-yml.staging.gitpod-dev.com/integrations (e.g. using your github.com account)
  4. Follow https://www.gitpod.io/docs/github-integration#github-enterprise-server to create a new Integration for your GitHub Enterprise server, and activate it
  5. Open your public repository from 2. in https://jx-fix-ghe-yml.staging.gitpod-dev.com/open -- it should work
  6. Open your private repository from 2. -- it should ask you to Grant Access, then also work
  7. Add your two repositories as Projects in https://jx-fix-ghe-yml.staging.gitpod-dev.com/new
  8. Once the Projects are created, a first Prebuild should run and complete successfully
  9. Push a new commit to one of your repositories -- the new commit should be detected by Gitpod, and trigger a new Prebuild

Release Notes

Support GitHub Enterprise

Documentation

@jankeromnes jankeromnes requested a review from a team March 3, 2022 15:56
@github-actions github-actions bot added the team: webapp Issue belongs to the WebApp team label Mar 3, 2022
@codecov

codecov bot commented Mar 3, 2022

Codecov Report

Merging #8574 (b78071f) into main (ede9db9) will decrease coverage by 1.13%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #8574      +/-   ##
==========================================
- Coverage   12.31%   11.17%   -1.14%     
==========================================
  Files          20       18       -2     
  Lines        1161      993     -168     
==========================================
- Hits          143      111      -32     
+ Misses       1014      880     -134     
+ Partials        4        2       -2     
Flag Coverage Δ
components-gitpod-cli-app 11.17% <ø> (ø)
components-local-app-app-darwin-amd64 ?
components-local-app-app-darwin-arm64 ?
components-local-app-app-linux-amd64 ?
components-local-app-app-linux-arm64 ?
components-local-app-app-windows-386 ?
components-local-app-app-windows-amd64 ?
components-local-app-app-windows-arm64 ?

Impacted Files Coverage Δ
components/local-app/pkg/auth/pkce.go
components/local-app/pkg/auth/auth.go

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ede9db9...b78071f. Read the comment docs.

@jankeromnes jankeromnes mentioned this pull request Mar 3, 2022
4 tasks
@jankeromnes
Contributor Author

Taking this back to draft: Should work fine as is, but needs a GitHub Enterprise integration with the preview environment to test it, and once I have that, I might as well add in a few more fixes.

@jankeromnes jankeromnes marked this pull request as draft March 4, 2022 09:46
@roboquat roboquat added size/S and removed size/XS labels Mar 4, 2022
@jankeromnes jankeromnes changed the title from "[server] Fix fetching raw files (e.g. .gitpod.yml) from GitHub Enterprise repositories" to "Support GitHub Enterprise" Mar 4, 2022
@jankeromnes
Contributor Author

Holding because of the minor TODOs above:

/hold

Contributor Author

@jankeromnes jankeromnes left a comment

(Forgot to actually submit the minor TODO comments)

@jankeromnes jankeromnes requested a review from a team March 9, 2022 12:23
@geropl
Member

geropl commented Mar 9, 2022

@jankeromnes Thank you for tackling this, and providing a working solution this fast! 💯 🙏

We discussed this earlier, just want to point it out again:

  • This PR breaks with the existing pattern we use for integrations type == 'GitHub', where we install GitHub Apps on the org/repo level. Instead, it uses the OAuth-app (user-impersonation) to install the webhook (see "Differences between GitHub Apps and OAuth Apps").
  • I'm fine with deviating for GitHub Enterprise, especially if it's temporary for the sake of progress. But then someone must own the responsibility to make this coherent again a) technically (by implementing any required changes), and b) product-wise (by re-discovering the differences between both approaches). (/cc @jldec )
  • I want to
    a) ensure we don't take up additional long-term technical debt by introducing new integration modes
    b) make sure someone (not me 🙃 ) follows up on this, because I see no value being added to our goals in this move.

If I'm missing something here - be it technical, or other - please let me know! Also, I really don't want to hinder the good progress here, just want to make sure my point is clear. 🙂

@geropl
Member

geropl commented Mar 9, 2022

On the technical side: How do we ensure we have the right scopes for:

  • registering webhooks (write:repo_hook or write:org_hook scope, ref)
  • updating Commit statuses (repo:status scope, ref)

@jankeromnes
Contributor Author

jankeromnes commented Mar 9, 2022

Thanks @geropl for the quick review! 🙏

  • This PR breaks with the existing pattern we use for integrations type == 'GitHub', where we install GitHub Apps on the org/repo level. Instead, it uses the OAuth-app (user-impersonation) to install the webhook (see "Differences between GitHub Apps and OAuth Apps").

I don't think it "breaks the pattern", because the GitHub App is a singleton in Gitpod (i.e. you can only have one GitHub App per Gitpod installation), while we already allow an arbitrary number of Integrations (of type GitHub or GitLab) which are already OAuth apps. As we've seen, there are already 13 GitHub Enterprise integrations on gitpod.io. This Pull Request makes them actually work.

Also, webhooks and the GitHub App are not mutually exclusive. As a next step to this PR, I'd love to explore what it could look like if you have a single Gitpod Self-Hosted installation, and connect its GitHub App singleton to a single GitHub Enterprise server (as opposed to connecting it to github.com as we do on SaaS). Ironing out any bugs in that process could create a nice alternative integration point specifically for Self-Hosted / GHE.

  • I want to
    a) ensure we don't take up additional long-term technical debt by introducing new integration modes
  • We already allow arbitrary OAuth App integrations for GitLab and GitHub Enterprise in Gitpod (that's how the custom Integrations in /integrations work)
  • This Pull Request just makes the existing GitHub Enterprise integrations work properly, by using the same mechanism as GitLab and Bitbucket (webhooks)
  • In the future, I would love to improve these integrations (GitLab, Bitbucket and GitHub Enterprise) to make them post visible status updates on repositories, commits and Pull Requests, so that GitLab, Bitbucket and GitHub Enterprise reach parity with our current github.com support

b) make sure someone (not me 🙃 ) follows up on this, because I see no value being added to our goals in this move.

As mentioned above, I'm happy to follow up, e.g. by testing a Gitpod Self-Hosted installation using a GitHub App connected to a GitHub Enterprise server. Again, I don't think webhooks and the GitHub App are mutually exclusive.

@jankeromnes
Contributor Author

jankeromnes commented Mar 9, 2022

On the technical side: How do we ensure we have the right scopes for:

  • registering webhooks (write:repo_hook or write:org_hook scope, ref)
  • updating Commit statuses (repo:status scope, ref)

I believe these scopes are all included in the repo (or public_repo) scopes we already request (see mentions of "webhooks" and "commit status" in available scopes).

Another relevant question is, provided we have the correct scopes, which users are allowed to perform these actions? Here it seems that only repository writers, maintainers and admins can create commit statuses, and only admins can create webhooks (see permissions for each role). That's why in GitHubRepositoryProvider.getProviderRepos, we filter for permissions.admin on the listed repositories when you want to create a Project.
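
A minimal sketch of that admin filter, assuming repository objects shaped like the ones the GitHub REST API returns when listing repositories for the authenticated user. `RepoSummary` and `adminRepos` are hypothetical names and not the actual body of `getProviderRepos`.

```typescript
// Subset of the repository fields returned by the GitHub REST API.
interface RepoSummary {
    full_name: string;
    // Only present when the request is made on behalf of a user.
    permissions?: { admin: boolean; push: boolean; pull: boolean };
}

// Keep only repositories where the user may create webhooks (admin access).
function adminRepos(repos: RepoSummary[]): RepoSummary[] {
    return repos.filter(repo => repo.permissions?.admin === true);
}
```

Repositories without a `permissions` field are dropped as well, which errs on the side of not offering a repository the user probably cannot install a webhook on.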

@jankeromnes
Contributor Author

jankeromnes commented Mar 9, 2022

For some reason Werft decided to hide the preview URL of this Pull Request, but the preview itself is still live FYI:

https://jx-fix-ghe-yml.staging.gitpod-dev.com/workspaces

components/dashboard/src/projects/NewProject.tsx Outdated Show resolved Hide resolved
app.use(bodyParser.json())
app.use(bodyParser.urlencoded({ extended: true }))
// Read bodies as JSON (but keep the raw body just in case)
app.use(bodyParser.json({ verify: (req, res, buffer) => { (req as any).rawBody = buffer; }}));
Member

which also means we carry the raw body for all requests. because the request now holds a reference to it, the buffer can't be garbage-collected for as long as the request is alive.

Contributor Author

@jankeromnes jankeromnes Mar 9, 2022

You're right, with this change we'll carry the raw body of JSON requests we receive. I'm assuming that, apart from some webhooks, we don't receive too many / very large JSON payloads, but I might be wrong here.
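
One way to limit the memory cost raised here would be to keep the raw body only for webhook routes. The sketch below is an assumption, not part of this Pull Request: `WEBHOOK_PATH_PREFIX`, `RequestLike` and `keepRawBodyForWebhooks` are hypothetical, and the function is written to fit body-parser's `verify(req, res, buf, encoding)` option.

```typescript
// Hypothetical route prefix under which webhook deliveries arrive.
const WEBHOOK_PATH_PREFIX = "/apps/ghe/";

// Minimal shape of the request fields this sketch touches.
interface RequestLike {
    url?: string;
    rawBody?: Buffer;
}

// Retain the unparsed body only where a signature check will need it.
function keepRawBodyForWebhooks(req: RequestLike, _res: unknown, buffer: Buffer): void {
    if ((req.url ?? "").startsWith(WEBHOOK_PATH_PREFIX)) {
        req.rawBody = buffer;
    }
}
```

It would be wired in as `app.use(bodyParser.json({ verify: keepRawBodyForWebhooks }))`, so all other JSON requests are parsed without keeping a second copy of their payload.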

components/server/src/github/github-repository-provider.ts (outdated, resolved)
components/server/src/github/api.ts (resolved)
components/server/ee/src/prebuilds/github-service.ts (outdated, resolved)
// Verify the webhook signature
const signature = req.header('X-Hub-Signature-256');
const body = (req as any).rawBody;
const tokenEntries = (await this.userDB.findTokensForIdentity(gitpodIdentity)).filter(tokenEntry => {
Member

how many tokenEntrys are expected to be there? it would be great to pick the current one if there are multiple.

Contributor Author

@jankeromnes jankeromnes Mar 9, 2022

As many tokenEntrys as repositories you've installed prebuilds on.

I didn't filter on the cloneUrl scope that we also add to tokens, because we don't update the cloneUrls in tokens when repositories get renamed/moved.

Member

oh! that sounds really expensive. if we get many of them, such operations will drive the event loop lag even higher.

Contributor Author

@jankeromnes jankeromnes Mar 9, 2022

How many GitHub Enterprise repositories do we expect a single user to enable prebuilds on? And can this number realistically ever get sufficiently high to make this loop's runtime cost significant?

But you're right, maybe we should add more tracing here.

EDIT: Looks like the current tracing is already sufficient.

await this.githubApi.run(user, gh => gh.repos.deleteWebhook({ owner, repo, hook_id: webhook.id }));
}
}
// TODO(janx): Also delete old tokens with scopes `GitHubService.PREBUILD_TOKEN_SCOPE` and `cloneUrl`?
Member

please, yes! unless it's already done in createGitpodToken op.

Member

isn't reusing a token another option here?
just in case, I'm thinking of the db-sync dilemma.

Contributor Author

unless it's already done in createGitpodToken op.

Ah, good thought! How nice that this is actually the case: 😄

await this.userDB.deleteTokens(identity,
// delete any tokens with the same scopes
tokenEntry => tokenEntry.token.scopes.every(s => scopes.indexOf(s) !== -1)
);

isn't reusing a token another option here?
just in case, I'm thinking of the db-sync dilemma.

Could you please explain the db-sync dilemma regarding whether to reuse a token vs creating a new one? 🤔

FYI, I followed the same pattern as GitLab and Bitbucket webhooks, i.e. generate a new token for every repo (with the scopes "prebuilds" and cloneUrl).
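
To make the deletion predicate quoted above concrete, here is a small self-contained sketch. `TokenEntry` is reduced to the fields the predicate reads, `hasOnlyScopesFrom` is a hypothetical name, and the clone URLs are made up.

```typescript
// Reduced token entry: only the fields the predicate inspects.
interface TokenEntry {
    token: { value: string; scopes: string[] };
}

// Matches entries whose scopes are all contained in the new token's scopes.
function hasOnlyScopesFrom(scopes: string[]): (entry: TokenEntry) => boolean {
    return entry => entry.token.scopes.every(s => scopes.indexOf(s) !== -1);
}

// Scopes of a new per-repository token: "prebuilds" plus the clone URL.
const newScopes = ["prebuilds", "https://ghe.example.com/roboquat/top-secret.git"];
const shouldDelete = hasOnlyScopesFrom(newScopes);

// Same repository: the old token is replaced.
console.log(shouldDelete({ token: { value: "old", scopes: [...newScopes] } }));
// Different repository: the token is kept.
console.log(shouldDelete({ token: { value: "other", scopes: ["prebuilds", "https://ghe.example.com/roboquat/public-repo.git"] } }));
```

Because `every` tests for a subset, an entry whose scopes are only `["prebuilds"]` would also match, so the predicate is slightly broader than "exactly the same scopes".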

Member

Could you please explain the db-sync dilemma regarding whether to reuse a token vs creating a new one? 🤔

If you override tokens in the DB for a particular repo, they won't be available in the other cluster until the following db-sync cycle. But that might be a general problem with self-managed webhooks.

@jankeromnes jankeromnes force-pushed the jx/fix-ghe-yml branch 3 times, most recently from bbf9a64 to 3d7e92f Compare March 9, 2022 15:15
@geropl
Member

geropl commented Mar 9, 2022

@jankeromnes Thanks for the explanation.

As we've seen, there are already 13 GitHub Enterprise integrations on gitpod.io. This Pull Request makes them actually work.

💡 It's news to me that we allow/already support these. I saw your Slack post earlier this week, but thought the entries were due to bugs.

I don't think it "breaks the pattern"

It's a bit complicated because of the two use cases:

  1. qualified, by admin, at config-time: connect 1 integration with a single, statically configured GitHub App
  2. unqualified, by user, at runtime: connect n integrations; so far we only had GitLab/Bitbucket, so used webhooks (and now for GHE as well)

My main concern was that I understood that we want to achieve 1., but this PR uses the techniques we employ for 2. This is what I meant by "breaking the pattern". Instead of focusing on the problem at hand (having a customer configure their own GitHub App for their own installation) we seem to be fixing the GHE integration in general. Which is also an interesting goal, but feels a bit out of scope here.

Also, webhooks and the GitHub App are not mutually exclusive

Not at runtime, yes, but I'd really like us to see it this way, because it's an additional maintenance burden, and now we have to schedule the task of cleaning it up (e.g., deciding for one way or the other).

As a next step to this PR, I'd love to explore what it could look like if you have a single Gitpod Self-Hosted installation, and connect its GitHub App singleton to a single GitHub Enterprise server (as opposed to connecting it to github.com as we do on SaaS)

Awesome, really looking forward to this. 💪

@jankeromnes
Contributor Author

jankeromnes commented Mar 9, 2022

Many thanks @andrew-farries, @geropl and @AlexTugarev for your super helpful reviews! 🙏

I think I've addressed all nits, so this now seems ready to be merged. 🚢

/unhold

const sig = 'sha256=' + createHmac('sha256', user.id + '|' + tokenEntry.token.value)
.update(body)
.digest('hex');
return sig === signature;
Contributor

This is probably pedantry at this stage, but comparing the HMAC signatures using a simple string comparison is considered insecure because of the potential for timing attacks.

See the github webhook docs. Seems the best way to do it is with crypto.timingSafeEqual as suggested here.
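
A minimal sketch of the suggested constant-time comparison, using Node's `crypto.timingSafeEqual`. The function names and the list of candidate secrets are assumptions for illustration; in the snippet above, each secret is `user.id + '|' + tokenEntry.token.value`.

```typescript
import { createHmac, timingSafeEqual } from "crypto";

// Compare two signature strings without leaking where they first differ.
function signaturesMatch(expected: string, received: string | undefined): boolean {
    if (!received) {
        return false;
    }
    const a = Buffer.from(expected, "utf8");
    const b = Buffer.from(received, "utf8");
    // timingSafeEqual throws if the lengths differ, so check that first.
    return a.length === b.length && timingSafeEqual(a, b);
}

// Accept the webhook if any candidate secret produces the received signature.
function verifyWebhook(body: Buffer | string, signature: string | undefined, secrets: string[]): boolean {
    return secrets.some(secret => {
        const sig = "sha256=" + createHmac("sha256", secret).update(body).digest("hex");
        return signaturesMatch(sig, signature);
    });
}
```

The length check itself is not constant-time, but the length of a hex-encoded SHA-256 signature is public, so it reveals nothing about the secret.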

Labels
deployed: webapp Meta team change is running in production deployed Change is completely running in production release-note size/XL team: webapp Issue belongs to the WebApp team
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Epic: Support GitHub Enterprise
5 participants