[prebuilds] no prebuilds for inactive repos #9936

Merged
roboquat merged 1 commit into main from se/no-prebuilds
May 11, 2022

Conversation

svenefftinge
Member

@svenefftinge svenefftinge commented May 11, 2022

Description

This change prevents prebuilds for repositories that have been inactive (no regular workspace starts) for at least a week.
This would have prevented 98835 unnecessary prebuilds in the last 11 days (vs. 28174 prebuilds on repos without projects that are still active).
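
A minimal sketch of the rule described above, assuming "inactive" means that no regular workspace was created for the clone URL within the last seven days. The `WorkspaceRecord` type and both function signatures are simplified stand-ins for illustration; the PR itself runs a database query rather than filtering an array, and only applies the check when no project exists for the repository.

```typescript
// Hypothetical, simplified record; a real d_b_workspace row has many more fields.
interface WorkspaceRecord {
    cloneUrl: string;
    type: "regular" | "prebuild";
    creationTime: Date;
}

const ONE_WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Counts workspaces of the given type for one repository created after `since`.
function countRecentWorkspaces(
    workspaces: WorkspaceRecord[],
    cloneUrl: string,
    since: Date,
    type: WorkspaceRecord["type"],
): number {
    return workspaces.filter(
        (ws) =>
            ws.cloneUrl === cloneUrl &&
            ws.type === type &&
            ws.creationTime.getTime() > since.getTime(),
    ).length;
}

// A repository counts as inactive when no regular workspace was started in the last week.
function shouldSkipInactiveRepository(
    workspaces: WorkspaceRecord[],
    cloneUrl: string,
    now: Date = new Date(),
): boolean {
    const since = new Date(now.getTime() - ONE_WEEK_MS);
    return countRecentWorkspaces(workspaces, cloneUrl, since, "regular") === 0;
}
```

Prebuild-type workspaces are deliberately ignored in this sketch, so a repository that only ever triggers prebuilds still counts as inactive.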

Related Issue(s)

See also my comment here #9898 (comment)

How to test

Release Notes

NONE

Comment on lines +367 to +372
return workspaceRepo
    .createQueryBuilder("ws")
    .where('context->"$.repository.cloneUrl" = :cloneURL', { cloneURL })
    .andWhere("creationTime > :since", { since: since.toISOString() })
    .andWhere("type = :type", { type })
    .getCount();
Contributor

Wow, many thanks for this clever fix! 👀

My only worry here is, how do we know that this query on all d_b_workspace entries won't be too slow/expensive? (Especially since we'll run it on every received webhook, which seems very frequent.)

Member Author

That is a good point. All I did was run the query below against the prod failover DB:

SELECT count(*) from d_b_workspace where type='regular' and creationTime > '2022-05-04' and context->"$.repository.cloneUrl" = 'https://github.com/gitpod-io/gitpod.git'

Note that it only runs for those webhooks where we don't find a project, but it could still be too much. Any ideas on how to move forward?

Member Author

@svenefftinge svenefftinge May 11, 2022

This is what EXPLAIN has to say:
[Screenshot of the EXPLAIN output, 2022-05-11 17:46:55]

So there's a key on creationTime which narrows the query to all workspaces of the past seven days.
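
To make the EXPLAIN observation concrete, the sketch below imitates what a key on creationTime provides: a binary search finds the first row inside the time window, but every row in that window still has to be checked against the clone URL. This is an in-memory analogy with hypothetical names (`Row`, `firstIndexAfter`, `countViaTimeIndex`), not a description of how MySQL executes the query.

```typescript
interface Row {
    creationTime: number; // epoch milliseconds
    cloneUrl: string;
}

// Returns the index of the first row with creationTime > since.
// `rows` must be sorted ascending by creationTime; the sorted order plays the role of the key.
function firstIndexAfter(rows: Row[], since: number): number {
    let lo = 0;
    let hi = rows.length;
    while (lo < hi) {
        const mid = (lo + hi) >>> 1;
        if (rows[mid].creationTime > since) {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
    return lo;
}

// Counts rows for one clone URL inside the window and reports how many rows were examined.
function countViaTimeIndex(
    rows: Row[],
    since: number,
    cloneUrl: string,
): { count: number; examined: number } {
    const start = firstIndexAfter(rows, since);
    let count = 0;
    for (let i = start; i < rows.length; i++) {
        if (rows[i].cloneUrl === cloneUrl) count++;
    }
    return { count, examined: rows.length - start };
}
```

In this model, `examined` depends on how many workspaces were created in the window overall, not on how many belong to the repository in question.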

Contributor

@jankeromnes jankeromnes May 11, 2022

Aha, that totally addresses the performance concern then, right? (Or, do we need to somehow test/qualify this further?)

Member

@svenefftinge, the number of rows sent is also a perf issue sometimes.

If we're just checking for count === 0 down below, we could start out with a limited version.
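
One possible reading of "a limited version": since only `count === 0` matters, the lookup can stop at the first matching row instead of counting all of them. The sketch shows the idea in memory, with roughly corresponding SQL shapes in the comments. The function names are hypothetical, and whether the limited form is actually cheaper in MySQL would need to be measured.

```typescript
type Predicate<T> = (row: T) => boolean;

// Analogous to: SELECT count(*) FROM d_b_workspace WHERE <conditions>
// Every candidate row is evaluated, even if the caller only compares the result to 0.
function countMatches<T>(rows: readonly T[], matches: Predicate<T>): number {
    let count = 0;
    for (const row of rows) {
        if (matches(row)) count++;
    }
    return count;
}

// Analogous to: SELECT 1 FROM d_b_workspace WHERE <conditions> LIMIT 1
// Evaluation stops at the first matching row.
function hasAnyMatch<T>(rows: readonly T[], matches: Predicate<T>): boolean {
    for (const row of rows) {
        if (matches(row)) return true;
    }
    return false;
}
```

The early exit only helps when a match exists. For a repository with no recent workspaces, both forms still evaluate every candidate row.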

Member Author

I would expect typeorm to translate to count(*) when using getCount. Doesn't it do that?

@jankeromnes jankeromnes self-assigned this May 11, 2022
@jankeromnes
Contributor

Briefly testing this, code looks good, so will approve if it works in preview.

const span = TraceContext.startSpan("shouldSkipInactiveRepository", ctx);
try {
    return (
        (await this.workspaceDB
Member

Q: Would it make sense to add some caching here? Just in-memory, time based eviction – let's say 1d.

Thinking out loud:

  • webhook events can be dense
  • checking for a project seems to be a great eviction policy in general
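
A rough sketch of the in-memory, time-based cache floated above, assuming a one-day TTL and entries keyed by clone URL. `TtlCache`, `cachedIsInactive`, and the injectable clock are hypothetical names for illustration only; nothing like this is part of the PR.

```typescript
interface CacheEntry<V> {
    value: V;
    expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
    private readonly entries = new Map<string, CacheEntry<V>>();
    private readonly ttlMs: number;
    private readonly now: () => number;

    // The clock is injectable so that expiry can be exercised without waiting.
    constructor(ttlMs: number, now: () => number = () => Date.now()) {
        this.ttlMs = ttlMs;
        this.now = now;
    }

    // Returns the cached value, or undefined if it is missing or expired.
    get(key: string): V | undefined {
        const entry = this.entries.get(key);
        if (!entry) return undefined;
        if (entry.expiresAt <= this.now()) {
            this.entries.delete(key);
            return undefined;
        }
        return entry.value;
    }

    set(key: string, value: V): void {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    }

    // Explicit invalidation, e.g. once a project is found for the repository.
    delete(key: string): void {
        this.entries.delete(key);
    }
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Wraps an expensive inactivity lookup; `lookup` stands in for the database query.
async function cachedIsInactive(
    cache: TtlCache<boolean>,
    cloneUrl: string,
    lookup: (cloneUrl: string) => Promise<boolean>,
): Promise<boolean> {
    const cached = cache.get(cloneUrl);
    if (cached !== undefined) return cached;
    const fresh = await lookup(cloneUrl);
    cache.set(cloneUrl, fresh);
    return fresh;
}
```

A cache would be created as `new TtlCache<boolean>(ONE_DAY_MS)`. One trade-off to weigh: a cached "inactive" answer would keep skipping prebuilds until it expires, even if someone starts a workspace for that repository in the meantime.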

Contributor

@jankeromnes jankeromnes left a comment

Works like a charm! 🎆 🎉 🎊

Many thanks @svenefftinge for yet again shipping a clever small fix much faster than the more complicated solution we had in mind to solve this problem (i.e. enforcing projects for all prebuilds). 💯 🛹

Adding a hold in case you'd like to request changes @AlexTugarev.

(Will also briefly take a look at the prod DB to double-check there is a key on the workspace creationTime.)

/hold

} else if (!project && (await this.shouldSkipInactiveRepository({ span }, cloneURL))) {
    prebuild.state = "aborted";
    prebuild.error =
        "Repository is inactive. Please create a project for this repository to re-enable prebuilds.";
Contributor

@jankeromnes jankeromnes May 11, 2022

Nit: I think this isn't totally accurate (I agree that we want users to create a Project, but the minimal fix is actually just to open a new workspace for that repository).

However, since this error isn't actually shown anywhere in the UI, users can never see it. 😅 So, for something only a Gitpod admin can see while looking into the DB, that's perfect. 💯

Member Author

I know it's not technically correct, but it is what we would like people to do.😇

Member

@AlexTugarev AlexTugarev left a comment

LGTM!

/hold cancel

@roboquat roboquat merged commit 72aa5e0 into main May 11, 2022
@roboquat roboquat deleted the se/no-prebuilds branch May 11, 2022 17:42
@geropl
Member

geropl commented May 12, 2022

@svenefftinge @AlexTugarev Honestly, when looking at the query, I doubt we should send that to prod.

Iterating over all regular workspaces from the last 7 days for every prebuild request is quite expensive, and I'm pretty sure it will blow our DB budget - and I'd rather not find out in prod. 🙂

I don't have time to look into this until 10am, so maybe we exclude this one from the rollout.

@svenefftinge
Member Author

Would in-memory caching as @AlexTugarev suggested be acceptable?

@geropl
Member

geropl commented May 12, 2022

> Would in-memory caching as @AlexTugarev suggested be acceptable?

Ideally I'd like to use an indexed version of context.repository.cloneUrl by adding a separate field + index for it (the MySQL way 🙄 ), even if that means we have to clone the content on insert/update.
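
A sketch of what "a separate field + index" could look like: the clone URL is copied out of the JSON context into its own top-level field on every insert/update, so that lookups by repository no longer need to inspect the JSON. The field name `cloneUrl`, the `Workspace` shape, and the in-memory `Map` standing in for a database index are all assumptions made for illustration; the follow-up work may take a different approach.

```typescript
interface WorkspaceContext {
    repository?: { cloneUrl?: string };
}

interface Workspace {
    id: string;
    context: WorkspaceContext;
    cloneUrl?: string; // hypothetical denormalized copy of context.repository.cloneUrl
}

// Copies context.repository.cloneUrl into the dedicated field.
function withDenormalizedCloneUrl(ws: Workspace): Workspace {
    return { ...ws, cloneUrl: ws.context.repository?.cloneUrl };
}

class WorkspaceStore {
    private readonly byId = new Map<string, Workspace>();
    // Stand-in for a database index on the dedicated column.
    private readonly idsByCloneUrl = new Map<string, Set<string>>();

    // Insert or update: the copy is refreshed every time, so it cannot drift from the context.
    save(ws: Workspace): void {
        const next = withDenormalizedCloneUrl(ws);
        const prev = this.byId.get(next.id);
        if (prev?.cloneUrl) {
            this.idsByCloneUrl.get(prev.cloneUrl)?.delete(prev.id);
        }
        if (next.cloneUrl) {
            const ids = this.idsByCloneUrl.get(next.cloneUrl) ?? new Set<string>();
            ids.add(next.id);
            this.idsByCloneUrl.set(next.cloneUrl, ids);
        }
        this.byId.set(next.id, next);
    }

    // Looks up workspaces through the index instead of scanning every context.
    findByCloneUrl(cloneUrl: string): Workspace[] {
        const ids = this.idsByCloneUrl.get(cloneUrl) ?? new Set<string>();
        return Array.from(ids).map((id) => this.byId.get(id)!);
    }
}
```

The cost is the one named above: the value is stored twice, and every write path has to keep the copy in sync.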

@svenefftinge
Member Author

Working on it here.

@roboquat roboquat added the "deployed: webapp" and "deployed" labels May 13, 2022
Labels: deployed: webapp, deployed, release-note-none, size/M, team: webapp