[prebuilds] no prebuilds for inactive repos #9936
return workspaceRepo
    .createQueryBuilder("ws")
    .where('context->"$.repository.cloneUrl" = :cloneURL', { cloneURL })
    .andWhere("creationTime > :since", { since: since.toISOString() })
    .andWhere("type = :type", { type })
    .getCount();
Wow, many thanks for this clever fix! 👀
My only worry here is, how do we know that this query on all d_b_workspace entries won't be too slow/expensive? (Especially since we'll run it on every received webhook, which seems very frequent.)
That is a good point. All I did was run the query below against the prod failover DB:
SELECT count(*) from d_b_workspace where type='regular' and creationTime > '2022-05-04' and context->"$.repository.cloneUrl" = 'https://github.com/gitpod-io/gitpod.git'
Note that it only runs for those webhooks where we don't find a project, but it could still be too much. Any ideas how to move forward?
Aha, that totally addresses the performance concern then, right? (Or, do we need to somehow test/qualify this further?)
@svenefftinge, the number of rows sent is also a perf issue sometimes.
If we're just checking for count === 0 down below, we could start out with a limited version.
I would expect typeorm to translate to count(*) when using getCount. Doesn't it do that?
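For reference, a minimal sketch of what a "limited" variant could look like. It assumes raw SQL against d_b_workspace with the same filters as the query above; the helper name `buildRecentWorkspaceProbe` and the `SqlQuery` shape are hypothetical and not part of this PR. Note that adding LIMIT to a count(*) query would not help, because the aggregate returns a single row anyway; the probe selects a constant and stops at the first match instead.

```typescript
// Hypothetical shape for a parameterized query; not a type from the PR or from typeorm.
interface SqlQuery {
    sql: string;
    params: string[];
}

// Builds an existence probe ("is there at least one matching workspace?")
// instead of counting every matching row.
function buildRecentWorkspaceProbe(cloneURL: string, since: Date, type: string = "regular"): SqlQuery {
    const sql = [
        "SELECT 1 FROM d_b_workspace",
        "WHERE type = ?",
        "AND creationTime > ?",
        'AND context->"$.repository.cloneUrl" = ?',
        // Stop after the first hit; the caller only needs to know whether a row exists.
        "LIMIT 1",
    ].join(" ");
    return { sql, params: [type, since.toISOString(), cloneURL] };
}

const probe = buildRecentWorkspaceProbe(
    "https://github.com/gitpod-io/gitpod.git",
    new Date("2022-05-04T00:00:00.000Z"),
);
console.log(probe.sql);
console.log(probe.params);
```

The caller would then treat "no row returned" as the equivalent of count === 0. Whether this is actually cheaper still depends on the available indexes, so it would need checking against the real DB.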
Briefly testing this, code looks good, so will approve if it works in preview.
const span = TraceContext.startSpan("shouldSkipInactiveRepository", ctx);
try {
    return (
        (await this.workspaceDB
Q: Would it make sense to add some caching here? Just in-memory, time based eviction – let's say 1d.
Thinking out loud:
- webhook events can be dense
- checking for a project seems to be a great eviction policy in general
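A rough sketch of the in-memory, time-based cache idea, to make the suggestion concrete. Everything here is hypothetical (the `TtlCache` class, the `cachedInactivity` wrapper, and the 1d TTL constant are illustrations, not code from this PR), and it assumes a per-process cache keyed by clone URL.

```typescript
// Hypothetical helper: a tiny in-memory cache whose entries expire after a fixed TTL.
class TtlCache<V> {
    private readonly entries = new Map<string, { value: V; expiresAt: number }>();
    private readonly ttlMs: number;
    private readonly now: () => number;

    // The clock is injectable so expiry can be exercised without waiting.
    constructor(ttlMs: number, now: () => number = () => Date.now()) {
        this.ttlMs = ttlMs;
        this.now = now;
    }

    get(key: string): V | undefined {
        const entry = this.entries.get(key);
        if (!entry) {
            return undefined;
        }
        if (entry.expiresAt <= this.now()) {
            // Lazily evict expired entries on read.
            this.entries.delete(key);
            return undefined;
        }
        return entry.value;
    }

    set(key: string, value: V): void {
        this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    }
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;
const inactivityCache = new TtlCache<boolean>(ONE_DAY_MS);

// Wraps an expensive lookup (e.g. the DB query) so that dense webhook events
// for the same repository hit the database at most once per TTL window.
async function cachedInactivity(
    cloneURL: string,
    lookup: (cloneURL: string) => Promise<boolean>,
): Promise<boolean> {
    const cached = inactivityCache.get(cloneURL);
    if (cached !== undefined) {
        return cached;
    }
    const inactive = await lookup(cloneURL);
    inactivityCache.set(cloneURL, inactive);
    return inactive;
}
```

One trade-off to keep in mind: with a 1d TTL, a repository that becomes active again could still be treated as inactive for up to a day unless the entry is invalidated some other way.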
Works like a charm! 🎆 🎉 🎊
Many thanks @svenefftinge for yet again shipping a clever small fix much faster than the more complicated solution we had in mind to solve this problem (i.e. enforcing projects for all prebuilds). 💯 🛹
Adding a hold in case you'd like to request changes @AlexTugarev.
(Will also briefly take a look at the prod DB to double-check there is a key on the workspace creationTime.)
/hold
} else if (!project && (await this.shouldSkipInactiveRepository({ span }, cloneURL))) {
    prebuild.state = "aborted";
    prebuild.error =
        "Repository is inactive. Please create a project for this repository to re-enable prebuilds.";
Nit: I think this isn't totally accurate (I agree that we want users to create a Project, but the minimal fix is actually just to open a new workspace for that repository).
However, since this error isn't actually shown anywhere in the UI, users can never see it. 😅 So, for something only a Gitpod admin can see while looking into the DB, that's perfect. 💯
I know it's not technically correct, but it is what we would like people to do. 😇
LGTM!
/hold cancel
@svenefftinge @AlexTugarev Honestly, when looking at the query, I doubt we should send that to prod. Iterating over all regular workspaces from the last 7 days for every prebuild request is quite expensive, and I'm pretty sure it will blow our DB budget, and I'd rather not find out in prod. 🙂 I don't have time to look into this until 10am, so maybe we exclude this one from the rollout.
Would in-memory caching as @AlexTugarev suggested be acceptable?
Ideally I'd like to use an indexed version of
Working on it here.
Description
This change prevents prebuilds for repositories that have been inactive (no regular workspace starts) for at least a week.
This would have prevented 98835 unnecessary prebuilds in the last 11 days (vs. 28174 prebuilds on repos without projects that are still active).
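The rule described above can be sketched as follows. This is only an illustration of the decision, under the assumption that the window is exactly 7 days and that the caller already knows whether a project exists and how many regular workspaces were created since the cutoff; the names `inactivityCutoff` and `shouldSkipPrebuild` are hypothetical and not taken from the diff.

```typescript
// Assumed window: "at least a week" is taken to mean 7 * 24 hours.
const INACTIVITY_WINDOW_MS = 7 * 24 * 60 * 60 * 1000;

// Earliest creation time that still counts as "recent" activity.
function inactivityCutoff(now: Date): Date {
    return new Date(now.getTime() - INACTIVITY_WINDOW_MS);
}

// A prebuild is skipped only when the repository has no project
// and no regular workspace was started since the cutoff.
function shouldSkipPrebuild(hasProject: boolean, recentRegularWorkspaces: number): boolean {
    return !hasProject && recentRegularWorkspaces === 0;
}

const cutoff = inactivityCutoff(new Date("2022-05-11T00:00:00.000Z"));
console.log(cutoff.toISOString());
console.log(shouldSkipPrebuild(false, 0));
console.log(shouldSkipPrebuild(true, 0));
```

Repositories with a project are never skipped by this check, which matches the `!project && ...` condition in the diff.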
Related Issue(s)
See also my comment here #9898 (comment)
How to test
Release Notes