[usage] Use attribution ID to reduce DB queries for usage report #10938
@@ -79,6 +79,7 @@ func ListWorkspaceInstancesInRange(ctx context.Context, conn *gorm.DB, from, to
	).
	Where("creationTime < ?", TimeToISO8601(to)).
	Where("startedTime != ?", "").
	Where("usageAttributionId != ?", "").
How will we want to handle potentially un-attributed instances?
usageAttributionId should always be present (that is the premise of your PR). The check here mostly ensures that we only run on "new" data, which does have the attribution set.
func groupInstancesByAttributionID(instances []db.WorkspaceInstance) map[db.AttributionID][]db.WorkspaceInstance {
	result := map[db.AttributionID][]db.WorkspaceInstance{}
	for _, instance := range instances {
		if _, ok := result[instance.UsageAttributionID]; !ok {
			result[instance.UsageAttributionID] = []db.WorkspaceInstance{}
		}

		result[instance.UsageAttributionID] = append(result[instance.UsageAttributionID], instance)
	}

	return result
}
Are you sure it makes sense performance-wise to re-implement a SQL "group by" in Go? I notice this adds another full iteration loop over all the very numerous instances of the current month. I think, given the sheer number of instances, we may want to limit all these iterations to ideally just a single one-shot iteration.
I wonder if maybe this "group by" should be part of the query. Relatedly, have you had a chance to chat with @geropl about the query's performance, and how to efficiently index and shard it? 👀
You can only use group-by with summary functions (like count). That's because a group-by takes multiple rows and turns them into a single row by aggregating them (applying the summary function to them).
In practice, we need the full dataset anyway because the billing controller needs to do another pass to drop workspace instances (WSI) that have already been billed.
Happy to go into more details on this.
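For readers following the thread, here is a minimal sketch of the in-memory grouping under discussion. The types are simplified stand-ins for `db.WorkspaceInstance` and `db.AttributionID` (the real types carry more fields), and the `team:a` style values are made up for illustration. It groups in a single pass and relies on `append` accepting a nil slice, so the explicit "initialize if missing" branch is not strictly needed.

```go
package main

import "fmt"

// AttributionID and WorkspaceInstance are simplified, hypothetical stand-ins
// for the db package types referenced in the PR.
type AttributionID string

type WorkspaceInstance struct {
	ID                 string
	UsageAttributionID AttributionID
}

// groupByAttributionID buckets instances by attribution ID in one pass.
// Looking up a missing key yields a nil slice, and append works on nil
// slices, so no explicit initialization of each bucket is required.
func groupByAttributionID(instances []WorkspaceInstance) map[AttributionID][]WorkspaceInstance {
	result := make(map[AttributionID][]WorkspaceInstance)
	for _, instance := range instances {
		result[instance.UsageAttributionID] = append(result[instance.UsageAttributionID], instance)
	}
	return result
}

func main() {
	grouped := groupByAttributionID([]WorkspaceInstance{
		{ID: "i1", UsageAttributionID: "team:a"},
		{ID: "i2", UsageAttributionID: "team:b"},
		{ID: "i3", UsageAttributionID: "team:a"},
	})
	fmt.Println(len(grouped["team:a"]), len(grouped["team:b"])) // prints "2 1"
}
```

Unlike a SQL aggregate, the grouped result still contains every raw instance, which is what a later per-instance pass would need.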
I've moved this to Draft because I've added a bunch of fixes for flaky tests that I need to pull into a separate PR. The main logic is still reviewable, but I'll move it back to review once it's cleaned up.
Fixed up and rebased, ready for review again.
Starting to review now...
}
for attribution, instances := range u {
	entity, id := attribution.Values()
	if entity != db.AttributionEntity_Team {
nit: A comment that we handle this in the future would be golden 👍
	TeamID: membership.TeamID,
	Workspaces: workspacesByOwnerID[userID],
})
attributedUsage[id] = int64(runtime)
This feels "dangerous"/off given that we do use uint64 for usage in other places in the code base. Why not make attributedUsage accumulate in uint64 and stick to that datatype everywhere? 🤔
Happy to update to uint64 everywhere (but Stripe only accepts int64, not uint64). The reason for using int64 here is to limit changes in this PR to the existing Usage Report reconcile logic with Stripe. That happens here https://github.com/gitpod-io/gitpod/pull/10938/files#diff-b2499b7086d5733f081dcce586e0ff0e77206dd5d8fd6ede638e3fc8f284c798L25-L32 and it currently has int64 as the interface.
I'll update these in a follow-up PR (minus the Stripe API call, which will have to stay int64) if that's acceptable.
Ok, fine with this for now.
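If usage is later accumulated as uint64 while the Stripe-facing interface stays int64, the conversion at that boundary could be made explicit. The following is only a sketch: `toStripeQuantity` and `errUsageOverflow` are hypothetical names, not part of the PR or of any Stripe client library.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// errUsageOverflow is a hypothetical sentinel error for values that do not
// fit into an int64.
var errUsageOverflow = errors.New("usage exceeds int64 range")

// toStripeQuantity is a hypothetical helper that converts an internally
// accumulated uint64 usage value to int64, refusing values that would
// otherwise wrap around to a negative number.
func toStripeQuantity(usage uint64) (int64, error) {
	if usage > math.MaxInt64 {
		return 0, errUsageOverflow
	}
	return int64(usage), nil
}

func main() {
	quantity, err := toStripeQuantity(1200)
	fmt.Println(quantity, err) // prints "1200 <nil>"

	_, err = toStripeQuantity(math.MaxUint64)
	fmt.Println(err) // prints "usage exceeds int64 range"
}
```

Keeping the check in one place would mean the rest of the code can stay on uint64 without a silent sign flip at the API boundary.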
Not necessarily something for this PR, but: before we enable this again I'd love to get away from the fixed interval, and instead wait for the current run to finish before we start a new one (code ref).
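One possible shape for "wait for the current run to finish" is sketched below. It is not the controller's actual scheduling code: `runSequentially`, `demoRuns`, and the `reconcile` callback are hypothetical names, and the delay value is arbitrary. Each run completes before the pause starts, so two runs cannot overlap even if one takes longer than the delay.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runSequentially calls reconcile, waits for it to return, and only then
// pauses for delay before starting the next run. It returns once ctx is
// cancelled (checked after each run).
func runSequentially(ctx context.Context, delay time.Duration, reconcile func(context.Context) error) {
	for {
		if err := reconcile(ctx); err != nil {
			fmt.Println("reconcile failed:", err)
		}
		select {
		case <-ctx.Done():
			return
		case <-time.After(delay):
		}
	}
}

// demoRuns drives the loop with a fake reconcile function that cancels the
// context after limit runs, and reports how many runs happened.
func demoRuns(limit int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	runs := 0
	runSequentially(ctx, 10*time.Millisecond, func(context.Context) error {
		runs++
		if runs == limit {
			cancel() // stop so the demo terminates
		}
		return nil
	})
	return runs
}

func main() {
	fmt.Println("runs:", demoRuns(3)) // prints "runs: 3"
}
```

The trade-off against a fixed ticker is that the period between run starts is no longer constant; it becomes run duration plus delay.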
@@ -79,6 +79,7 @@ func ListWorkspaceInstancesInRange(ctx context.Context, conn *gorm.DB, from, to
	).
	Where("creationTime < ?", TimeToISO8601(to)).
	Where("startedTime != ?", "").
	Where("usageAttributionId != ?", "").
	FindInBatches(&instancesInBatch, 1000, func(_ *gorm.DB, _ int) error {
This should work out fine. 👍
👍
Description
Updates usage controller flow to do the following:
usageAttributionId set.

We can't use a SQL-level group-by because it can only be used with summary functions (like count). The secondary reason is that the Billing controller (which runs after the usage controller) needs to be given the full list of WorkspaceInstances so that it can exclude any WorkspaceInstances which it may have already billed for (in another invoice, for example). For this, it has to get the raw dataset and can't receive an aggregation.
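To illustrate why an aggregate is not enough for the billing step, here is a hedged sketch of excluding already-billed instances before summing usage per attribution ID. Everything in it is an assumption for illustration: the simplified types, the `RuntimeSeconds` field, the `billed` set keyed by instance ID, and the `unbilledUsage` name do not come from the PR.

```go
package main

import "fmt"

// AttributionID and WorkspaceInstance are simplified, hypothetical stand-ins
// for the db package types; RuntimeSeconds is an assumed field.
type AttributionID string

type WorkspaceInstance struct {
	ID                 string
	UsageAttributionID AttributionID
	RuntimeSeconds     int64
}

// unbilledUsage sums runtime per attribution ID while skipping any instance
// whose ID is in billed. This per-instance filter is only possible because
// the raw instances are available; a pre-aggregated total per attribution ID
// would not say which instances contributed to it.
func unbilledUsage(instances []WorkspaceInstance, billed map[string]bool) map[AttributionID]int64 {
	usage := make(map[AttributionID]int64)
	for _, instance := range instances {
		if billed[instance.ID] {
			continue
		}
		usage[instance.UsageAttributionID] += instance.RuntimeSeconds
	}
	return usage
}

func main() {
	instances := []WorkspaceInstance{
		{ID: "i1", UsageAttributionID: "team:a", RuntimeSeconds: 60},
		{ID: "i2", UsageAttributionID: "team:a", RuntimeSeconds: 40},
		{ID: "i3", UsageAttributionID: "team:b", RuntimeSeconds: 30},
	}
	// Assume i2 was already covered by an earlier invoice.
	usage := unbilledUsage(instances, map[string]bool{"i2": true})
	fmt.Println(usage["team:a"], usage["team:b"]) // prints "60 30"
}
```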
Related Issue(s)
How to test
Unit tests
Can be run against staging by port forwarding
Release Notes
Documentation
NONE
Werft options: