Skip to content

[usage] Use attribution ID to reduce DB queries for usage report #10938

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 28, 2022

Conversation

easyCZ
Copy link
Member

@easyCZ easyCZ commented Jun 27, 2022

Description

Updates usage controller flow to do the following:

  1. Find all Workspace Instances (with time bounds) which have an usageAttributionId set.
  2. Groups these (in-code) by Attribution ID to create the report

We can't use SQL level group-by because it can only be used with summary functions (like count). The secondary reason is that the Billing controller (which runs after the usage controller) needs to given the full list of WorkspaceInstances to exclude any WorkspaceInstances which it may have already billed for (in another invoice, for example). For this, it has to get the raw data-set and can't receive an aggregation.

Related Issue(s)

How to test

Unit tests
Can be run against staging by port forwarding

Release Notes

NONE

Documentation

NONE

Werft options:

  • /werft with-preview

@easyCZ easyCZ changed the base branch from main to mp/usage-attribution-id June 27, 2022 15:15
@easyCZ easyCZ force-pushed the mp/usage-attribution-id branch from 2a54e55 to b966efd Compare June 27, 2022 15:17
@easyCZ easyCZ force-pushed the mp/usage-attribution-id branch from b966efd to d416916 Compare June 28, 2022 08:08
@easyCZ easyCZ force-pushed the mp/usage-use-attribution-id branch 2 times, most recently from 5baf448 to ee88344 Compare June 28, 2022 08:10
@roboquat roboquat added size/XL and removed size/XXL labels Jun 28, 2022
@@ -79,6 +79,7 @@ func ListWorkspaceInstancesInRange(ctx context.Context, conn *gorm.DB, from, to
).
Where("creationTime < ?", TimeToISO8601(to)).
Where("startedTime != ?", "").
Where("usageAttributionId != ?", "").
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will we want to handle potentially un-attributed instances?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the usageAttributionId should always be present (the premise of your PR). The check here mostly ensures we only run on "new" data which does have the attribution set

Comment on lines 208 to 214
func groupInstancesByAttributionID(instances []db.WorkspaceInstance) map[db.AttributionID][]db.WorkspaceInstance {
result := map[db.AttributionID][]db.WorkspaceInstance{}
for _, instance := range instances {
if _, ok := result[instance.UsageAttributionID]; !ok {
result[instance.UsageAttributionID] = []db.WorkspaceInstance{}
}

result[instance.UsageAttributionID] = append(result[instance.UsageAttributionID], instance)
}

return result
}
Copy link
Contributor

@jankeromnes jankeromnes Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it makes sense performance-wise to re-implement a SQL "group by" in Go? I notice this adds another full iteration loop over all the very numerous instances of the current month. I think, given the sheer number of instances, we may want to limit all these iterations to ideally just a single one-shot iteration.

I wonder if maybe this "group by" should be part of the query. Relatedly, have you had a chance to chat with @geropl about the query's performance, and how to efficiently index and shard it? 👀

Copy link
Member Author

@easyCZ easyCZ Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can only use group-by with summary functions (like count). That's because a group-by takes multiple rows, and turns them into a single row by aggregating them (and applying the summary function to it).

In practice, we need the full dataset anyway because the billing controller needs to do another pass to drop WSI which have already been billed.

Happy to go into more details on this.

@easyCZ easyCZ force-pushed the mp/usage-attribution-id branch from d416916 to c48fee9 Compare June 28, 2022 09:42
Base automatically changed from mp/usage-attribution-id to main June 28, 2022 09:47
@easyCZ easyCZ force-pushed the mp/usage-use-attribution-id branch from ee88344 to 9ccf14e Compare June 28, 2022 09:49
@easyCZ easyCZ marked this pull request as ready for review June 28, 2022 09:51
@easyCZ easyCZ requested a review from a team June 28, 2022 09:51
@github-actions github-actions bot added the team: webapp Issue belongs to the WebApp team label Jun 28, 2022
@easyCZ easyCZ marked this pull request as draft June 28, 2022 11:08
@easyCZ
Copy link
Member Author

easyCZ commented Jun 28, 2022

I've moved this to Draft as I've added a bunch of fixes for flaky tests which I need to pull into a separate PR. The main logic is still reviewable, but I'll move it back to review once cleaned up.

@easyCZ easyCZ force-pushed the mp/usage-use-attribution-id branch from 1610f34 to a77bbc5 Compare June 28, 2022 13:29
@easyCZ easyCZ force-pushed the mp/usage-use-attribution-id branch from a77bbc5 to 7d7be79 Compare June 28, 2022 13:31
@easyCZ easyCZ marked this pull request as ready for review June 28, 2022 13:32
@easyCZ easyCZ force-pushed the mp/usage-use-attribution-id branch from cb1db57 to bc5955d Compare June 28, 2022 13:35
@easyCZ
Copy link
Member Author

easyCZ commented Jun 28, 2022

Fixed up and rebased, ready for review again.

@geropl
Copy link
Member

geropl commented Jun 28, 2022

Starting to review now...

@geropl geropl self-assigned this Jun 28, 2022
}
for attribution, instances := range u {
entity, id := attribution.Values()
if entity != db.AttributionEntity_Team {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: A comment that we handle this in the future would be golden 👍

TeamID: membership.TeamID,
Workspaces: workspacesByOwnerID[userID],
})
attributedUsage[id] = int64(runtime)
Copy link
Member

@geropl geropl Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels "dangerous"/off given that we do use uint64 for usage in other places in the code base. Why not make attributedUsage accumulate in uint64 and stick to that datatype everywhere? 🤔

Copy link
Member Author

@easyCZ easyCZ Jun 28, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to update to unit64 everywhere (but Stripe only accepts int64, not uint64). But the reason for using it here is to limit changes to the existing Usage Report reconcile logic with stripe in this PR. That happens here https://github.com/gitpod-io/gitpod/pull/10938/files#diff-b2499b7086d5733f081dcce586e0ff0e77206dd5d8fd6ede638e3fc8f284c798L25-L32 and it currently has int64 as the interface.
I'll update these in a follow-up PR (minus the Stripe API call which will have to be int64) if that's acceptable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, fine with this for now.

@geropl
Copy link
Member

geropl commented Jun 28, 2022

Not necessarily sth for this PR, but: before we enable this again I'd love to get away from the fixed interval, and instead wait for the current run to finish before we start a new one (code ref).

@@ -79,6 +79,7 @@ func ListWorkspaceInstancesInRange(ctx context.Context, conn *gorm.DB, from, to
).
Where("creationTime < ?", TimeToISO8601(to)).
Where("startedTime != ?", "").
Where("usageAttributionId != ?", "").
FindInBatches(&instancesInBatch, 1000, func(_ *gorm.DB, _ int) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should work out fine. 👍

Copy link
Member

@geropl geropl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@roboquat roboquat merged commit 2e7d2ef into main Jun 28, 2022
@roboquat roboquat deleted the mp/usage-use-attribution-id branch June 28, 2022 15:53
@roboquat roboquat added deployed: webapp Meta team change is running in production deployed Change is completely running in production labels Jun 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployed: webapp Meta team change is running in production deployed Change is completely running in production release-note-none size/XL team: webapp Issue belongs to the WebApp team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants