[usage] List workspaces for each workspace instance in usage period #10495

easyCZ · 2022-06-07T07:58:51Z

Description

We first compute all workspace instances in a given period, for each WorkspaceID, we fetch the underlying Workspace from DB to have the necessary information to proceed to determining the owner (cost center)

Related Issue(s)

Fixes #

How to test

Unit tests

Release Notes

NONE

Documentation

NONE

components/usage/pkg/controller/reconciler.go

jankeromnes · 2022-06-08T15:49:28Z

components/usage/pkg/controller/reconciler.go

+type workspaceWithInstances struct {
+	Workspace db.Workspace
+	Instances []db.WorkspaceInstance
+}


Thought: I wonder if we actually want to load into memory the "fully-hydrated" shapes of both Workspace and WorkspaceInstance.

In the past, we've had issues with memory consumption, because we were keeping full shapes in memory, while only a few fields were actually relevant. This is especially true for workspace instances, because there are so many of them.

A usage record will probably look something like this:

export type WorkspaceUsageRecord = { instanceId: string; workspaceId: string; userId: string; projectId?: string; teamId?: string; fromTime: string; toTime: string; };

So, in my mind, the most efficient flow to get these usage records is:

Query only instanceId, workspaceId, creationTime, stoppedTime from d_b_workspace_instance (for the current period) -- all the other fields are unnecessary

Trim the time bounds between beginningOfMonth and min(getCurrentTime(), endOfMonth) (also set the toTime to the maximum bound when there is no stoppedTime yet)

For each unique workspaceId query only the owner userId and possible projectId from d_b_workspace

For each userId, query only the possible teamId (randomly pick one team for now)

If we query and then load the full Workspace and WorkspaceInstance shapes into memory, and start passing them around a lot, I worry that we'll have a bad time (because many workspace instances are created every month, and that number is rapidly increasing over time).

Sure, this makes sense. But I don't think it makes sense to do this now.

For me, this is an optimisation once we have something working. For now, I'd prefer to waste memory to get to a working version and then measure what is in fact the cost of these queries.

Fair point, thanks. (I don't really see why it's better to ship a less optimized version first when we know how to easily write a more optimized one, but I also agree about shipping this sooner than later, and maybe my suggestion could already be considered premature optimization). So, fine to defer this to later. 👍

Largely because I think the current handling will go away anyway once we do add attribution to the WSI. If we optimized here, it would then be wasted work because it would also be replaced by the attribution work from WSI.

I don't really see why it's better to ship a less optimized version first when we know how to easily write a more optimized one

We don't have any hard data to know it is in fact an optimisation. Suboptimal, but with data is more valuable than optimal but without baseline metrics.

jankeromnes

Many thanks @easyCZ!

With the unit tests, this seems to be working as intended. 🎉 I couldn't resist to add a few more thoughts on (premature?) optimization, mostly because I generally find it easier to design an efficient algorithm from scratch (minimal data footprint, limited iterations) than to optimize existing complex code, but you're right -- the current code might already go away when we attribute instances on start. Someone clever once said, "one of the most expensive mistakes engineers make is to optimize something that shouldn't exist in the first place".

Anyway, so many words to say, thanks for the progress here! Looking forward to the next iterations. 🚀

jankeromnes · 2022-06-09T06:19:56Z

components/usage/pkg/controller/reconciler.go

+		workspaceIDs = append(workspaceIDs, instance.WorkspaceID)
+	}
+
+	workspaces, err := db.ListWorkspacesByID(ctx, u.conn, toSet(workspaceIDs))


Thanks for making the IDs unique! I would have preferred to make the collection a unique "set" before all of the appends, in order to iterate just once over all instances (not once, then again in toSet to create the map, then another time over the unique IDs to convert them back to a list) -- but okay for quick iterations without worrying too much about complexity this early. 🛹

jankeromnes · 2022-06-09T06:31:31Z

components/usage/pkg/controller/reconciler.go

+	if err != nil {
+		return nil, fmt.Errorf("failed to load workspaces for workspace instances in time range: %w", err)
+	}
+	status.Workspaces = len(workspaces)


So, at this point we have:

a collection of all "fully hydrated" instances

a collection of all "fully hydrated" workspaces, each with an attached sub-collection of all its corresponding "fully hydrated" instances (although these will just be references to a single instance in memory, right, not a full copy of the object?)

I'm still slightly uneasy with loading so many large objects into memory, and iterating over them multiple times. So, I'm looking forward to seeing exactly how we'll use these collections, and see if we can later reduce the amount of used memory and the number of iterations. 👀

roboquat added do-not-merge/work-in-progress do-not-merge/release-note-label-needed size/M size/L and removed size/M labels Jun 7, 2022

easyCZ force-pushed the mp/usage-validate-instances branch from be1a7ce to 9fa1c65 Compare June 7, 2022 12:02

easyCZ force-pushed the mp/usage-list-workspaces branch from 08a44d6 to acdb029 Compare June 7, 2022 12:04

easyCZ force-pushed the mp/usage-validate-instances branch 3 times, most recently from 38183b3 to bb77bfb Compare June 8, 2022 07:30

easyCZ force-pushed the mp/usage-list-workspaces branch 2 times, most recently from ec61948 to 036774b Compare June 8, 2022 08:15

Base automatically changed from mp/usage-validate-instances to main June 8, 2022 09:24

roboquat added size/XL release-note-none and removed size/L do-not-merge/release-note-label-needed labels Jun 8, 2022

easyCZ force-pushed the mp/usage-list-workspaces branch from 036774b to 971cd31 Compare June 8, 2022 09:25

easyCZ marked this pull request as ready for review June 8, 2022 09:25

easyCZ requested a review from a team June 8, 2022 09:25

roboquat added size/L and removed do-not-merge/work-in-progress size/XL labels Jun 8, 2022

github-actions bot added the team: webapp Issue belongs to the WebApp team label Jun 8, 2022

jankeromnes reviewed Jun 8, 2022

View reviewed changes

components/usage/pkg/controller/reconciler.go Show resolved Hide resolved

jankeromnes reviewed Jun 8, 2022

View reviewed changes

components/usage/pkg/controller/reconciler.go Show resolved Hide resolved

easyCZ force-pushed the mp/usage-list-workspaces branch 4 times, most recently from 1040954 to 379bc08 Compare June 8, 2022 11:51

jankeromnes reviewed Jun 8, 2022

View reviewed changes

[usage] List workspaces for each workspace instance in usage period

63e98ca

easyCZ force-pushed the mp/usage-list-workspaces branch from 379bc08 to 63e98ca Compare June 8, 2022 20:20

jankeromnes approved these changes Jun 9, 2022

View reviewed changes

roboquat merged commit 9121279 into main Jun 9, 2022

roboquat deleted the mp/usage-list-workspaces branch June 9, 2022 06:42

roboquat added deployed: webapp Meta team change is running in production deployed Change is completely running in production labels Jun 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[usage] List workspaces for each workspace instance in usage period #10495

[usage] List workspaces for each workspace instance in usage period #10495

Uh oh!

easyCZ commented Jun 7, 2022 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

jankeromnes Jun 8, 2022 •

edited

Loading

Uh oh!

easyCZ Jun 8, 2022

Uh oh!

jankeromnes Jun 9, 2022

Uh oh!

easyCZ Jun 9, 2022 •

edited

Loading

Uh oh!

jankeromnes left a comment

Uh oh!

jankeromnes Jun 9, 2022

Uh oh!

jankeromnes Jun 9, 2022

Uh oh!

Uh oh!

[usage] List workspaces for each workspace instance in usage period #10495

[usage] List workspaces for each workspace instance in usage period #10495

Uh oh!

Conversation

easyCZ commented Jun 7, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Related Issue(s)

How to test

Release Notes

Documentation

Uh oh!

Uh oh!

Uh oh!

jankeromnes Jun 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

easyCZ Jun 8, 2022

Choose a reason for hiding this comment

Uh oh!

jankeromnes Jun 9, 2022

Choose a reason for hiding this comment

Uh oh!

easyCZ Jun 9, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jankeromnes left a comment

Choose a reason for hiding this comment

Uh oh!

jankeromnes Jun 9, 2022

Choose a reason for hiding this comment

Uh oh!

jankeromnes Jun 9, 2022

Choose a reason for hiding this comment

Uh oh!

Uh oh!

easyCZ commented Jun 7, 2022 •

edited

Loading

jankeromnes Jun 8, 2022 •

edited

Loading

easyCZ Jun 9, 2022 •

edited

Loading