Improve transform performance (by caching affine transforms resulting from transform components) #8691
How is the ingestion performance affected?

It can't be better, that's for sure, but I haven't tested it yet, as mentioned in the description. The expectation is that we take the big hit on each frame after lots of transform data comes in. To properly test this I'd have to set up a script that continuously feeds lots of transforms. Not hard to do, but not something I've gotten around to yet, too many other pressing things... I can set an optimistic reminder for next week.

yes! :)
What

Introduces a new store subscriber, `TransformCacheStoreSubscriber`, that keeps track of when the tree/pose/pinhole transform of an entity changes and stores the resulting affine transforms (taking into account all transform components).
Effectively this splits out one of the responsibilities of the `TransformContext` into a separate construct, namely the calculation of the transforms that need to be propagated in the tree. Since the actual tree propagation is still in what was previously the `TransformContext`, it got renamed to `TransformTreeContext`.
(There's more performance work to be done in that area, see comment notes for details.)
For simplicity of implementation, `TransformCacheStoreSubscriber` doesn't calculate transforms when receiving store events, but rather upon request later on, in `TransformCacheStoreSubscriber::apply_all_updates`. This avoids having to query the store while it is still being populated.
In a similar vein, we absolutely do not want to reimplement latest-at semantics more than we need to, which means that during `apply_all_updates` we calculate the transforms exactly as before, via queries to the store, ignoring any prior knowledge we might have from previous queries.
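The deferral can be sketched like this (hypothetical names; `PendingUpdates` and the `resolve` callback are stand-ins for the actual subscriber, not its real API): the store-event callback only records *which* entities changed, and the expensive latest-at queries run later, in one batch:

```rust
use std::collections::HashSet;

/// Sketch of deferred cache updates: store events only mark entities dirty;
/// the actual queries and matrix math happen later in a single batched pass.
#[derive(Default)]
struct PendingUpdates {
    dirty: HashSet<String>,
}

impl PendingUpdates {
    /// Called from the store-event callback. Must stay cheap, because the
    /// store may still be mid-write at this point.
    fn on_store_event(&mut self, entity: &str) {
        self.dirty.insert(entity.to_owned());
    }

    /// Called later, once the store is safe to query. `resolve` stands in
    /// for the usual latest-at query plus affine resolution per entity.
    fn apply_all_updates(&mut self, mut resolve: impl FnMut(&str)) {
        for entity in self.dirty.drain() {
            resolve(&entity);
        }
    }
}
```

A side benefit of the dirty set is that repeated changes to the same entity between two `apply_all_updates` calls collapse into a single resolution.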
Testing

There are lots of new tests added (which caught plenty of issues!), but I also inspected some of the snippets.
The lack of unit tests on the transform tree itself is a bit unnerving; we'll have to correct for that in the future.

For performance comparisons I tried various large scenes, typically with `--threads 1` to account for other things becoming the bottleneck (I'm looking at you, annotation context 🙄), but even without that there's a clear overall performance improvement. How much depends on various factors, but the scene attached to #7604 gives an extreme case for the possible gains.

Numbers with time cursor at `+297 682.003s`
, time panel minimized. Gathered on my Windows machine.

before (main @ b2b7f91):

- `--threads 2` (`--threads 1` deadlocks on load for some scenes, see #8695): [screenshot of frame timings]
- `--threads 2`, profiler: [screenshot of profiler trace]

after:

- `--threads 2` (`--threads 1` deadlocks on load for some scenes, see #8695): [screenshot of frame timings]
- `--threads 2`, profiler: [screenshot of profiler trace]

(As seen from the numbers, we're unusually bad at parallelizing in this scene, which actually makes sense from the way it's set up [...]; profiler traces are a lot more readable with fewer threads though, since otherwise all the large blocks are far down in worker threads.)
We still lose an unreasonable amount of time in the `TransformTreeContext` in scenes with many entities, which we'll need to address (even when there are few transforms in said scenes; the test scene above doesn't show that, but others, like the `alien_cake_addict` scene from revy, do!). Furthermore, we have to expect that ingestion slows down a bit because of the added work during subscription and (more so) in `apply_all_updates` later on; these impacts have not been tested so far.