Use data store #13

Merged
merged 68 commits into from
Jun 15, 2022


Member

@emilk emilk commented Jun 10, 2022

This adds the data store crate to put all the data in, with new support for batch logging. The queries for getting the data required for the 2D and 3D views are now much faster.

Paths

This splits DataPath into several parts, perhaps best illustrated with an example:

  • DataPath: /camera/"left"/points/42.pos, /camera/"left"/points/42.color, etc
  • ObjPath: /camera/"left"/points/42
  • FieldName: pos, color, etc
  • TypePath: /camera/*/points/*
  • IndexPath: /"left"/42

Each object has an ObjectType (e.g. ObjectType::Point). Types are mapped to TypePath, meaning all objects at the same TypePath have the same type. This means all tables are homogeneous.
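
As a rough illustration of how these parts relate, here is a minimal Rust sketch that splits a data path string into the pieces listed above. The `SplitPath` struct and the `split_data_path` function are hypothetical names made up for this example, not the types in the data store crate. The sketch assumes that an index is either a quoted string or an integer, and that quoted strings contain no `/` or `.`.

```rust
/// Hypothetical, string-based view of the path parts (not the crate's real types).
#[derive(Debug, PartialEq)]
pub struct SplitPath {
    pub obj_path: String,   // e.g. /camera/"left"/points/42
    pub field_name: String, // e.g. pos
    pub type_path: String,  // e.g. /camera/*/points/*
    pub index_path: String, // e.g. /"left"/42
}

/// Assumption: a component is an index if it is a quoted string or an integer.
fn is_index(component: &str) -> bool {
    let quoted =
        component.len() >= 2 && component.starts_with('"') && component.ends_with('"');
    let integer = !component.is_empty() && component.chars().all(|c| c.is_ascii_digit());
    quoted || integer
}

/// Splits `/a/"b"/c/42.field` into its parts.
/// Returns `None` if there is no field name or no leading `/`.
pub fn split_data_path(data_path: &str) -> Option<SplitPath> {
    // The field name is whatever follows the last '.'.
    let (obj_path, field_name) = data_path.rsplit_once('.')?;
    let components = obj_path.strip_prefix('/')?;

    let mut type_path = String::new();
    let mut index_path = String::new();
    for component in components.split('/') {
        if is_index(component) {
            // Indices become wildcards in the type path
            // and are collected into the index path.
            type_path.push_str("/*");
            index_path.push('/');
            index_path.push_str(component);
        } else {
            type_path.push('/');
            type_path.push_str(component);
        }
    }

    Some(SplitPath {
        obj_path: obj_path.to_string(),
        field_name: field_name.to_string(),
        type_path,
        index_path,
    })
}
```

Because every index is replaced by `*`, all objects that differ only in their indices end up with the same type path, which is what lets one table hold them all.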

Logging

LogMsg is now an enum of either TypeMsg or DataMsg.

TypeMsg maps a TypePath to an ObjectType. You need to log this ONCE before logging any data to an object at that type path.

DataMsg maps a DataPath to a Data value (or possibly a batch of values - see below).
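
The "type first, then data" rule could be sketched roughly as follows. This is a simplified illustration, not the crate's actual definitions: the field layout of `LogMsg`, the `Store` struct, and the use of plain strings for paths and values are all assumptions, and the `DataMsg` here carries its type path explicitly to keep the example short.

```rust
use std::collections::HashMap;

/// Only `Point` is named in the description; other variants are omitted here.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ObjectType {
    Point,
}

/// Hypothetical message layout: paths and values are plain strings.
#[derive(Clone, Debug)]
pub enum LogMsg {
    TypeMsg { type_path: String, object_type: ObjectType },
    DataMsg { type_path: String, data_path: String, value: String },
}

/// Hypothetical receiver that enforces the ordering rule.
#[derive(Default)]
pub struct Store {
    types: HashMap<String, ObjectType>,
    data: Vec<(String, String)>,
}

impl Store {
    pub fn handle(&mut self, msg: LogMsg) -> Result<(), String> {
        match msg {
            LogMsg::TypeMsg { type_path, object_type } => {
                // Register the object type for this type path.
                self.types.insert(type_path, object_type);
                Ok(())
            }
            LogMsg::DataMsg { type_path, data_path, value } => {
                // Reject data for a type path that has not been registered yet.
                if !self.types.contains_key(&type_path) {
                    return Err(format!("no TypeMsg logged yet for {}", type_path));
                }
                self.data.push((data_path, value));
                Ok(())
            }
        }
    }

    pub fn num_values(&self) -> usize {
        self.data.len()
    }
}
```

In this sketch, a `DataMsg` sent before the matching `TypeMsg` returns an error and stores nothing. Once the `TypeMsg` has been handled, the same `DataMsg` is accepted.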

Batches

If you are logging many points, logging each point individually is inefficient, and indexing them is also slow. A better approach is to use batching.

Without batching you would log:

/camera/"left"/points/0.color = RED
/camera/"left"/points/1.color = BLUE
…

With batching you instead log:

/camera/"left"/points/_.color = {0: RED, 1: BLUE, …}

That underscore (_) is an Index::Placeholder and is replaced by the 0, 1, etc. from the batch. Note that it is always only the last index that is batched.

Batches are faster to log, and are also faster to query inside the viewer.
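
To make the placeholder substitution concrete, here is a small Rust sketch that expands a batched path into the individual paths it stands for. The `expand_batch` function is a hypothetical helper for illustration only. It works on plain strings, whereas the real store presumably keeps the batch together rather than expanding it, since that is where the speedup comes from.

```rust
use std::collections::BTreeMap;

/// Expands `<prefix>/_.<field>` into one `(path, value)` pair per batch entry.
/// Returns `None` if the path's last index is not the `_` placeholder.
pub fn expand_batch(
    batched_path: &str,
    batch: &BTreeMap<u64, String>,
) -> Option<Vec<(String, String)>> {
    // Only the last index may be batched, so split at the final "/_.".
    let (prefix, field) = batched_path.rsplit_once("/_.")?;
    Some(
        batch
            .iter() // BTreeMap iterates in ascending index order
            .map(|(index, value)| {
                (format!("{}/{}.{}", prefix, index, field), value.clone())
            })
            .collect(),
    )
}
```

Called with the path `/camera/"left"/points/_.color` and the batch `{0: RED, 1: BLUE}`, this yields the same two assignments as the unbatched form: `/camera/"left"/points/0.color = RED` and `/camera/"left"/points/1.color = BLUE`.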

NYUD

This PR also adds a new example app nyud, based on https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html.

@emilk emilk mentioned this pull request Jun 10, 2022
@nikolausWest
Member

Omg, 68 commits and +5,911 −2,426 lines! Great job finishing this!

It feels like we could probably make it possible not to require setting the schema (TypeMsg) up front in the Python SDK later on, for smoother prototyping, right?

@emilk
Member Author

emilk commented Jun 15, 2022

The logging SDKs could handle the type messages automatically. For instance: in Python you could write something like rerun.log_point(point_path, pos=…, color=…) and the SDK would detect "this is the first time we log to this type path, so let's send a TypeMsg first!"
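
A minimal sketch of that idea, written in Rust rather than Python to match the rest of the code base. The `Logger` struct, the `Sent` enum, and this `log_point` signature are hypothetical and do not describe an existing SDK API. The point is only that the SDK remembers which type paths it has already announced.

```rust
use std::collections::HashSet;

/// Hypothetical record of what the SDK sent, in order.
#[derive(Debug, PartialEq)]
pub enum Sent {
    TypeMsg(String), // the type path being announced
    DataMsg(String), // the data path being logged
}

/// Hypothetical SDK-side logger that sends type messages on demand.
#[derive(Default)]
pub struct Logger {
    known_type_paths: HashSet<String>,
    pub outbox: Vec<Sent>,
}

impl Logger {
    pub fn log_point(&mut self, type_path: &str, data_path: &str) {
        // `HashSet::insert` returns true only the first time a value is added,
        // so the TypeMsg is sent exactly once per type path.
        if self.known_type_paths.insert(type_path.to_string()) {
            self.outbox.push(Sent::TypeMsg(type_path.to_string()));
        }
        self.outbox.push(Sent::DataMsg(data_path.to_string()));
    }
}
```

Logging two points under the same type path would then produce one `TypeMsg` followed by two `DataMsg` entries.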

@nikolausWest
Member

Also, when you wrote "You need to log this ONCE before logging any data of that object type", you meant "You need to log this ONCE before logging any data of that type path" right?

@emilk
Member Author

emilk commented Jun 15, 2022

Yes, good catch, I'll update the text

@emilk emilk merged commit 61d5a70 into main Jun 15, 2022
@emilk emilk deleted the use-data-store branch June 15, 2022 11:58
@emilk emilk mentioned this pull request Jan 24, 2023
2 participants