Client-side chunks 3: micro-batching (#6440)
This is a fork of the old `DataTable` batcher, and works very similarly.

Like before, this batcher will micro-batch using both space and time
thresholds.
There are two main differences:
- This batcher maintains a dataframe per entity, as opposed to the old
one, which worked globally.
- Once a threshold is reached, this batcher further splits the incoming
batch in order to fulfill these invariants:
  ```rust
  /// In particular, a [`Chunk`] cannot:
  /// * contain data for more than one entity path
  /// * contain rows with different sets of timelines
  /// * use more than one datatype for a given component
  /// * contain more rows than a pre-configured threshold if one or more
  ///   timelines are unsorted
  ```
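To make the splitting step concrete, here is a minimal sketch (not the actual Rerun implementation; `Row` and `split_into_chunks` are hypothetical stand-ins) of how an incoming batch can be partitioned so that every resulting chunk shares a single entity path and a single exact set of timelines, mirroring the first two invariants above:

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a pending row; the real `PendingRow`
/// carries actual component data, not just names.
#[derive(Debug, Clone)]
struct Row {
    entity_path: String,
    /// Sorted list of timeline names this row has data on.
    timelines: Vec<String>,
}

/// Group rows so that each resulting batch has exactly one entity path
/// and one set of timelines. Rows that differ in either end up in
/// separate chunks.
fn split_into_chunks(rows: Vec<Row>) -> Vec<Vec<Row>> {
    let mut chunks: BTreeMap<(String, Vec<String>), Vec<Row>> = BTreeMap::new();
    for row in rows {
        let key = (row.entity_path.clone(), row.timelines.clone());
        chunks.entry(key).or_default().push(row);
    }
    chunks.into_values().collect()
}
```

The real splitter must additionally key on per-component datatypes and cap the row count of unsorted chunks, but the grouping idea is the same.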

Most of the code is the same; the most interesting piece is
`PendingRow::many_into_chunks`, along with the newly added tests.
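The "space and time thresholds" mentioned above can be sketched as a simple flush policy (hypothetical names; this is an illustration of the micro-batching idea, not the batcher's actual API): flush when the buffered payload exceeds a byte budget, or when the oldest buffered row has waited longer than a latency budget.

```rust
use std::time::{Duration, Instant};

/// Hypothetical flush policy combining a space threshold (bytes) and a
/// time threshold (latency of the oldest buffered row).
struct FlushPolicy {
    max_bytes: u64,
    max_latency: Duration,
}

impl FlushPolicy {
    /// Returns true if the buffered data should be flushed now.
    fn should_flush(&self, buffered_bytes: u64, oldest: Option<Instant>, now: Instant) -> bool {
        // Space threshold: too many bytes accumulated.
        if buffered_bytes >= self.max_bytes {
            return true;
        }
        // Time threshold: the oldest row has been waiting too long.
        match oldest {
            Some(t) => now.duration_since(t) >= self.max_latency,
            None => false,
        }
    }
}
```

Either condition alone is enough to trigger a flush, which keeps both worst-case latency and worst-case memory usage bounded.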

- Fixes #4431

---

Part of a PR series to implement our new chunk-based data model on the
client-side (SDKs):
- #6437
- #6438
- #6439
- #6440
- #6441
teh-cmc authored May 31, 2024
1 parent b4b7ec4 commit fde4a87
Showing 3 changed files with 1,611 additions and 1 deletion.
