Introducing `ScalarChart` in place of `BarChart`? #5622
Labels: 🔩 data model, 💬 discussion, 🚀 performance, user-request
Context
We have the `Scalar` archetype, which makes it possible to log and visualize scalar timeseries (in the most literal sense: the X axis is literally the recording's clock).

This is good in that it allows users to just log their scalar data as it comes, without having to manually keep track of any kind of state.
This is bad because it means each scalar has to be its own `DataRow` (1 row == 1 timepoint), which leads to performance issues if you just want to log a huge timeseries for which you already have all the data needed in one place.

These performance issues come in two forms:

1. Logging `DataRow`s for every scalar.
   We do have a long-term plan that would allow users to log "temporal batches", i.e. multiple timestamps' worth of data in a single log call.
   But A) it will be a while before this is implemented and B) it doesn't solve the second form of performance issue, discussed below.
2. Ingesting `DataRow`s.
   Indexing a row is a costly operation: it not only has to run all the datastore logic (indexing all the individual cells etc), but it also triggers a chain of events that need to propagate and update all downstream subscribers (datastore views, time panel, heuristics, clear cascades...).
   AFAICT, batching row ingestion is a much harder problem than batching on the logging side.
   And even if we make it past the ingestion, rendering a time panel with a few million entries is probably still no cheap task (?).
Interestingly, we also have the `BarChart` archetype, which already has the nice property of accepting a batch of scalars all at once.

The one downside is that this doesn't integrate at all with the time cursor, since the bar chart as a whole is its own entity.
But, for many cases, this can still be a very useful tool in real-world scenarios, especially when combined with timeless/static.
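To make the difference in per-row overhead a bit more tangible, here is a toy model of the two approaches. It is only a sketch: `ToyStore`, `insert_row` and `count_events` are hypothetical names made up for illustration, the three subscribers are arbitrary, and none of this reflects how the actual datastore is implemented. It simply counts how many rows get indexed and how many subscriber notifications fire when the same series is logged as one row per scalar versus one row carrying the whole batch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyStore:
    """Hypothetical stand-in for a datastore; not the real implementation."""

    subscribers: List[Callable[[int], None]] = field(default_factory=list)
    rows_indexed: int = 0
    cells_indexed: int = 0

    def insert_row(self, timepoint: int, cells: List[float]) -> None:
        # Per-row work: index the row and each of its cells...
        self.rows_indexed += 1
        self.cells_indexed += len(cells)
        # ...then notify every downstream subscriber (views, time panel, ...).
        for notify in self.subscribers:
            notify(timepoint)


def count_events(num_scalars: int, batched: bool) -> Dict[str, int]:
    notifications = 0

    def on_row(_timepoint: int) -> None:
        nonlocal notifications
        notifications += 1

    # Three fake subscribers (an arbitrary number chosen for the example).
    store = ToyStore(subscribers=[on_row, on_row, on_row])
    values = [float(i) for i in range(num_scalars)]

    if batched:
        # One row carrying the whole series, as a batch-accepting archetype allows.
        store.insert_row(timepoint=0, cells=values)
    else:
        # One row per scalar: 1 row == 1 timepoint.
        for t, v in enumerate(values):
            store.insert_row(timepoint=t, cells=[v])

    return {
        "rows": store.rows_indexed,
        "cells": store.cells_indexed,
        "notifications": notifications,
    }


per_scalar = count_events(100_000, batched=False)
one_batch = count_events(100_000, batched=True)
print(per_scalar)  # rows and notifications grow with the number of scalars
print(one_batch)   # a single row, a single round of notifications
```

In both cases the same number of cells ends up indexed; what changes is the per-row bookkeeping and the number of times downstream subscribers are woken up, which is the part that batching on the logging side alone would not fix.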
Proposal
Retire the `BarChart` archetype in favor of a new generic `ScalarChart` archetype.

As for styling, `ScalarChart` would re-use the same styling archetypes as `Scalar`: `SeriesLine` to visualize the data as a line, and `SeriesPoint` to visualize it as a scatter plot.

We would also introduce a new `SeriesBar` style. That new style would also retroactively work with the `Scalar` archetype.

All in all, this would improve the existing `Scalar` type by making it possible to visualize the data as a bar chart, and would allow users to work with very large series using the new `ScalarChart`.

Of course, this has the same downside as the original `BarChart` archetype: it doesn't integrate with the time cursor.

As part of this work, we would also use this opportunity to share a lot more code between `Scalar` and `ScalarChart`, so that `ScalarChart` can benefit from all the recent improvements to the plot view (range caching, subpixel aggregation, etc).

Random thoughts
Unrelated to any of the above: maybe we should still allow batches of vanilla `Scalar`s, if only so that people can at least batch their vanilla scalar data when they know they have more than a single value for a given timestamp? Sounds niche, but it is something that happens in e.g. the VRS example 🤷
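As an aside on the rendering side: the subpixel aggregation mentioned in the proposal is the kind of thing that would let a chart with millions of values stay cheap to draw. Below is a minimal sketch of one common way to do it (keeping a min/max pair per pixel column). It assumes nothing about how the plot view actually implements this, and `aggregate_min_max` is a hypothetical name used only for illustration.

```python
from typing import List, Tuple


def aggregate_min_max(values: List[float], num_columns: int) -> List[Tuple[float, float]]:
    """Reduce a series to at most `num_columns` (min, max) pairs, one per pixel column."""
    if num_columns <= 0:
        raise ValueError("num_columns must be positive")
    n = len(values)
    if n == 0:
        return []
    # Never use more columns than there are values, so every bucket is non-empty.
    columns = min(num_columns, n)
    out = []
    for c in range(columns):
        # Integer bucket boundaries that cover the series exactly once, with no gaps.
        start = c * n // columns
        end = (c + 1) * n // columns
        bucket = values[start:end]
        out.append((min(bucket), max(bucket)))
    return out


# A million values squeezed into an 800-pixel-wide plot.
series = [float(i % 1000) for i in range(1_000_000)]
columns = aggregate_min_max(series, 800)
print(len(columns))
```

Whatever the series length, the renderer then only has to draw one vertical segment per pixel column, which is why sharing this code path with `ScalarChart` matters for very large series.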