Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use columnar data within vgplot #334

Merged
merged 13 commits into from
Mar 22, 2024
Merged

feat: use columnar data within vgplot #334

merged 13 commits into from
Mar 22, 2024

Conversation

domoritz
Copy link
Member

This defers chasing the arrow tables to arrays of objects and optimizes some of the computation to use columnar processing (e.g. in nearest).

There are possibly more things to optimize (e.g. Plot itself observablehq/plot#191) and we can remove some of the special handling when Arrow supports Decimals better.

@domoritz domoritz requested a review from jheer March 20, 2024 22:34
@domoritz
Copy link
Member Author

Hmm, this breaks the observable latency example. Need to check what's up.

@domoritz domoritz marked this pull request as draft March 20, 2024 23:10
@jheer
Copy link
Member

jheer commented Mar 21, 2024

I like the goal here, but some of these changes probably won’t help much (and others will hurt performance) until Plot also supports optimized columnar data.

Using Arrow’s automatic row object generation (via Proxy) has a non-trivial overhead cost. That’s why I have our own faster object generation method (arrowToObjects) to avoid costly unpacking within Plot. So for now we should still use that even if no type conversions are needed.

@mbostock
Copy link
Contributor

Just chiming in here — Plot already supports optimized columnar data, it just doesn’t support shorthand for that data. I added an example to observablehq/plot#191 (comment) that shows how to pass in columnar data. Please let me know if we can help!

@domoritz
Copy link
Member Author

Just chiming in here — Plot already supports optimized columnar data, it just doesn’t support shorthand for that data. I added an example to observablehq/plot#191 (comment) that shows how to pass in columnar data. Please let me know if we can help!

Something like 62e72fa (#334) ?

@domoritz domoritz marked this pull request as ready for review March 21, 2024 03:05
@mbostock
Copy link
Contributor

That looks right to me, @domoritz! 👍

@domoritz domoritz changed the title feat: leave arrow tables as arrow if possible feat: use columnar data within vgplot Mar 22, 2024
@domoritz domoritz merged commit 3dd57de into main Mar 22, 2024
2 checks passed
@domoritz domoritz deleted the dom/arrow branch March 22, 2024 00:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants