This repository has been archived by the owner on Feb 18, 2024. It is now read-only.
How to go from original data in Vec<T> to arrow (and parquet)? #1551
Unanswered
nbigaouette asked this question in Q&A
Replies: 1 comment
Hi, I have the same question, particularly about |
I'm trying to wrap my head around Parquet and Arrow. What I'd like to have is a way to serialize Rust data to a binary on-disk format. I thought Parquet could be this binary format, but I'm having trouble doing this serialization properly.
My Rust data can be described as a group of two (or more) vectors. For example:
Most of the time, I have a single `values` vector, so I can simplify to:
And since memory is managed elsewhere, the following would be ideal to serialize without copy:
Now, everywhere I look, it seems that the serialization expects something like a "row" instead of a "column". This means I need to create a struct like this:
Since I already have my `Vec` in memory, do I really need to "chunk" some rows together before saving to Parquet? Can't I just point to the already allocated memory and say "here are two slices, save them to disk"?
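To make the row-versus-column distinction concrete, here is a minimal sketch using only the Rust standard library. Everything in it is an assumption for illustration: the names `ColumnsRef`, `Row`, `keys`, `write_columns` and `to_rows` are hypothetical, the element types `i64` and `f64` are guesses, and the byte layout is a toy one (a row count followed by each column in turn), not Parquet or Arrow. It only shows that borrowed slices can be written out column by column, whereas a row-oriented API first needs a newly allocated `Vec` of row structs.

```rust
use std::io::{self, Write};

/// Hypothetical borrowed view over two columns whose memory is owned elsewhere.
struct ColumnsRef<'a> {
    keys: &'a [i64],
    values: &'a [f64],
}

/// Hypothetical row type: what a row-oriented API would ask for instead.
#[derive(Debug, PartialEq)]
struct Row {
    key: i64,
    value: f64,
}

/// Column-oriented write: each slice is emitted as one contiguous run.
/// Toy layout (row count, then `keys`, then `values`), NOT the Parquet format.
fn write_columns<W: Write>(out: &mut W, cols: &ColumnsRef<'_>) -> io::Result<()> {
    // Both columns must describe the same number of rows.
    assert_eq!(cols.keys.len(), cols.values.len());
    out.write_all(&(cols.keys.len() as u64).to_le_bytes())?;
    for k in cols.keys {
        out.write_all(&k.to_le_bytes())?;
    }
    for v in cols.values {
        out.write_all(&v.to_le_bytes())?;
    }
    Ok(())
}

/// Row-oriented detour: allocates a new Vec<Row> and copies every element into it.
fn to_rows(cols: &ColumnsRef<'_>) -> Vec<Row> {
    cols.keys
        .iter()
        .zip(cols.values)
        .map(|(&key, &value)| Row { key, value })
        .collect()
}
```

Calling `write_columns` with a `Vec<u8>` or a `std::fs::File` as the writer only borrows the two slices and builds no intermediate row structure; `to_rows` is the extra allocation the question is hoping to avoid.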