Storing Parquet metadata separate from parquet file #4127
I see that https://docs.rs/parquet/latest/parquet/format/struct.FileMetaData.html is returned when closing a Writer, and that FileMetaData implements https://docs.rs/thrift/0.17.0/thrift/protocol/trait.TSerializable.html. Is it possible to store FileMetaData serialized as raw bytes, separate from the actual Parquet file? What guarantees are there on the serialized metadata? Is it tied to a specific parquet-rs version? Basically, I want to know whether the metadata is suitable for long-term storage, just like the Parquet format is. My use case is storing Parquet files in S3, but I want to store the metadata in a separate db along with a link to the actual S3 file.
Replies: 1 comment 2 replies
Yes, this is a stable representation, as it is what gets encoded in the footer of the file itself - see here
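To illustrate why the footer encoding matters here: a Parquet file ends with the Thrift-serialized FileMetaData, then a 4-byte little-endian length, then the magic bytes `PAR1`. The sketch below pulls those metadata bytes out of a file buffer using only the standard library. The function name `footer_metadata_bytes` is made up for this example and is not part of parquet-rs, and it assumes an unencrypted footer (encrypted footers use a different magic). Treat it as a rough illustration of the layout rather than a replacement for the crate's own reader.

```rust
/// Magic bytes that open and close an unencrypted Parquet file.
const PARQUET_MAGIC: &[u8] = b"PAR1";

/// Hypothetical helper: returns the Thrift-encoded FileMetaData bytes
/// found at the tail of a Parquet file, or `None` if the buffer does
/// not look like a Parquet file with a plaintext footer.
fn footer_metadata_bytes(file: &[u8]) -> Option<&[u8]> {
    // Smallest possible layout: leading magic + length field + trailing magic.
    if file.len() < 12 {
        return None;
    }

    // The last four bytes must be the trailing magic.
    let (rest, magic) = file.split_at(file.len() - 4);
    if magic != PARQUET_MAGIC {
        return None;
    }

    // The four bytes before the magic hold the metadata length (little-endian).
    let (body, len_bytes) = rest.split_at(rest.len() - 4);
    let len = u32::from_le_bytes([len_bytes[0], len_bytes[1], len_bytes[2], len_bytes[3]]) as usize;

    // The metadata has to fit between the leading magic and the length field.
    if len > body.len() - 4 {
        return None;
    }

    // The metadata occupies the last `len` bytes of what remains.
    Some(&body[body.len() - len..])
}
```

The returned slice holds the same bytes you would get by serializing FileMetaData yourself, so storing it in a separate database relies only on the file format's footer encoding.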