Add benchmarks for BYTE_STREAM_SPLIT encoded Parquet FIXED_LEN_BYTE_ARRAY data
#6203
Labels
enhancement
Any new improvement worthy of an entry in the changelog
parquet
Changes to the parquet crate
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
#6159 added support for using BYTE_STREAM_SPLIT with FIXED_LEN_BYTE_ARRAY primitive types. While some effort was put into optimizing the encoding path, the decoding path is largely unoptimized (and seemingly quite slow). It would be nice to have some benchmarks for the new encodings to guide future optimization efforts.
Describe the solution you'd like
Benchmarks for Float16/FIXED_LEN_BYTE_ARRAY(2) and DECIMAL/FIXED_LEN_BYTE_ARRAY(16) would be a good start, since these are data types that are likely to be used.
Describe alternatives you've considered
Additional context
See #6159 (comment) and following.
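To make the request concrete, here is a rough, standalone sketch of what such a benchmark would exercise. It is not the parquet crate's implementation: `bss_encode`, `bss_decode`, and `time_decode` are hypothetical names, the decoder is a deliberately naive byte-by-byte loop, and the timing uses only `std::time::Instant`. A real benchmark would presumably go through the crate's own encoder and decoder and its existing benchmark harness.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Naive BYTE_STREAM_SPLIT encode for fixed-width values (hypothetical helper).
/// `values` holds n values of `width` bytes each, back to back. The output
/// holds `width` streams of n bytes: stream k contains byte k of every value.
pub fn bss_encode(values: &[u8], width: usize) -> Vec<u8> {
    assert!(width > 0 && values.len() % width == 0);
    let n = values.len() / width;
    let mut out = vec![0u8; values.len()];
    for (i, value) in values.chunks_exact(width).enumerate() {
        for (k, &byte) in value.iter().enumerate() {
            out[k * n + i] = byte;
        }
    }
    out
}

/// Naive BYTE_STREAM_SPLIT decode: gathers byte k of value i from stream k.
pub fn bss_decode(encoded: &[u8], width: usize) -> Vec<u8> {
    assert!(width > 0 && encoded.len() % width == 0);
    let n = encoded.len() / width;
    let mut out = vec![0u8; encoded.len()];
    for (i, value) in out.chunks_exact_mut(width).enumerate() {
        for (k, byte) in value.iter_mut().enumerate() {
            *byte = encoded[k * n + i];
        }
    }
    out
}

/// Crude timing helper (an assumption, not a proper benchmark harness):
/// returns the average wall-clock time of one decode of `num_values` values.
pub fn time_decode(num_values: usize, width: usize, iters: u32) -> Duration {
    assert!(iters > 0);
    // Deterministic pseudo-random input from a simple LCG, so runs are comparable.
    let mut state: u32 = 0x1234_5678;
    let values: Vec<u8> = (0..num_values * width)
        .map(|_| {
            state = state.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
            (state >> 24) as u8
        })
        .collect();
    let encoded = bss_encode(&values, width);

    let start = Instant::now();
    for _ in 0..iters {
        // black_box keeps the optimizer from eliding the decode.
        black_box(bss_decode(black_box(&encoded), width));
    }
    start.elapsed() / iters
}
```

Calling `time_decode(n, 2, iters)` and `time_decode(n, 16, iters)` would cover the Float16 and DECIMAL widths mentioned above; the numbers only characterize this naive loop, not the crate's decoder.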