Standardize creation and configuration of parquet --> Arrow readers (ParquetRecordBatchReaderBuilder)
#2427
Labels
enhancement
Any new improvement worthy of an entry in the changelog
parquet
Changes to the parquet crate
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
Currently a ParquetFileArrowReader is created from a ChunkReader or an Arc<dyn FileReader>, and an optional set of ArrowReaderOptions. Then ArrowReader can be used to obtain a ParquetRecordBatchReader from this.
Not only is this interface deeply confusing, but it is unclear how to extend it to support functionality such as row filtering, predicate pushdown, etc., which need the schema information before they can be computed, information which is only available after the file has been opened.
Describe the solution you'd like
I would like a ParquetRecordBatchReaderBuilder similar to ParquetRecordBatchStreamBuilder.
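To make the request concrete, here is a minimal std-only sketch of the builder shape described above: open the file first, expose its schema, then configure and build the reader. Every type and method here (Schema, OpenedFile, ReaderBuilder, Reader, with_row_filter, and so on) is a hypothetical stand-in for illustration, not the parquet crate's actual API, and the "file" is just rows held in memory.

```rust
// Hypothetical stand-ins for illustration only; NOT the parquet crate's types.
#[derive(Debug, Clone, PartialEq)]
pub struct Schema {
    pub columns: Vec<String>,
}

/// Stand-in for an already-opened file: a schema plus in-memory rows.
pub struct OpenedFile {
    schema: Schema,
    rows: Vec<Vec<i64>>,
}

pub struct ReaderBuilder {
    file: OpenedFile,
    batch_size: usize,
    projection: Option<Vec<usize>>,
    filter: Option<Box<dyn Fn(&[i64]) -> bool>>,
}

impl ReaderBuilder {
    /// "Opens" the file up front, so the schema is known before any option is set.
    pub fn try_new(file: OpenedFile) -> Result<Self, String> {
        let width = file.schema.columns.len();
        if file.rows.iter().any(|r| r.len() != width) {
            return Err("row width does not match schema".to_string());
        }
        Ok(Self { file, batch_size: 1024, projection: None, filter: None })
    }

    /// Schema is available before build(), so filters can be derived from it.
    pub fn schema(&self) -> &Schema {
        &self.file.schema
    }

    pub fn with_batch_size(mut self, n: usize) -> Self {
        self.batch_size = n;
        self
    }

    pub fn with_projection(mut self, columns: Vec<usize>) -> Self {
        self.projection = Some(columns);
        self
    }

    pub fn with_row_filter(mut self, f: impl Fn(&[i64]) -> bool + 'static) -> Self {
        self.filter = Some(Box::new(f));
        self
    }

    /// Validates the configuration against the schema, then produces the reader.
    pub fn build(self) -> Result<Reader, String> {
        if self.batch_size == 0 {
            return Err("batch size must be non-zero".to_string());
        }
        let width = self.file.schema.columns.len();
        if let Some(p) = &self.projection {
            if p.iter().any(|&i| i >= width) {
                return Err("projection index out of range".to_string());
            }
        }
        Ok(Reader {
            rows: self.file.rows.into_iter(),
            batch_size: self.batch_size,
            projection: self.projection,
            filter: self.filter,
        })
    }
}

pub struct Reader {
    rows: std::vec::IntoIter<Vec<i64>>,
    batch_size: usize,
    projection: Option<Vec<usize>>,
    filter: Option<Box<dyn Fn(&[i64]) -> bool>>,
}

impl Iterator for Reader {
    /// One batch: up to `batch_size` rows, filtered first and then projected.
    type Item = Vec<Vec<i64>>;

    fn next(&mut self) -> Option<Self::Item> {
        let mut batch = Vec::new();
        while batch.len() < self.batch_size {
            let row = match self.rows.next() {
                Some(row) => row,
                None => break,
            };
            if let Some(f) = &self.filter {
                if !f(row.as_slice()) {
                    continue;
                }
            }
            batch.push(match &self.projection {
                Some(p) => p.iter().map(|&i| row[i]).collect(),
                None => row,
            });
        }
        if batch.is_empty() { None } else { Some(batch) }
    }
}

/// Two-column sample data ("a", "b") used by the usage example.
pub fn sample_file() -> OpenedFile {
    OpenedFile {
        schema: Schema { columns: vec!["a".to_string(), "b".to_string()] },
        rows: vec![vec![1, 10], vec![2, 20], vec![3, 30]],
    }
}

/// Usage: look up column "b" in the schema, filter on it, and project column "a".
pub fn example() -> Result<Vec<Vec<Vec<i64>>>, String> {
    let builder = ReaderBuilder::try_new(sample_file())?;
    let b = builder
        .schema()
        .columns
        .iter()
        .position(|c| c == "b")
        .ok_or_else(|| "no column b".to_string())?;
    let reader = builder
        .with_batch_size(2)
        .with_projection(vec![0])
        .with_row_filter(move |row| row[b] > 10)
        .build()?;
    Ok(reader.collect())
}
```

The point of the shape is that the filter in example() can only be written once the schema has been read, which is why the builder is constructed from the opened file rather than the options being passed in alongside it.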
Describe alternatives you've considered
Additional context