Commit

Merge pull request #1709 from ion-elgreco/feat/enable_prebuffer_pyarrow
feat: improve read performance by 7x with prebuffer
rtyler authored Oct 9, 2023
2 parents 0a41ebc + 94b41b7 commit ab6b0cf
Showing 2 changed files with 6 additions and 1 deletion.
6 changes: 5 additions & 1 deletion python/deltalake/table.py
@@ -24,6 +24,7 @@
     Expression,
     FileSystemDataset,
     ParquetFileFormat,
+    ParquetFragmentScanOptions,
     ParquetReadOptions,
 )

@@ -538,7 +539,10 @@ def to_pyarrow_dataset(
                 )
             )

-        format = ParquetFileFormat(read_options=parquet_read_options)
+        format = ParquetFileFormat(
+            read_options=parquet_read_options,
+            default_fragment_scan_options=ParquetFragmentScanOptions(pre_buffer=True),
+        )

         fragments = [
             format.make_fragment(
1 change: 1 addition & 0 deletions python/stubs/pyarrow/dataset.pyi
@@ -6,6 +6,7 @@ Expression: Any
 field: Any
 partitioning: Any
 FileSystemDataset: Any
+ParquetFragmentScanOptions: Any
 ParquetFileFormat: Any
 ParquetReadOptions: Any
 ParquetFileWriteOptions: Any