Allow ParquetExec to parallelize work based on row groups #137
Labels: datafusion, enhancement, help wanted
Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11056
ParquetExec currently parallelizes work by passing individual files to threads. It would be nice to do this at a finer granularity by assigning row groups and/or column chunks to threads instead, so that a single large file can be scanned by several threads. This will be especially important in distributed systems built on DataFusion.
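To make the idea concrete, here is a minimal sketch of how row-group-level work could be planned. It is not DataFusion code: `RowGroupTask` and `plan_row_group_tasks` are hypothetical names, the per-file row group counts are assumed to have been read from the Parquet footers already, and round-robin is just one simple assignment policy.

```rust
/// Hypothetical unit of scan work: one row group within one Parquet file.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RowGroupTask {
    pub file: String,
    pub row_group: usize,
}

/// Distribute every (file, row group) pair round-robin across `partitions`
/// work lists, instead of handing whole files to threads.
///
/// `files` holds `(path, number_of_row_groups)` pairs; the counts are assumed
/// to come from file metadata that was read beforehand.
pub fn plan_row_group_tasks(
    files: &[(String, usize)],
    partitions: usize,
) -> Vec<Vec<RowGroupTask>> {
    assert!(partitions > 0, "at least one partition is required");

    let mut plan: Vec<Vec<RowGroupTask>> = vec![Vec::new(); partitions];
    let mut next = 0usize;

    for (file, num_row_groups) in files {
        for row_group in 0..*num_row_groups {
            // Assign the next row group to the next partition in turn.
            plan[next % partitions].push(RowGroupTask {
                file: file.clone(),
                row_group,
            });
            next += 1;
        }
    }
    plan
}
```

With this kind of plan, one file with three row groups and another with one row group would be split evenly over two threads, whereas file-level assignment would give one thread three times the work of the other. A real implementation would probably also weigh row groups by row count or byte size rather than treating them as equal.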