Description
Related to ARROW-8062 (there we will also need a way to expose the global FileMetadata). Independently of that, it would be useful to have access to the FileMetadata of each ParquetFileFragment (e.g. to get at the statistics).
This would be relatively simple to implement on the Python/R side: we have access to the file path, so we could read the metadata from the file backing the fragment and return it as a FileMetadata object.
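To make the Python-side idea concrete, here is a minimal sketch, not a proposed implementation. The helper names `fragment_metadata` and `column_statistics` are hypothetical. It assumes the fragment exposes a `path` attribute that points to a file on the local filesystem, and it uses `pyarrow.parquet.read_metadata` to read the footer.

```python
def fragment_metadata(fragment, read_metadata=None):
    """Return the Parquet footer metadata of the file backing `fragment`.

    Hypothetical helper: `fragment` only needs a `path` attribute, assumed
    here to be a local file path. `read_metadata` can be swapped out; by
    default pyarrow.parquet.read_metadata is used.
    """
    if read_metadata is None:
        # Imported lazily so the helper itself does not require pyarrow
        # when a custom reader is passed in.
        import pyarrow.parquet as pq
        read_metadata = pq.read_metadata
    return read_metadata(fragment.path)


def column_statistics(metadata, column_index):
    """Collect the statistics of one column across all row groups.

    Entries are None for row groups where no statistics were written.
    """
    return [
        metadata.row_group(i).column(column_index).statistics
        for i in range(metadata.num_row_groups)
    ]
```

With a dataset opened through `pyarrow.dataset`, this could be called as `fragment_metadata(fragment)` for each `fragment` in `dataset.get_fragments()`. Note that this re-reads the footer of every file, which is what a built-in attribute could avoid.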
I am wondering whether we want to integrate this with ARROW-8062: when the fragments were created from a _metadata file, a ParquetFileFragment.metadata attribute would not need to read the individual parquet file, but could take the information from the global metadata (at least for e.g. the row group data).
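As a rough illustration of what the global metadata already provides, the sketch below groups the row groups recorded in a `_metadata` footer by the data file they belong to. The helper name `row_groups_by_file` is hypothetical, and it assumes that every column chunk of a row group carries the same `file_path`, as is usual for a `_metadata` file written alongside a dataset.

```python
from collections import defaultdict


def row_groups_by_file(global_metadata):
    """Map each data file path in a `_metadata` footer to its row group indices.

    Hypothetical helper: `global_metadata` is expected to behave like the
    object returned by pyarrow.parquet.read_metadata for a `_metadata` file.
    """
    groups = defaultdict(list)
    for i in range(global_metadata.num_row_groups):
        row_group = global_metadata.row_group(i)
        # The file path is stored per column chunk; the first chunk is
        # assumed to be representative of the whole row group.
        groups[row_group.column(0).file_path].append(i)
    return dict(groups)
```

A fragment created from the `_metadata` file could then look up its own row groups by path instead of opening the data file again.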
Another question: what should this return for a ParquetFileFragment that maps to a single row group?
Reporter: Joris Van den Bossche / @jorisvandenbossche
Assignee: Ben Kietzman / @bkietz
Related issues:
PRs and other links:
Note: This issue was originally created as ARROW-8733. Please see the migration documentation for further details.