ParquetFileReader should allow closing resources except for file input stream #3207

@pan3793

Description

Describe the enhancement requested

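ParquetFileReader should let the caller close the resources it holds without also closing the underlying file input stream. The caller could then reuse the SeekableInputStream while avoiding leaks of the other resources held by ParquetFileReader. For example, the Spark Parquet vectorized read path opens each Parquet file twice: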

  1. The first open reads the footer and performs pruning and pushdown.
  2. The second open reads the row groups.
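
The two-pass pattern above can be sketched with plain JDK types. This is only an illustration of the requested behavior, not parquet-java code: `SketchReader`, `detachStream()`, `TrackingStream`, and `TrackingResource` are hypothetical names invented for this example, and the real ParquetFileReader API may end up looking different.

```java
import java.io.ByteArrayInputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-ins only; none of these types are part of parquet-java.
class DetachableReaderSketch {

    /** In-memory stream that records whether close() was called. */
    static final class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Stands in for any non-stream resource a reader might hold. */
    static final class TrackingResource implements Closeable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Reader that owns a stream plus one other resource. */
    static final class SketchReader implements Closeable {
        private final InputStream in;
        private final Closeable other;
        private boolean streamDetached = false;

        SketchReader(InputStream in, Closeable other) {
            this.in = in;
            this.other = other;
        }

        /** Hands the stream back to the caller; close() will then skip it. */
        InputStream detachStream() {
            streamDetached = true;
            return in;
        }

        @Override
        public void close() throws IOException {
            try {
                other.close(); // always release the non-stream resource
            } finally {
                if (!streamDetached) {
                    in.close(); // close the stream only if we still own it
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream = new TrackingStream(new byte[] {1, 2, 3});
        TrackingResource resource = new TrackingResource();

        // First pass (footer): keep the stream, release everything else.
        SketchReader footerReader = new SketchReader(stream, resource);
        InputStream reused = footerReader.detachStream();
        footerReader.close();
        System.out.println("resource closed: " + resource.closed);
        System.out.println("stream closed after first pass: " + stream.closed);

        // Second pass (row groups): this reader owns and closes the stream.
        try (SketchReader rowGroupReader = new SketchReader(reused, new TrackingResource())) {
            System.out.println("first byte: " + reused.read());
        }
        System.out.println("stream closed after second pass: " + stream.closed);
    }
}
```

The design choice illustrated here is an explicit ownership hand-off: once the stream is detached, the first reader's `close()` releases only its other resources, and whoever holds the stream afterwards is responsible for closing it.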
