Add a way to clear out all buffered ranges from ParquetPushDecoder #8676

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

The ParquetMetaDataPushDecoder and ParquetPushDecoder both use a structure named PushBuffers to store in-memory buffer state.

The ParquetPushDecoder frees any ranges it explicitly asked for once it has finished reading a row group. However, if a user eagerly pushes ranges (parts of a file), the ParquetPushDecoder will use them for decoding but won't free them, and there is currently no way to free them manually.

Describe the solution you'd like

As we mentioned in #7997 (comment), it would be nice to have some way to release all underlying memory in the ParquetPushDecoder

Describe alternatives you've considered
Perhaps we can add an API like:

let mut push_decoder = ...;
// release references to any data that has previously been pushed via push_ranges
push_decoder.release_all_ranges();
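
To make the intended semantics concrete, here is a minimal sketch using a toy stand-in for the buffer state. Everything in it is an assumption for illustration: `ToyPushBuffers`, its fields, `buffered_bytes`, and `demo` are hypothetical names, not the parquet crate's actual `PushBuffers` type or API, and `Vec<u8>` is used only to keep the example self-contained.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the decoder's internal buffer state.
/// This is NOT the real `PushBuffers` type from the parquet crate.
#[derive(Debug, Default)]
struct ToyPushBuffers {
    /// Byte ranges of the file covered by each pushed buffer.
    ranges: Vec<Range<u64>>,
    /// The pushed data, parallel to `ranges`.
    buffers: Vec<Vec<u8>>,
}

impl ToyPushBuffers {
    /// Store caller-provided ranges and their data.
    fn push_ranges(&mut self, ranges: Vec<Range<u64>>, buffers: Vec<Vec<u8>>) {
        assert_eq!(ranges.len(), buffers.len(), "one buffer per range");
        self.ranges.extend(ranges);
        self.buffers.extend(buffers);
    }

    /// Total number of bytes currently held.
    fn buffered_bytes(&self) -> usize {
        self.buffers.iter().map(Vec::len).sum()
    }

    /// Drop every buffered range, whether or not the decoder asked for it.
    fn release_all_ranges(&mut self) {
        self.ranges.clear();
        self.buffers.clear();
    }
}

/// Push two ranges eagerly, then release them.
/// Returns the number of bytes held before and after the release.
fn demo() -> (usize, usize) {
    let mut state = ToyPushBuffers::default();
    state.push_ranges(vec![0..4, 100..104], vec![vec![0u8; 4], vec![1u8; 4]]);
    let before = state.buffered_bytes();
    state.release_all_ranges();
    (before, state.buffered_bytes())
}
```

In this sketch the release is unconditional: it drops eagerly pushed data as well as anything the decoder requested, which is the behavior asked for above. Whether the real API should also report how many bytes were released is left open.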



Labels: enhancement