[Parquet] Provide only encrypted column stats in plaintext footer #8304

@rok

As per spec:

In the plaintext footer mode, the optional ColumnMetaData meta_data is set in the ColumnChunk structure for all columns, but is stripped of the statistics for the sensitive (encrypted) columns. These statistics are available for new readers with the column key - they decrypt the encrypted_column_metadata field, described in the section 5.3, and parse it to get statistics and all other column metadata values. The legacy readers are not aware of the encrypted metadata field; they parse the regular (plaintext) field as usual. While they can’t read the data of encrypted columns, they read their metadata to extract the offset and size of encrypted column data, required for column chunk vectorization.

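The current writer also writes statistics for encrypted columns into the plaintext part of the footer. Per the spec, statistics for encrypted columns should be written only into the encrypted part of the footer (the encrypted_column_metadata field) and stripped from the plaintext meta_data.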
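
To make the expected behavior concrete, here is a minimal Python sketch of the writer and reader logic the spec describes. It is an illustration only, not the actual writer code: the `Statistics`, `ColumnMetaData` and `ColumnChunk` classes are simplified stand-ins for the Thrift structures, `build_column_chunk` and `read_statistics` are hypothetical helper names, and `toy_encrypt` is an XOR placeholder (real Parquet encryption uses AES-GCM or AES-GCM-CTR).

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Statistics:
    # Simplified stand-in: real Parquet statistics hold encoded byte values.
    min_value: int
    max_value: int
    null_count: int


@dataclass(frozen=True)
class ColumnMetaData:
    path_in_schema: str
    data_page_offset: int
    total_compressed_size: int
    statistics: Optional[Statistics] = None


@dataclass(frozen=True)
class ColumnChunk:
    meta_data: ColumnMetaData
    encrypted_column_metadata: Optional[bytes] = None


def toy_encrypt(data: bytes, key: bytes) -> bytes:
    # Placeholder XOR cipher, NOT real encryption; it is its own inverse.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def serialize(meta: ColumnMetaData) -> bytes:
    # Hypothetical serialization; the real format is Thrift, not JSON.
    return json.dumps(asdict(meta)).encode("utf-8")


def deserialize(raw: bytes) -> ColumnMetaData:
    fields = json.loads(raw.decode("utf-8"))
    stats = fields.pop("statistics")
    return ColumnMetaData(
        **fields,
        statistics=Statistics(**stats) if stats is not None else None,
    )


def build_column_chunk(full_meta: ColumnMetaData,
                       column_key: Optional[bytes]) -> ColumnChunk:
    """Plaintext footer mode: build the ColumnChunk for one column."""
    if column_key is None:
        # Unencrypted column: statistics stay in the plaintext metadata.
        return ColumnChunk(meta_data=full_meta)
    # Encrypted column: the full metadata (with statistics) is encrypted,
    # and the plaintext copy keeps offsets and sizes but drops statistics.
    sealed = toy_encrypt(serialize(full_meta), column_key)
    stripped = replace(full_meta, statistics=None)
    return ColumnChunk(meta_data=stripped, encrypted_column_metadata=sealed)


def read_statistics(chunk: ColumnChunk,
                    column_key: Optional[bytes] = None) -> Optional[Statistics]:
    """New readers with the key decrypt; legacy readers use the plaintext field."""
    if chunk.encrypted_column_metadata is not None and column_key is not None:
        raw = toy_encrypt(chunk.encrypted_column_metadata, column_key)
        return deserialize(raw).statistics
    return chunk.meta_data.statistics
```

In this sketch a reader without the column key still sees `data_page_offset` and `total_compressed_size` for an encrypted column, but gets `None` for its statistics, which is the behavior this issue asks the writer to produce.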

Labels: enhancement (any new improvement worthy of an entry in the changelog)
