Add Java metadata for PageEncodingStats #1949

@asfimport

Description

PARQUET-384 needs to determine whether an entire column chunk is dictionary-encoded, but that case is difficult to detect from the set of encodings recorded for a column. For 1.0, it can be done by checking for a PLAIN page: both dictionary pages and dictionary-encoded pages use PLAIN_DICTIONARY, and RLE/BIT_PACKING is used only for repetition and definition levels, so PLAIN shows up only when the column has fallen back. For 2.0, however, dictionary pages might themselves use PLAIN, so the presence of PLAIN no longer indicates whether a column has fallen back.

PageEncodingStats were added to the format to solve this problem, so we just need to implement them.
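
To illustrate why per-page stats resolve the ambiguity, here is a minimal, self-contained sketch. The class `DictionaryEncodingCheck`, its nested types, and both method names are hypothetical stand-ins invented for this example, not the parquet-mr API. The only thing taken from the format is the shape of a stats entry (page type, encoding, count). The sketch assumes a chunk counts as fully dictionary-encoded when it has at least one data page and every data page uses a dictionary encoding.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical stand-ins for illustration; not the parquet-mr classes.
class DictionaryEncodingCheck {
    enum Encoding { PLAIN, PLAIN_DICTIONARY, RLE, BIT_PACKED, RLE_DICTIONARY }
    enum PageType { DATA_PAGE, DICTIONARY_PAGE, DATA_PAGE_V2 }

    // One entry per (page type, encoding) pair, with the number of such pages.
    static final class PageEncodingStats {
        final PageType pageType;
        final Encoding encoding;
        final int count;

        PageEncodingStats(PageType pageType, Encoding encoding, int count) {
            this.pageType = pageType;
            this.encoding = encoding;
            this.count = count;
        }
    }

    private static final Set<Encoding> DICTIONARY_ENCODINGS =
            EnumSet.of(Encoding.PLAIN_DICTIONARY, Encoding.RLE_DICTIONARY);

    // 1.0 heuristic from the description: PLAIN in the encoding set means a fallback.
    static boolean isFullyDictionaryEncodedV1(Set<Encoding> encodings) {
        return encodings.contains(Encoding.PLAIN_DICTIONARY)
                && !encodings.contains(Encoding.PLAIN);
    }

    // With per-page stats: ignore the dictionary page itself and require that
    // every data page uses a dictionary encoding.
    static boolean isFullyDictionaryEncoded(List<PageEncodingStats> stats) {
        boolean sawDataPage = false;
        for (PageEncodingStats s : stats) {
            if (s.pageType == PageType.DICTIONARY_PAGE || s.count == 0) {
                continue; // the dictionary page's own encoding does not matter
            }
            sawDataPage = true;
            if (!DICTIONARY_ENCODINGS.contains(s.encoding)) {
                return false; // at least one data page fell back
            }
        }
        return sawDataPage;
    }
}
```

In this sketch, a 2.0 chunk with a PLAIN dictionary page and only RLE_DICTIONARY data pages passes the stats-based check, while the same chunk with an additional PLAIN data page does not, even though both chunks would report the same set of encodings.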

Reporter: Ryan Blue / @rdblue
Assignee: Ryan Blue / @rdblue

Note: This issue was originally created as PARQUET-548. Please see the migration documentation for further details.
