[Parquet] Plain encoded boolean column chunks limited to 2048 values #48

Closed
alamb opened this issue Apr 26, 2021 · 1 comment
Labels
parquet Changes to the parquet crate

alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-6189

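encoding::PlainEncoder::new creates a BitWriter with 256 bytes of storage. Plain-encoded booleans are bit-packed at one bit per value, so 256 bytes hold at most 256 × 8 = 2048 values, which limits the data page size that can be used.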

I suggest that, in impl Encoder for PlainEncoder, the return value of put_value be tested and the BitWriter flushed and cleared whenever it runs out of space.
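The suggestion above could look roughly like the following. This is a minimal, self-contained sketch and not the parquet crate's code: `FixedBitWriter`, `flush_and_clear`, and `encode_bools` are hypothetical names made up for illustration, and only the idea of checking the `put_value` return value comes from the issue. It assumes bits are packed least-significant-bit first.

```rust
/// Hypothetical fixed-capacity bit buffer, loosely modelled on the BitWriter
/// described in this issue. It is NOT the parquet crate's actual type.
struct FixedBitWriter {
    buf: Vec<u8>,   // fixed-size storage, never grows
    bit_len: usize, // number of bits written so far
}

impl FixedBitWriter {
    fn new(capacity_bytes: usize) -> Self {
        Self { buf: vec![0; capacity_bytes], bit_len: 0 }
    }

    /// Writes one bit. Returns false, writing nothing, if the buffer is full.
    fn put_value(&mut self, bit: bool) -> bool {
        if self.bit_len == self.buf.len() * 8 {
            return false;
        }
        if bit {
            // Assumed layout: least-significant bit of each byte first.
            self.buf[self.bit_len / 8] |= 1 << (self.bit_len % 8);
        }
        self.bit_len += 1;
        true
    }

    /// Returns the bytes written so far (rounded up to whole bytes)
    /// and resets the writer so it can be reused.
    fn flush_and_clear(&mut self) -> Vec<u8> {
        let used = (self.bit_len + 7) / 8;
        let out = self.buf[..used].to_vec();
        self.buf.iter_mut().for_each(|b| *b = 0);
        self.bit_len = 0;
        out
    }
}

/// Encodes booleans of any length by draining the small writer
/// whenever put_value reports that it is out of space.
fn encode_bools(values: &[bool], capacity_bytes: usize) -> Vec<u8> {
    let mut writer = FixedBitWriter::new(capacity_bytes);
    let mut out = Vec::new();
    for &v in values {
        if !writer.put_value(v) {
            // A full buffer holds a whole number of bytes, so draining
            // here keeps the bit packing contiguous across flushes.
            out.extend(writer.flush_and_clear());
            let ok = writer.put_value(v);
            assert!(ok, "capacity must be at least one byte");
        }
    }
    // Drain whatever remains, including a partially filled last byte.
    out.extend(writer.flush_and_clear());
    out
}
```

With a 256-byte writer, `encode_bools` accepts more than 2048 values because the full buffer is drained into the output before the next bit is written. Growing the buffer on demand would be an alternative design; the sketch follows the flush-and-clear approach proposed here.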

alamb added the arrow (Changes to the arrow crate) and parquet (Changes to the parquet crate) labels and removed the arrow label on Apr 26, 2021

nevi-me commented Jun 16, 2021

Fixed by #443

nevi-me closed this as completed on Jun 16, 2021