ARROW2: Optimize parquet read memory usage #1657
Comments
It looks like the related upstream issue is now closed: jorgecarleitao/arrow2#768
I think we can close this issue, since arrow2 and arrow will be merged in the future.
Describe the bug
First reported by @ic4y at #1556 (comment).
This is also causing the TPC-H q7 benchmark to fail with an OOM error in #1652 (comment).
To Reproduce
Compare peak memory usage between commits 2008b1d and c0c9c72 when processing a Parquet table.
Expected behavior
Memory usage should be on par with arrow-rs. Alternatively, arrow2 should offer an option that lets users trade off memory usage against array segmentation.
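To illustrate the tradeoff being asked for, here is a minimal sketch using only the standard library. `plan_chunks` and its `chunk_size` parameter are hypothetical and are not part of the arrow2 API; the sketch only assumes that a reader could emit arrays of at most `chunk_size` rows, so that a smaller limit bounds the rows held in memory at once at the cost of producing more array segments.

```rust
/// Hypothetical helper (not part of arrow2): splits `total_rows` into chunks
/// of at most `chunk_size` rows and returns the row count of each chunk.
fn plan_chunks(total_rows: usize, chunk_size: usize) -> Vec<usize> {
    assert!(chunk_size > 0, "chunk_size must be positive");
    let mut chunks = Vec::new();
    let mut remaining = total_rows;
    while remaining > 0 {
        // Take a full chunk, or whatever is left for the final one.
        let n = remaining.min(chunk_size);
        chunks.push(n);
        remaining -= n;
    }
    chunks
}

fn main() {
    let total_rows = 1_000_000;
    // Smaller chunks: lower peak row count, but more array segments.
    for chunk_size in [8_192, 65_536, 1_000_000] {
        let chunks = plan_chunks(total_rows, chunk_size);
        let peak = chunks.iter().copied().max().unwrap_or(0);
        println!(
            "chunk_size={chunk_size}: {} arrays, at most {} rows buffered at once",
            chunks.len(),
            peak
        );
    }
}
```

With the largest setting the whole column ends up in a single array, which is the least segmented layout but also the one with the highest peak row count.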
Additional context
Related upstream issue: jorgecarleitao/arrow2#768