fix(rust/python): `optimize.compact` not working with tables with mixed large/normal arrow #1926
optimize.compact
not working with tables with mixed large and normal arrow schemas
I don't have an opinion on the specific issue being addressed here, but making `cast_record_batch` a public API is fine with me.
@rtyler Yes, `cast_record_batch` is needed because we have writers that can write large Arrow data into a Parquet file. According to Will, the Arrow writers serialize the Arrow schema into the Parquet metadata, so when we re-read these files, some record batches may come back with the large dtypes while others do not.
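To illustrate the mismatch described above, here is a minimal stdlib-only Python sketch. It is a toy model, not the actual Rust `cast_record_batch` implementation: batches are plain tuples, the type names (`utf8`, `large_utf8`, and so on) are just string labels, and the helpers `cast_batch` and `compact` are hypothetical names chosen for this example. The idea it shows is that each batch is cast to one target schema before the batches are merged.

```python
# Toy model (assumption): a "batch" is (schema, rows), where schema maps
# column name -> a type label. Real Arrow types are not used here.
LARGE_TO_NORMAL = {
    "large_utf8": "utf8",
    "large_binary": "binary",
    "large_list": "list",
}


def normalize_type(type_name):
    """Map a 'large' type label to its normal counterpart; leave others as-is."""
    return LARGE_TO_NORMAL.get(type_name, type_name)


def cast_batch(batch, target_schema):
    """Return the batch relabelled with target_schema (hypothetical helper).

    Only large/normal differences are accepted; any other type difference
    or a column-name mismatch raises ValueError.
    """
    schema, rows = batch
    if set(schema) != set(target_schema):
        raise ValueError("column names differ")
    for name, type_name in schema.items():
        if normalize_type(type_name) != normalize_type(target_schema[name]):
            raise ValueError(
                f"cannot cast column {name!r}: {type_name} -> {target_schema[name]}"
            )
    return (dict(target_schema), rows)


def compact(batches, target_schema):
    """Cast every batch to target_schema, then merge the rows into one batch."""
    merged_rows = []
    for batch in batches:
        _, rows = cast_batch(batch, target_schema)
        merged_rows.extend(rows)
    return (dict(target_schema), merged_rows)


# Two batches read back from different files: one normal, one large.
batch_a = ({"id": "int64", "name": "utf8"}, [(1, "a")])
batch_b = ({"id": "int64", "name": "large_utf8"}, [(2, "b")])
table_schema = {"id": "int64", "name": "utf8"}

merged = compact([batch_a, batch_b], table_schema)
```

Without the cast step, `batch_a` and `batch_b` carry different schemas and cannot be merged as-is, which is the failure mode this PR addresses.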
dependency A dependency from optimize on the cast_record_batch function was added which cannot be met without the `datafusion` feature enabled See delta-io#1926