Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
In #8005 we added the capability to have the ArrowWriter accept record batches containing columns that are either the native array type, or a dictionary of values containing the same Arrow DataType.
For example, RecordBatch A containing column col of type DataType::Utf8 and RecordBatch B containing column col of type DataType::Dictionary<_, DataType::Utf8> can both be written by the same writer.
We can further improve the writer's ability to detect data types that are logically equivalent, for example String and LargeString, or String, LargeString, and StringView.
Describe the solution you'd like
When the ArrowColumnWriter checks whether the type of the array being written is compatible with its field, the logic should account for all types that are logically equivalent (e.g. array types that contain the same type of value).
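To illustrate the kind of check this might involve, here is a rough sketch. It does not use the real arrow-rs types: `SimpleType`, `LogicalKind`, `logical_kind`, and `is_logically_compatible` are hypothetical names invented for this example, and `SimpleType` is only a simplified stand-in for a small subset of `DataType`. The actual compatibility logic in ArrowColumnWriter may look quite different.

```rust
/// Hypothetical, simplified stand-in for a small subset of arrow's `DataType`.
/// This is NOT the real arrow-rs enum; it exists only to illustrate the idea.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum SimpleType {
    Int32,
    Utf8,
    LargeUtf8,
    Utf8View,
    /// Dictionary(key type, value type)
    Dictionary(Box<SimpleType>, Box<SimpleType>),
}

/// Hypothetical "logical" classification: physical types that hold the
/// same kind of value map to the same variant.
#[derive(Debug, PartialEq)]
enum LogicalKind {
    Int32,
    String,
}

/// Reduce a physical type to its logical kind.
fn logical_kind(t: &SimpleType) -> LogicalKind {
    match t {
        // A dictionary is logically equivalent to its value type,
        // so recurse into the value type and ignore the key type.
        SimpleType::Dictionary(_key, value) => logical_kind(value),
        // All string encodings hold the same kind of value.
        SimpleType::Utf8 | SimpleType::LargeUtf8 | SimpleType::Utf8View => LogicalKind::String,
        SimpleType::Int32 => LogicalKind::Int32,
    }
}

/// Two types are treated as compatible when they reduce to the same logical kind.
fn is_logically_compatible(field_type: &SimpleType, array_type: &SimpleType) -> bool {
    logical_kind(field_type) == logical_kind(array_type)
}
```

Under these assumptions, `is_logically_compatible(&SimpleType::Utf8, &SimpleType::LargeUtf8)` is true, a dictionary with Utf8 values is compatible with Utf8View, and Utf8 is not compatible with Int32.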
Describe alternatives you've considered
Additional context
Related discussion: #8005 (review)