Support union arrays in concat_tables
#44397
Comments
One problem with my implementation above (besides the fact that it's done in Python rather than on a lower level in the Arrow engine) is that

    try:
        result[column_name] = concat_tables(
            tables=[t.select([column_name]) for t in tables],
            promote_options="permissive",
        )[column_name]
    except ArrowTypeError:
        ...

But that still feels hacky...
Describe the enhancement requested
It would be nice to have another step up from promote_options="permissive", e.g., promote_options="union", which uses dense unions when columns are heterogeneous across schemas. For example:
The latter should use a dense union for column "a".
I've implemented this myself, but it's hard to do because there is no is_mergeable function that exposes the logic used by concat_tables(tables=…, promote_options="permissive"). I have to re-implement that logic, either by guesswork or with lots of try-excepts. Here is a rough attempt, which works for some cases but not all. It also does not preserve metadata or support missing columns:
Component(s)
Python