DataFrame.except() does not work with structs in schema #10749
Comments
I only skimmed this quickly, so I might be wrong here, but might the issue be rooted in arrow-rs itself:
The next arrow-rs release will support nested comparison apache/arrow-rs#5792
arrow 52 has been released with the fix but the underlying
…would work This relies on newer functionality in arrow 52 and allows DataFrame.except() to properly work on schemas with structs and lists Closes apache#10749
It seems the distinct kernel still does not support nested comparisons (which is fine), which is why #11117 was needed: https://github.com/apache/arrow-rs/blob/main/arrow-ord/src/cmp.rs#L235-L238 I filed apache/arrow-rs#5960 to track adding this.
…would work (apache#11117) This relies on newer functionality in arrow 52 and allows DataFrame.except() to properly work on schemas with structs and lists Closes apache#10749
Describe the bug
When taking two DataFrame objects and running except(), the function fails when there are structs in the schema, but succeeds with simpler schemas. For example, this works:
To Reproduce
Expected behavior
I would expect the above to pass its assertions; instead, this output is produced:
Additional context
I should also note that I tested this with DataFusion 37 and 38, with the same results.
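For reference, the behavior I expect from except() on struct columns can be modeled in plain Rust. This is only a sketch using the standard library: it is not DataFusion's or arrow-rs's implementation. The `Value` enum, the `except_rows` function, and the `point` helper are hypothetical names made up for illustration, and NULL handling is left out. The point is that struct and list values should compare field by field, so a row is dropped only when every nested value matches a row on the right-hand side.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a column value; `Struct` and `List` nest recursively.
/// Deriving `Eq` and `Hash` gives field-by-field (nested) comparison for free.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Value {
    Int(i64),
    Utf8(String),
    Struct(Vec<(String, Value)>),
    List(Vec<Value>),
}

type Row = Vec<Value>;

/// Rows of `left` that do not appear in `right`, de-duplicated and kept in
/// first-seen order (a model of SQL EXCEPT semantics, ignoring NULLs).
fn except_rows(left: &[Row], right: &[Row]) -> Vec<Row> {
    let exclude: HashSet<&Row> = right.iter().collect();
    let mut seen: HashSet<&Row> = HashSet::new();
    let mut out = Vec::new();
    for row in left {
        // Keep the row only if it is absent from `right` and not yet emitted.
        if !exclude.contains(&row) && seen.insert(row) {
            out.push(row.clone());
        }
    }
    out
}

/// Hypothetical helper that builds a two-field struct value.
fn point(x: i64, label: &str) -> Value {
    Value::Struct(vec![
        ("x".to_string(), Value::Int(x)),
        ("label".to_string(), Value::Utf8(label.to_string())),
    ])
}

fn main() {
    let left = vec![
        vec![Value::Int(1), point(10, "a")],
        vec![Value::Int(2), point(20, "b")],
    ];
    let right = vec![vec![Value::Int(1), point(10, "a")]];

    // Only the row whose struct column differs from every right-hand row survives.
    let diff = except_rows(&left, &right);
    assert_eq!(diff, vec![vec![Value::Int(2), point(20, "b")]]);
    println!("{:?}", diff);
}
```

In this model the nested comparison falls out of the derived `Eq`/`Hash` implementations; the comments above suggest the real fix depends on arrow-rs supporting the equivalent comparison for nested types.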