-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support correct output column names and struct field names when consuming/producing Substrait #10829
Support correct output column names and struct field names when consuming/producing Substrait #10829
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me -- thank you @Blizzara (kudos for the tests especially)
I think this PR has some merge conflicts, but once those are resolved we can merge it in
\n TableScan: data projection=[a]", | ||
false // NULL vs Int64(NULL) | ||
true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
false, // "List(..)" vs "make_array(..)" | ||
) | ||
.await | ||
roundtrip("SELECT [[1,2,3], [], NULL, [NULL]] FROM data").await |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified that roundtrip
actually is more stringent than assert_expected_plan
as it checks the plan before and after roundtripping 👍
e4b4b3d
to
6c1bb88
Compare
(assert_eq gives nicer error messages than assert)
🚀 -- thanks again @Blizzara |
…ming/producing Substrait (apache#10829) * produce flattened list of names including inner struct fields * add a (failing) test * rename output columns (incl. inner struct fields) according to the given list of names * fix a test * add column names project to the new TPC-H test and fix case (assert_eq gives nicer error messages than assert)
Which issue does this PR close?
Closes #10817
Rationale for this change
What changes are included in this PR?
names
listAre these changes tested?
Tested by a new test specifically for testing Alias (" AS "), as well as tightening existing tests.
Are there any user-facing changes?
Plans produced by DataFusion before this change are not compatible with plans produced after this change, if the plan contains inner struct fields, since the length of the "names" field does not match. The previous behavior was wrong, but still it's a breaking change. If that's considered important, it is possible to add an extra handling for those cases, but I'd prefer not to unless necessary.
Also some output column names might change if people are depending on the old behavior that did not rename them.