Minor: Add sql test for UNION / UNION ALL + plans #7787

Merged: 1 commit, Oct 11, 2023
74 changes: 74 additions & 0 deletions datafusion/sqllogictest/test_files/union.slt
@@ -174,6 +174,80 @@ UNION ALL
Alice
John

# nested_union
query T rowsort
SELECT name FROM t1 UNION (SELECT name from t2 UNION SELECT name || '_new' from t2)
----
Alex
Alex_new
Alice
Bob
Bob_new
John
John_new

# should be un-nested
# https://github.com/apache/arrow-datafusion/issues/7786
query TT
EXPLAIN SELECT name FROM t1 UNION (SELECT name from t2 UNION SELECT name || '_new' from t2)
----
logical_plan
Aggregate: groupBy=[[t1.name]], aggr=[[]]
--Union
----TableScan: t1 projection=[name]
----Aggregate: groupBy=[[t2.name]], aggr=[[]]
------Union
--------TableScan: t2 projection=[name]
--------Projection: t2.name || Utf8("_new") AS name
----------TableScan: t2 projection=[name]
physical_plan
AggregateExec: mode=FinalPartitioned, gby=[name@0 as name], aggr=[]
--CoalesceBatchesExec: target_batch_size=8192
----RepartitionExec: partitioning=Hash([name@0], 4), input_partitions=8
------AggregateExec: mode=Partial, gby=[name@0 as name], aggr=[]
--------UnionExec
----------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
----------AggregateExec: mode=FinalPartitioned, gby=[name@0 as name], aggr=[]
------------CoalesceBatchesExec: target_batch_size=8192
--------------RepartitionExec: partitioning=Hash([name@0], 4), input_partitions=8
----------------AggregateExec: mode=Partial, gby=[name@0 as name], aggr=[]
------------------UnionExec
--------------------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
--------------------ProjectionExec: expr=[name@0 || _new as name]
----------------------MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
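
The `# should be un-nested` note above rests on the observation that de-duplicating the inner UNION is redundant when the outer UNION de-duplicates again. A minimal Python sketch of that equivalence follows. The table contents are assumptions inferred from the expected results in this file, and `union_distinct` is a hypothetical helper for illustration, not a DataFusion API.

```python
# Assumed table contents, inferred from the expected results of the tests above.
t1 = ["Alex", "Alice", "Bob"]
t2 = ["Alex", "Bob", "John"]
t2_new = [name + "_new" for name in t2]  # models: SELECT name || '_new' FROM t2


def union_distinct(*inputs):
    """Hypothetical helper: concatenate all inputs, then drop duplicates (SQL UNION)."""
    return sorted({row for rows in inputs for row in rows})


# Nested form, as written in the query: t1 UNION (t2 UNION t2_new)
nested = union_distinct(t1, union_distinct(t2, t2_new))

# Un-nested form: one de-duplication over all three inputs
flat = union_distinct(t1, t2, t2_new)

print(nested == flat)
print(flat)
```

Both forms yield the same rows, which suggests the inner `Aggregate` in the plan above could be dropped without changing the result. Whether and how the optimizer does so is the subject of the linked issue.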

# nested_union_all
query T rowsort
SELECT name FROM t1 UNION ALL (SELECT name from t2 UNION ALL SELECT name || '_new' from t2)
----
Alex
Alex
Alex_new
Alice
Bob
Bob
Bob_new
John
John_new

# Plan is unnested
query TT
EXPLAIN SELECT name FROM t1 UNION ALL (SELECT name from t2 UNION ALL SELECT name || '_new' from t2)
Review comment (Contributor Author):
This test passes even without #7695, because the flattening already happens in the SQL pass. #7695 moves the flattening so that it happens as part of the normal optimizer flow.

----
logical_plan
Union
--TableScan: t1 projection=[name]
--TableScan: t2 projection=[name]
--Projection: t2.name || Utf8("_new") AS name
----TableScan: t2 projection=[name]
physical_plan
UnionExec
--MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
--MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
--ProjectionExec: expr=[name@0 || _new as name]
----MemoryExec: partitions=4, partition_sizes=[1, 0, 0, 0]
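
The flattening that the review comment on this test refers to can be pictured as splicing the inputs of a nested `Union` node into its parent. The sketch below is a toy model in Python under stated assumptions: `Scan`, `Union`, and `flatten_union` are hypothetical names invented for illustration and do not mirror DataFusion's actual plan types or optimizer rules.

```python
from dataclasses import dataclass, field


@dataclass
class Scan:
    """Toy leaf node standing in for a TableScan (or a Projection over one)."""
    table: str


@dataclass
class Union:
    """Toy UNION ALL node: concatenates its inputs without de-duplication."""
    inputs: list = field(default_factory=list)


def flatten_union(plan):
    """Hypothetical rewrite: splice the inputs of nested Union nodes into the parent."""
    if not isinstance(plan, Union):
        return plan
    flat_inputs = []
    for child in plan.inputs:
        child = flatten_union(child)  # flatten bottom-up
        if isinstance(child, Union):
            flat_inputs.extend(child.inputs)
        else:
            flat_inputs.append(child)
    return Union(flat_inputs)


# t1 UNION ALL (t2 UNION ALL t2_new); "t2_new" stands in for the projection over t2
nested_plan = Union([Scan("t1"), Union([Scan("t2"), Scan("t2_new")])])
flat_plan = flatten_union(nested_plan)
print(flat_plan)
```

The flattened toy plan has a single `Union` with three inputs, the same shape as the `logical_plan` expected by this test.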


# union_with_type_coercion
query TT
explain