[SPARK-42525][SQL] Collapse two adjacent windows with semantically-same partition/order #40115
@zml1206 Could you update the PR title to
Merged to master.
The change LGTM, but the PR title is a bit confusing. How is it related to subquery?
A subquery is one of the cases where the qualifiers are different. It is indeed confusing. Is there anything else I should change?
…ion/order in subquery
### What changes were proposed in this pull request?
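Extend the `CollapseWindow` rule to collapse two adjacent Window nodes when one of them comes from a subquery, so that their partition/order specs are semantically the same but not textually identical.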
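To make the idea concrete, here is a minimal, self-contained sketch in Python. It is a toy model, not Spark's Scala implementation: the names `Attr`, `WindowExpr`, `Window`, `Relation`, `semantic_equals` and `collapse_windows` are all hypothetical, order specs are simplified to plain attributes, and "semantic equality" is assumed to mean "same expression ID, qualifier ignored". It only illustrates why comparing specs semantically lets the two windows merge when a subquery alias changes the qualifier.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Attr:
    """Toy attribute reference (hypothetical, not Spark's AttributeReference)."""
    name: str
    expr_id: int
    qualifier: Tuple[str, ...] = ()

    def semantic_equals(self, other: "Attr") -> bool:
        # Assumption: only the expression ID matters; the qualifier is cosmetic.
        return self.expr_id == other.expr_id


@dataclass(frozen=True)
class WindowExpr:
    func: str                       # e.g. "rank" or "row_number"
    output: Attr                    # attribute produced by this expression
    inputs: Tuple[Attr, ...] = ()   # attributes the function reads


@dataclass(frozen=True)
class Relation:
    output: Tuple[Attr, ...]


@dataclass(frozen=True)
class Window:
    window_exprs: Tuple[WindowExpr, ...]
    partition_spec: Tuple[Attr, ...]
    order_spec: Tuple[Attr, ...]    # simplified: sort direction is omitted
    child: object


def same_spec(xs, ys) -> bool:
    """Element-wise semantic comparison of two partition or order specs."""
    return len(xs) == len(ys) and all(x.semantic_equals(y) for x, y in zip(xs, ys))


def collapse_windows(plan):
    """Bottom-up: merge a Window into its Window child when the specs match
    semantically and the parent does not read anything the child computes."""
    if not isinstance(plan, Window):
        return plan
    child = collapse_windows(plan.child)
    if isinstance(child, Window):
        child_outputs = {e.output.expr_id for e in child.window_exprs}
        parent_refs = {a.expr_id for e in plan.window_exprs for a in e.inputs}
        parent_refs |= {a.expr_id for a in plan.partition_spec + plan.order_spec}
        if (child_outputs.isdisjoint(parent_refs)
                and same_spec(plan.partition_spec, child.partition_spec)
                and same_spec(plan.order_spec, child.order_spec)):
            return Window(child.window_exprs + plan.window_exprs,
                          child.partition_spec, child.order_spec, child.child)
    return Window(plan.window_exprs, plan.partition_spec, plan.order_spec, child)


# Inner query sees a#11 and b#12 unqualified; the outer query sees them via alias t2.
a_in, b_in = Attr("a", 11), Attr("b", 12)
a_out, b_out = Attr("a", 11, ("t2",)), Attr("b", 12, ("t2",))
c, d = Attr("c", 25), Attr("d", 26)

inner = Window((WindowExpr("rank", c, (b_in,)),), (a_in,), (b_in,), Relation((a_in, b_in)))
outer = Window((WindowExpr("row_number", d),), (a_out,), (b_out,), inner)

merged = collapse_windows(outer)
print(a_in == a_out)                          # False: plain equality sees the qualifier
print([e.func for e in merged.window_exprs])  # ['rank', 'row_number']
```

In this model, plain equality on the specs would reject the merge because `a_out` carries the `t2` qualifier, while the semantic comparison accepts it, which mirrors the before/after plans in the next section.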
### Why are the changes needed?
```
select a, b, c, row_number() over (partition by a order by b) as d from
( select a, b, rank() over (partition by a order by b) as c from t1) t2

== Optimized Logical Plan ==
before
Window [row_number() windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS d#26], [a#11], [b#12 ASC NULLS FIRST]
+- Window [rank(b#12) windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS c#25], [a#11], [b#12 ASC NULLS FIRST]
   +- InMemoryRelation [a#11, b#12], StorageLevel(disk, memory, deserialized, 1 replicas)
      +- *(1) Project [_1#6 AS a#11, _2#7 AS b#12]
         +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#6, knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2 AS _2#7]
            +- *(1) MapElements org.apache.spark.sql.DataFrameSuite$$Lambda$1517/16288483683a479fda, obj#5: scala.Tuple2
               +- *(1) DeserializeToObject staticinvoke(class java.lang.Long, ObjectType(class java.lang.Long), valueOf, id#0L, true, false, true), obj#4: java.lang.Long
                  +- *(1) Range (0, 10, step=1, splits=2)

after
Window [rank(b#12) windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS c#25, row_number() windowspecdefinition(a#11, b#12 ASC NULLS FIRST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS d#26], [a#11], [b#12 ASC NULLS FIRST]
+- InMemoryRelation [a#11, b#12], StorageLevel(disk, memory, deserialized, 1 replicas)
   +- *(1) Project [_1#6 AS a#11, _2#7 AS b#12]
      +- *(1) SerializeFromObject [knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._1 AS _1#6, knownnotnull(assertnotnull(input[0, scala.Tuple2, true]))._2 AS _2#7]
         +- *(1) MapElements org.apache.spark.sql.DataFrameSuite$$Lambda$1518/19280286724d7a64ca, obj#5: scala.Tuple2
            +- *(1) DeserializeToObject staticinvoke(class java.lang.Long, ObjectType(class java.lang.Long), valueOf, id#0L, true, false, true), obj#4: java.lang.Long
               +- *(1) Range (0, 10, step=1, splits=2)
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
UT
Closes apache#40115 from zml1206/SPARK-42525.
Authored-by: zml1206 <zhuml1206@gmail.com>
Signed-off-by: Yuming Wang <yumwang@ebay.com>
# Conflicts:
# sql/core/src/test/scala/org/apache/spark/sql/DataFrameWindowFunctionsSuite.scala