[SPARK-43611][SQL][PS][CONNECT] Make ExtractWindowExpressions retain the PLAN_ID_TAG
#42086
I have checked with @cloud-fan, and we might have to modify the rules one by one.

Great! Could you help create tickets for the remaining 3~4 rules under SPARK-42497?

TBH, I am not sure which rules need to be modified for the remaining UTs; it needs further investigation.

Got it. I just created a single ticket, SPARK-44492, to track the remaining, not-yet-identified tests so we don't miss them.

I am fine with this as a workaround for now, but an implementation that depends on tags is somewhat fragile. IIRC, the tags are easily lost when you, e.g., copy the expressions.
other.getTagValue(tag)
  .foreach(this.setTagValue(tag, _))
Suggested change:
other.getTagValue(tag).foreach(this.setTagValue(tag, _))
shall we copy all tags?
sounds good, let me have a try
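
The difference between copying one tag (the one-liner above) and copying all tags can be sketched with a toy node class. This is only an illustration: `Tag`, `Node`, `copyTagFrom`, and `copyAllTagsFrom` are hypothetical names invented for this sketch and are not Spark's `TreeNode` API.

```scala
import scala.collection.mutable

// Toy stand-ins for illustration only; NOT Spark's TreeNode or TreeNodeTag.
final case class Tag[T](name: String)

class Node(val label: String) {
  // Tags live in a mutable side map, separate from the node's own fields.
  private val tags = mutable.Map.empty[Tag[_], Any]

  def setTagValue[T](tag: Tag[T], value: T): Unit = tags.update(tag, value)

  def getTagValue[T](tag: Tag[T]): Option[T] =
    tags.get(tag).map(_.asInstanceOf[T])

  // Copy a single tag, mirroring the one-liner in the review suggestion.
  def copyTagFrom[T](other: Node, tag: Tag[T]): Unit =
    other.getTagValue(tag).foreach(this.setTagValue(tag, _))

  // Copy every tag from the other node (hypothetical helper).
  def copyAllTagsFrom(other: Node): Unit = tags ++= other.tags
}

object TagCopyDemo {
  val PlanId: Tag[Long] = Tag[Long]("plan_id")
  val Hint: Tag[String] = Tag[String]("hint")

  def main(args: Array[String]): Unit = {
    val original = new Node("original")
    original.setTagValue(PlanId, 7L)
    original.setTagValue(Hint, "broadcast")

    val single = new Node("single")
    single.copyTagFrom(original, PlanId) // only PlanId is carried over

    val all = new Node("all")
    all.copyAllTagsFrom(original) // PlanId and Hint are both carried over

    println(single.getTagValue(Hint))
    println(all.getTagValue(Hint))
  }
}
```

Copying all tags avoids having to revisit the rule each time a new tag becomes relevant, at the cost of also propagating tags that may not make sense on the rewritten node.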
cc @cloud-fan all tests passed, would you mind taking another look?

thanks! merged to master

Thanks!
### What changes were proposed in this pull request?
Enable more tests; they were excluded from #42086 due to flaky CI issues.

### Why are the changes needed?
For test parity.

### Does this PR introduce _any_ user-facing change?
No, test-only.

### How was this patch tested?
Enabled tests.

Closes #42182 from zhengruifeng/spark_43611_followup.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>

Should we merge this to 3.5 too?

@HyukjinKwon not sure, maybe not needed?

…in the `PLAN_ID_TAG`

### What changes were proposed in this pull request?
Make rule `ExtractWindowExpressions` retain the `PLAN_ID_TAG`.

### Why are the changes needed?
In apache#39925, we introduced a new mechanism to resolve an expression against a specified plan. However, the plan ID is sometimes discarded by analyzer rules, and then some expressions cannot be resolved correctly. This issue is the main blocker of PS on Connect.

### Does this PR introduce _any_ user-facing change?
Yes, a lot of pandas APIs are enabled.

### How was this patch tested?
Enabled UTs.

Closes apache#42086 from zhengruifeng/ps_connect_analyze_window.

Authored-by: Ruifeng Zheng <ruifengz@apache.org>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>

…wExpressions` & `WidenSetOperationTypes` retain the `PLAN_ID_TAG`

### What changes were proposed in this pull request?
Backport of #42086 and #42230.

### Why are the changes needed?
For functionality parity.

### Does this PR introduce _any_ user-facing change?
Yes, it enables a couple of pandas APIs.

### How was this patch tested?
Enabled the existing UTs.

Closes #42252 from itholic/SPARK-43611-3.5.

Lead-authored-by: Ruifeng Zheng <ruifengz@apache.org>
Co-authored-by: itholic <haejoon.lee@databricks.com>
Signed-off-by: Ruifeng Zheng <ruifengz@apache.org>


What changes were proposed in this pull request?
Make rule `ExtractWindowExpressions` retain the `PLAN_ID_TAG`.
Why are the changes needed?
In #39925, we introduced a new mechanism to resolve an expression against a specified plan.
However, the plan ID is sometimes discarded by analyzer rules, and then some expressions cannot be resolved correctly. This issue is the main blocker of PS on Connect.
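
As a rough illustration of the problem described here, the sketch below models a rule that replaces a tagged plan node with a freshly built one. Everything in it is hypothetical (the `Plan` classes, the string-keyed `tags` map, the `win_` column prefix, and the `retainTag` switch); it is a toy model of the failure mode, not the actual `ExtractWindowExpressions` implementation.

```scala
import scala.collection.mutable

// Hypothetical miniature plan tree; NOT Spark's LogicalPlan classes.
sealed trait Plan {
  // Side-channel metadata that is not part of the case-class fields.
  val tags: mutable.Map[String, Long] = mutable.Map.empty
  def child: Option[Plan]
}
final case class Relation(name: String) extends Plan {
  def child: Option[Plan] = None
}
final case class Project(cols: Seq[String], input: Plan) extends Plan {
  def child: Option[Plan] = Some(input)
}
final case class Window(cols: Seq[String], input: Plan) extends Plan {
  def child: Option[Plan] = Some(input)
}

object ExtractWindowSketch {
  val PlanIdTag = "plan_id" // assumed tag name, for illustration only

  // Rewrites Project(cols, child) into Project(cols, Window(winCols, child)).
  // The result is a brand-new Project, so its tag map starts out empty.
  def rewrite(p: Project, retainTag: Boolean): Plan = {
    val windowCols = p.cols.filter(_.startsWith("win_"))
    if (windowCols.isEmpty) p
    else {
      val newPlan = Project(p.cols, Window(windowCols, p.input))
      // Without this step the plan ID is silently dropped by the rewrite.
      if (retainTag) p.tags.get(PlanIdTag).foreach(newPlan.tags.update(PlanIdTag, _))
      newPlan
    }
  }

  // Looks up the node carrying the given plan ID, walking down the tree.
  def findByPlanId(plan: Plan, id: Long): Option[Plan] =
    if (plan.tags.get(PlanIdTag).contains(id)) Some(plan)
    else plan.child.flatMap(findByPlanId(_, id))
}
```

In this toy model, a lookup by plan ID on the output of `rewrite(p, retainTag = false)` finds nothing, while the same lookup succeeds with `retainTag = true`, which is the behavior this change aims for.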
Does this PR introduce any user-facing change?
Yes, a lot of pandas APIs are enabled.
How was this patch tested?
Enabled UTs.