-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-12728][SQL] Integrates SQL generation with native view #10733
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-12728][SQL] Integrates SQL generation with native view #10733
Conversation
|
cc @cloud-fan @yhuai |
|
Test build #49287 has finished for PR 10733 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also explain the fallback behaviour?
|
Do we have a test that fails with the current native view but works after your PR? |
|
@cloud-fan "CTE within view" is such a test case. |
|
Test build #49320 has finished for PR 10733 at commit
|
d042c3f to
0d9d55b
Compare
|
retest this please |
|
Test build #49323 has finished for PR 10733 at commit
|
|
LGTM |
|
Thanks @liancheng. I'd like to also review it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we enable it by default? So, we can get it tested more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, should we enable nativeView as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm for enabling canonical native view by default, but not sure whether native view should be enabled by default. After all, native view is still feature incomplete, no matter canonicalized or not.
|
Test build #49567 has finished for PR 10733 at commit
|
c9e9c1b to
3c50fd6
Compare
|
Test build #49589 has finished for PR 10733 at commit
|
|
test this please |
|
Test build #50108 has finished for PR 10733 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to call zip? We need to check the number of fields, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also have a test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an invariant condition of CreateViewAsSelect that these two have the same size. Users can't construct a case that violates this condition, thus adding test for this might not be necessary. I'm for adding a check here though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm, we already have the test at https://github.com/apache/spark/pull/10733/files#diff-074b1d8480e0d0d7c212bc4461f3d4acR43.
|
Overall looks good. Left two comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drops view vewName...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, nice catch, thanks!
|
Test build #50125 has finished for PR 10733 at commit
|
|
Test build #50128 has finished for PR 10733 at commit
|
|
retest this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry. I missed this. How about we also have a check for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that we do the check at https://github.com/apache/spark/pull/10733/files#diff-074b1d8480e0d0d7c212bc4461f3d4acR43 (assert(tableDesc.schema == Nil || tableDesc.schema.length == childSchema.length))
|
Test build #50140 has finished for PR 10733 at commit
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry.... Actually we do not need this check. We have already done it at https://github.com/apache/spark/pull/10733/files#diff-074b1d8480e0d0d7c212bc4461f3d4acR43....
|
Test build #50162 has finished for PR 10733 at commit
|
|
LGTM. Merging to master. |
This PR is a follow-up of PR #10541. It integrates the newly introduced SQL generation feature with native view to make native view canonical.
In this PR, a new SQL option
spark.sql.nativeView.canonicalis added. When this option andspark.sql.nativeVieware bothtrue, Spark SQL tries to handleCREATE VIEWDDL statements using SQL query strings generated from view definition logical plans. If we failed to map the plan to SQL, we fallback to the original native view approach.One important issue this PR fixes is that, now we can use CTE when defining a view. Originally, when native view is turned on, we wrap the view definition text with an extra
SELECT. However, HiveQL parser doesn't allow CTE appearing as a subquery. Namely, something like this is disallowed:This PR fixes this issue because the extra
SELECTis no longer needed (also, CTE expressions are inlined as subqueries during analysis phase, thus there won't be CTE expressions in the generated SQL query string).