@liancheng
Contributor

This PR is a follow-up to PR #10541. It integrates the newly introduced SQL generation feature with native views to make native views canonical.

In this PR, a new SQL option, spark.sql.nativeView.canonical, is added. When this option and spark.sql.nativeView are both true, Spark SQL tries to handle CREATE VIEW DDL statements using SQL query strings generated from the logical plans of view definitions. If the plan cannot be mapped to SQL, we fall back to the original native view approach.
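
A minimal sketch of the fallback described above, written in Python rather than the PR's Scala so that it runs standalone. The function name `choose_view_sql`, the `generate_sql` callback, and the shape of the wrapped query are assumptions made for illustration, not Spark APIs; only the two option names come from this PR.

```python
from typing import Callable, Optional


def choose_view_sql(
    view_text: str,
    generate_sql: Callable[[], Optional[str]],
    native_view: bool,
    canonical: bool,
) -> Optional[str]:
    """Pick the SQL text used to define a view.

    native_view and canonical stand in for spark.sql.nativeView and
    spark.sql.nativeView.canonical. generate_sql is a hypothetical callback
    that returns None when the logical plan cannot be mapped to SQL.
    """
    if not native_view:
        # Outside the scope of this sketch: the statement is not handled natively.
        return None
    if canonical:
        generated = generate_sql()
        if generated is not None:
            # Canonical path: use the SQL generated from the logical plan.
            return generated
    # Fallback: the original native view approach wraps the definition text
    # with an extra SELECT (the exact wrapper shape here is an assumption).
    return f"SELECT * FROM ({view_text}) v"
```

The canonical path is only attempted when both flags are true; a failed SQL generation takes the same branch as canonical being turned off.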

One important issue this PR fixes is that CTEs can now be used when defining a view. Originally, when native view was turned on, we wrapped the view definition text with an extra SELECT. However, the HiveQL parser doesn't allow a CTE to appear in a subquery. Namely, something like this is disallowed:

SELECT n
FROM (
  WITH w AS (SELECT 1 AS n)
  SELECT * FROM w
) v

This PR fixes this issue because the extra SELECT is no longer needed. In addition, CTE expressions are inlined as subqueries during the analysis phase, so the generated SQL query string contains no CTE expressions.
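
To illustrate the inlining mentioned above: a CTE query and its inlined-subquery form return the same rows. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in engine (an assumption; it is neither Spark nor Hive), and the inlined query is written by hand rather than produced by Spark's SQL generator.

```python
import sqlite3

# The view definition written with a CTE.
CTE_QUERY = "WITH w AS (SELECT 1 AS n) SELECT n FROM w"

# The same query with the CTE inlined as a subquery (hand-written for illustration).
INLINED_QUERY = "SELECT n FROM (SELECT 1 AS n) AS w"


def run(query):
    """Run a query against a throwaway in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()


cte_rows = run(CTE_QUERY)
inlined_rows = run(INLINED_QUERY)
print(cte_rows == inlined_rows)
```

Because the inlined form contains no WITH clause, it can be nested or stored as view text without hitting the parser restriction described above.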

@liancheng
Contributor Author

cc @cloud-fan @yhuai

@SparkQA

SparkQA commented Jan 13, 2016

Test build #49287 has finished for PR 10733 at commit 70c19ce.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

no s.

Contributor

also explain the fallback behaviour?

@cloud-fan
Contributor

Do we have a test that fails with the current native view but works after your PR?

@liancheng
Contributor Author

@cloud-fan "CTE within view" is such a test case.

@SparkQA

SparkQA commented Jan 13, 2016

Test build #49320 has finished for PR 10733 at commit d042c3f.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng force-pushed the spark-12728.integrate-sql-gen-with-native-view branch from d042c3f to 0d9d55b on January 13, 2016 18:20
@liancheng
Contributor Author

retest this please

@SparkQA

SparkQA commented Jan 13, 2016

Test build #49323 has finished for PR 10733 at commit 0d9d55b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

LGTM

@yhuai
Contributor

yhuai commented Jan 13, 2016

Thanks @liancheng. I'd like to also review it.

Contributor

Should we enable it by default? So, we can get it tested more.

Contributor

Also, should we enable nativeView as well?

Contributor Author

I'm for enabling canonical native view by default, but I'm not sure whether native view itself should be enabled by default. After all, native view is still feature-incomplete, whether canonicalized or not.

@SparkQA

SparkQA commented Jan 18, 2016

Test build #49567 has finished for PR 10733 at commit c9e9c1b.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@liancheng force-pushed the spark-12728.integrate-sql-gen-with-native-view branch from c9e9c1b to 3c50fd6 on January 18, 2016 09:39
@SparkQA

SparkQA commented Jan 18, 2016

Test build #49589 has finished for PR 10733 at commit 3c50fd6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Jan 26, 2016

test this please

@SparkQA

SparkQA commented Jan 26, 2016

Test build #50108 has finished for PR 10733 at commit 3c50fd6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

Is it safe to call zip? We need to check the number of fields, right?

Contributor

Let's also have a test for this.

Contributor Author

It's an invariant of CreateViewAsSelect that these two have the same size. Users can't construct a case that violates this condition, so adding a test for it might not be necessary. I'm in favor of adding a check here, though.
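
For context on why the question about zip matters: zip stops at the shorter input, so a length mismatch would silently drop columns instead of failing. A minimal Python sketch of a guarded pairing follows; the helper `pair_columns` and its argument names are hypothetical, and the PR's actual code is Scala.

```python
from typing import List, Sequence, Tuple


def pair_columns(child: Sequence[str], view: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each child output column with the name it gets in the view.

    An empty view column list means the child names are kept as they are.
    """
    if not view:
        return [(name, name) for name in child]
    if len(view) != len(child):
        # Without this check, zip would silently truncate to the shorter list.
        raise ValueError(
            f"expected {len(child)} view columns, got {len(view)}"
        )
    return list(zip(child, view))


# Unchecked zip drops the unmatched column "c" without any error.
truncated = list(zip(["a", "b", "c"], ["x", "y"]))
```

The guard accepts either an empty column list or one of matching length, and rejects anything else up front.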

@yhuai
Contributor

yhuai commented Jan 26, 2016

Overall looks good. Left two comments.

Contributor

Drops view vewName...?

Contributor Author

Oh, nice catch, thanks!

@SparkQA

SparkQA commented Jan 26, 2016

Test build #50125 has finished for PR 10733 at commit 51b9db2.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 26, 2016

Test build #50128 has finished for PR 10733 at commit 737a7d0.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@liancheng
Contributor Author

retest this please

Contributor

Sorry. I missed this. How about we also have a check for this?

Contributor

I just realized that we do the check at https://github.com/apache/spark/pull/10733/files#diff-074b1d8480e0d0d7c212bc4461f3d4acR43 (assert(tableDesc.schema == Nil || tableDesc.schema.length == childSchema.length))

@SparkQA

SparkQA commented Jan 27, 2016

Test build #50140 has finished for PR 10733 at commit 737a7d0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Contributor

sorry.... Actually we do not need this check. We have already done it at https://github.com/apache/spark/pull/10733/files#diff-074b1d8480e0d0d7c212bc4461f3d4acR43....

@SparkQA

SparkQA commented Jan 27, 2016

Test build #50162 has finished for PR 10733 at commit 0ce28d9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Contributor

yhuai commented Jan 27, 2016

LGTM. Merging to master.
