[SPARK-30568][SQL] Invalidate interval type as a field table schema #27277

yaooqinn · 2020-01-19T07:51:34Z

What changes were proposed in this pull request?

After commit d67b98e, we are able to create or alter tables with interval column types whenever the external catalog accepts them, which goes beyond the interval type's intended purpose of internal usage. Per d67b98e's original purpose, the interval keyword should only work in cast logic.

Instead of adding a type checker for the interval type to each command so that it works across catalogs, it is much simpler to treat interval as an invalid data type that can be identified by cast only.

Why are the changes needed?

To keep the interval type restricted to its intended internal usage.

Does this PR introduce any user-facing change?

No.
Additionally, this PR restores the previous user-facing behavior when the interval type is used in a create/alter table schema, e.g. for the Hive catalog.
For 2.4:

Caused by: org.apache.spark.sql.catalyst.parser.ParseException:
DataType calendarinterval is not supported.(line 1, pos 0)

For master after d67b98e:

Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: Error: type expected at the position 0 of 'interval' but 'interval' is found.
  at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:862)

Now, with this PR, we restore the type checker on the Spark side.

How was this patch tested?

Added more unit tests.

SparkQA · 2020-01-19T08:05:02Z

Test build #116994 has finished for PR 27277 at commit b0c3169.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-01-19T12:03:53Z

Test build #116995 has finished for PR 27277 at commit 46a96a3.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

cloud-fan · 2020-01-19T14:24:10Z

This does reduce the scope of d67b98e to cast only, but we need more than just "keep the behavior the same as 2.4": we need to hide the interval type from external data sources/catalogs.

In 2.4, CREATE TABLE can still have an interval type column even though the parser doesn't allow it. This is because we have a Java API, spark.catalog.createTable. Fortunately, this is OK in 2.4, as the Hive catalog doesn't allow it and the CREATE TABLE command fails at the end.

In 3.0, the catalog becomes an API, and it's possible that we leak interval type columns to external catalog implementations.

I think it's OK to allow the interval type in the parser, as we already allow it in spark.catalog.createTable. But it's important to disallow creating a table with an interval type column, as in 2.4. We need to add a check in the analyzer.
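To make the suggested check concrete, here is a minimal sketch of a recursive schema walk that rejects interval columns, including ones nested inside arrays, maps, and structs. It is an illustration under stated assumptions and not the code merged in this PR: the object name `IntervalSchemaCheck`, the helper names, and the error message are hypothetical, and it throws `IllegalArgumentException` rather than Spark's analysis error type. Only the public `org.apache.spark.sql.types` classes are used.

```scala
import org.apache.spark.sql.types._

// Hypothetical helper for illustration only; not the checker added by this PR.
object IntervalSchemaCheck {

  // Returns true if the data type is an interval or contains one at any nesting level.
  def containsInterval(dataType: DataType): Boolean = dataType match {
    case CalendarIntervalType => true
    case ArrayType(elementType, _) => containsInterval(elementType)
    case MapType(keyType, valueType, _) =>
      containsInterval(keyType) || containsInterval(valueType)
    case struct: StructType =>
      struct.fields.exists(field => containsInterval(field.dataType))
    case _ => false
  }

  // Fails fast before a schema would be handed to an external catalog.
  def assertNoInterval(schema: StructType): Unit = {
    if (containsInterval(schema)) {
      // The message text is an assumption, not the one Spark emits.
      throw new IllegalArgumentException(
        s"Interval type is not allowed in a table schema: ${schema.simpleString}")
    }
  }
}
```

For example, `IntervalSchemaCheck.assertNoInterval(new StructType().add("id", IntegerType).add("d", ArrayType(CalendarIntervalType)))` would throw, while a schema containing only `id` would pass. Running such a check during analysis, rather than in the parser, covers both SQL statements and schemas passed programmatically.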

cloud-fan · 2020-01-20T07:44:32Z

sql/core/src/test/scala/org/apache/spark/sql/connector/DataSourceV2SQLSuite.scala

    checkAnswer(spark.internalCreateDataFrame(rdd, table.schema), Seq.empty)
  }

+  test("CreateTable: invalid schema if has interval type") {


Can we also test CTAS and REPLACE TABLE?

SparkQA · 2020-01-20T08:05:01Z

Test build #117089 has finished for PR 27277 at commit 2c29977.

This patch fails due to an unknown error code, -9.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-01-20T11:27:16Z

Test build #117102 has finished for PR 27277 at commit 4fc3f4b.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-01-20T15:59:19Z

Test build #117115 has finished for PR 27277 at commit 9e5b209.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-01-20T17:49:14Z

Test build #117119 has finished for PR 27277 at commit a796d68.

This patch passes all tests.
This patch does not merge cleanly.
This patch adds no public classes.

SparkQA · 2020-01-20T19:58:15Z

Test build #117126 has finished for PR 27277 at commit 18163ae.

This patch passes all tests.
This patch merges cleanly.
This patch adds the following public classes (experimental):
case class UnresolvedTableWithViewExists(view: ResolvedView) extends LeafNode
case class ResolvedView(identifier: Identifier, isTempView: Boolean) extends LeafNode
abstract class AlterTable extends Command
case class AlterTableAddColumns(
case class AlterTableAlterColumn(
case class AlterTableRenameColumn(
case class AlterTableDropColumns(
case class AlterTableSetProperties(
case class AlterTableUnsetProperties(
case class AlterTableSetLocation(

cloud-fan · 2020-01-21T03:14:41Z

thanks, merging to master!

yaooqinn added 2 commits January 19, 2020 15:40

[SPARK-30568][SQL] Invalidate interval type as a field table schema

b0c3169

nit

46a96a3

yaooqinn requested review from cloud-fan and wangyum and removed request for cloud-fan January 19, 2020 07:53

yaooqinn mentioned this pull request Jan 19, 2020

[SPARK-28435][SQL] Support accepting the interval keyword in the schema string #25189

Closed

yaooqinn added 5 commits January 20, 2020 10:51

Revert "nit"

c93611d

This reverts commit 46a96a3.

Revert "[SPARK-30568][SQL] Invalidate interval type as a field table …

3a0e784

…schema" This reverts commit b0c3169.

Merge branch 'master' into SPARK-30568

430c493

check analysis

64952ea

style

2c29977

cloud-fan reviewed Jan 20, 2020

add tests/ revert v1 checker

4fc3f4b

cloud-fan approved these changes Jan 20, 2020

yaooqinn added 3 commits January 20, 2020 19:40

fix test

9e5b209

checker for v2 writer

a796d68

Merge branch 'master' into SPARK-30568

18163ae

cloud-fan closed this in 0388b7a Jan 21, 2020

dongjoon-hyun added the SQL label Feb 5, 2020
