Contributor

@jiangxb1987 jiangxb1987 commented Mar 1, 2017

What changes were proposed in this pull request?

Currently we don't explicitly forbid the following operations, so they fail with misleading errors:

  1. The statement CREATE VIEW AS INSERT INTO throws the following exception:
scala> spark.sql("CREATE VIEW testView AS INSERT INTO tab VALUES (1, \"a\")")
org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: at least one column must be specified for the table;
scala> spark.sql("CREATE VIEW testView(a, b) AS INSERT INTO tab VALUES (1, \"a\")")
org.apache.spark.sql.AnalysisException: The number of columns produced by the SELECT clause (num: `0`) does not match the number of column names specified by CREATE VIEW (num: `2`).;
  2. The statement INSERT INTO view VALUES throws the following exception from checkAnalysis:
scala> spark.sql("INSERT INTO testView VALUES (1, \"a\")")
org.apache.spark.sql.AnalysisException: Inserting into an RDD-based table is not allowed.;;
'InsertIntoTable View (`default`.`testView`, [a#16,b#17]), false, false
+- LocalRelation [col1#14, col2#15]

After this PR, the behavior changes to:

scala> spark.sql("CREATE VIEW testView AS INSERT INTO tab VALUES (1, \"a\")")
org.apache.spark.sql.catalyst.parser.ParseException: Operation not allowed: CREATE VIEW ... AS INSERT INTO;

scala> spark.sql("CREATE VIEW testView(a, b) AS INSERT INTO tab VALUES (1, \"a\")")
org.apache.spark.sql.catalyst.parser.ParseException: Operation not allowed: CREATE VIEW ... AS INSERT INTO;

scala> spark.sql("INSERT INTO testView VALUES (1, \"a\")")
org.apache.spark.sql.AnalysisException: `default`.`testView` is a view, inserting into a view is not allowed;

How was this patch tested?

Add a new test case in SparkSqlParserSuite;
Update the corresponding test case in SQLViewSuite.

SparkQA commented Mar 1, 2017

Test build #73696 has finished for PR 17125 at commit ae28e4b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987
Contributor Author

cc @gatorsmile @cloud-fan Please have a look at this when you have time, thanks!

// Inserting into a view is not allowed, we should throw an AnalysisException.
newTable match {
  case v: View =>
    u.failAnalysis(s"${v.desc.identifier} is a view, inserting into a view is not allowed")
Member

Can we move this to PreprocessTableInsertion?


// CREATE VIEW AS INSERT INTO ... is not allowed, we should throw an AnalysisException.
analyzedPlan match {
  case i: InsertIntoHadoopFsRelationCommand =>
Member

_: InsertIntoHadoopFsRelationCommand

Contributor

Shall we forbid all commands? For example, CREATE VIEW xxx AS CREATE TABLE ... should also be disallowed, right?

Contributor Author

The SQL parser only allows CREATE VIEW AS query here, and a query can only be a SELECT ..., an INSERT INTO ..., or a CTE, so perhaps we don't have to consider other commands here.

Contributor

Hmm, why is INSERT INTO ... a query?

Contributor Author

queryNoWith
    : insertInto? queryTerm queryOrganization                                              #singleInsertQuery
    | fromClause multiInsertQueryBody+                                                     #multiInsertQuery
    ;

Seems we have mixed them together.

Contributor

can we fix it at parser side? cc @hvanhovell

Contributor

Hmmm... fixing this in the grammar itself would be a little bit of work (we also support multi-insert, and that makes this harder). I suppose we could try to add a check in the AstBuilder.

Why not check this in the CreateViewCommand? That seems safer to me.

analyzedPlan match {
  case i: InsertIntoHadoopFsRelationCommand =>
    throw new AnalysisException("Creating a view as insert into a table is not allowed")
  case i: InsertIntoDataSourceCommand =>
Member

The same here

// CREATE VIEW AS INSERT INTO ... is not allowed, we should throw an AnalysisException.
analyzedPlan match {
  case i: InsertIntoHadoopFsRelationCommand =>
    throw new AnalysisException("Creating a view as insert into a table is not allowed")
Member

It will be nice to put a view name in the error message.

- i.copy(table = EliminateSubqueryAliases(lookupTableFromCatalog(u)))
+ val newTable = EliminateSubqueryAliases(lookupTableFromCatalog(u))
+ // Inserting into a view is not allowed, we should throw an AnalysisException.
+ newTable match {
Contributor

Nit: in this case I would just write the following:

if (newTable.isInstanceOf[View]) {
  u.failAnalysis(s"${newTable.asInstanceOf[View].desc.identifier} is a view, inserting into a view is not allowed")
}

val analyzedPlan = qe.analyzed

// CREATE VIEW AS INSERT INTO ... is not allowed, we should throw an AnalysisException.
analyzedPlan match {
Contributor

This pattern match does not work with multi-inserts (hive feature). Those are represented using a Union of inserts.
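
To make the concern concrete, here is a minimal sketch using a toy plan tree. `Plan`, `Relation`, `Insert`, `Union` and `InsertDetection` are hypothetical stand-ins for illustration, not Spark's Catalyst classes. It shows why a match on the root node alone misses a multi-insert that is wrapped in a union, while a recursive walk finds it.

```scala
// Toy logical plan; these types are illustrative assumptions, not Catalyst's.
sealed trait Plan { def children: Seq[Plan] }
final case class Relation(name: String) extends Plan { val children: Seq[Plan] = Nil }
final case class Insert(table: String, query: Plan) extends Plan {
  val children: Seq[Plan] = Seq(query)
}
final case class Union(children: Seq[Plan]) extends Plan

object InsertDetection {
  // Mirrors a match on the root node only: a Union of inserts slips through.
  def topLevelOnly(p: Plan): Boolean = p match {
    case _: Insert => true
    case _ => false
  }

  // Walks the whole tree, so inserts nested under a Union are also found.
  def anywhere(p: Plan): Boolean = p match {
    case _: Insert => true
    case other => other.children.exists(anywhere)
  }
}
```

With a plan such as `Union(Seq(Insert("tbl1", Relation("jt")), Insert("tbl2", Relation("jt"))))`, `topLevelOnly` reports no insert while `anywhere` does.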

- i.copy(table = EliminateSubqueryAliases(lookupTableFromCatalog(u)))
+ val newTable = EliminateSubqueryAliases(lookupTableFromCatalog(u))
// Inserting into a view is not allowed, we should throw an AnalysisException.
if (newTable.isInstanceOf[View]) {
Contributor Author

The rule ResolveRelations executes before PreprocessTableInsertion, so we can fail early here.

Member

We prefer doing all the error handling in the same rule. It helps us find holes and maintain the code. cc @cloud-fan

Contributor Author

@jiangxb1987 jiangxb1987 Mar 3, 2017

I'm neutral on which rule this code block should be placed in, but I feel this is not a typical "Preprocess" step, since the query fails and processing ends. @cloud-fan @hvanhovell any suggestions?

Contributor

Just move it to preprocess insert.

Contributor Author

Discussed with @cloud-fan. The concern is that if the child of the view node is invalid (e.g. there is a cyclic view reference, or the max reference depth is exceeded), we should still report that INSERT INTO VIEW is not allowed rather than some other error. So we should not resolve the child of the view; instead we throw an exception here.
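
The ordering argument can be sketched with a toy model. Everything here (`Target`, `TableRef`, `ViewRef`, `InsertTargetCheck`, and the `childValid` flag that stands in for a cyclic reference or excessive nesting depth) is a hypothetical illustration, not Spark code. It only shows which error surfaces depending on whether the view check runs before or after the view's child is resolved.

```scala
// Toy insert targets; names and fields are assumptions for illustration only.
sealed trait Target
final case class TableRef(name: String) extends Target
// childValid = false stands in for a cyclic reference or too-deep nesting.
final case class ViewRef(name: String, childValid: Boolean) extends Target

object InsertTargetCheck {
  // Hypothetical resolver: fails when the view definition is invalid.
  def resolveChild(v: ViewRef): Either[String, ViewRef] =
    if (v.childValid) Right(v) else Left(s"View ${v.name} has an invalid definition")

  // Check first: the user always sees the insert-into-view error.
  def checkThenResolve(t: Target): Either[String, Target] = t match {
    case v: ViewRef => Left(s"${v.name} is a view, inserting into a view is not allowed")
    case other => Right(other)
  }

  // Resolve first: an invalid view definition masks the more relevant error.
  def resolveThenCheck(t: Target): Either[String, Target] = t match {
    case v: ViewRef => resolveChild(v).flatMap(r => checkThenResolve(r))
    case other => Right(other)
  }
}
```

For `ViewRef("testView", childValid = false)`, `checkThenResolve` yields the insert-into-view message, whereas `resolveThenCheck` yields the unrelated invalid-definition message.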

SparkQA commented Mar 2, 2017

Test build #73780 has finished for PR 17125 at commit 792cca9.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 3, 2017

Test build #73857 has finished for PR 17125 at commit 57b64ad.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

withView("testView") {
  // Single insert query
  val e1 = intercept[ParseException] {
    sql(s"CREATE VIEW testView AS INSERT INTO jt VALUES(1, 1)")
Member

Nit: remove string interpolator.

// Multi insert query
val e2 = intercept[ParseException] {
  sql(s"CREATE VIEW testView AS FROM jt INSERT INTO tbl1 SELECT * WHERE jt.id < 5 " +
    s"INSERT INTO tbl2 SELECT * WHERE jt.id > 4")
Member

Nit: remove the above two string interpolators

}
}

test("create view as insert into table") {
Member

Move it to SparkSqlParserSuite?

If we do it here, it will be tested twice.

} else {
  // CREATE VIEW ... AS INSERT INTO is not allowed.
  val query = ctx.query.queryNoWith
  query match {
Member

Nit: ctx.query.queryNoWith match {

query match {
  case s: SingleInsertQueryContext if s.insertInto != null =>
    operationNotAllowed("CREATE VIEW ... AS INSERT INTO", ctx)
  case m: MultiInsertQueryContext =>
Member

@gatorsmile gatorsmile Mar 3, 2017

Nit: case _: MultiInsertQueryContext =>

} else {
  // CREATE VIEW ... AS INSERT INTO is not allowed.
  ctx.query.queryNoWith match {
    case s: SingleInsertQueryContext if s.insertInto != null =>
Contributor

When will s.insertInto be null?

Contributor Author

For example, CREATE VIEW v AS SELECT * FROM jt.

Contributor

Hmm, so a plain SELECT query is a SingleInsertQueryContext?

Contributor Author

Yeah... You can see that in:

queryNoWith
    : insertInto? queryTerm queryOrganization                                              #singleInsertQuery
    | fromClause multiInsertQueryBody+                                                     #multiInsertQuery
    ;
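
As a rough sketch of how that grammar shape drives the parser-side check, the rule can be modelled with toy types. `QueryNoWith`, `SingleInsertQuery`, `MultiInsertQuery` and `ViewBodyCheck` are hypothetical stand-ins for the ANTLR-generated contexts, and `Option` replaces the nullable `insertInto` field. A plain SELECT is a single-insert query with no `insertInto`, which is why the guard on `insertInto` is needed; the error text reuses the message quoted in the PR description for both cases, which is an assumption for the multi-insert branch.

```scala
// Toy model of the queryNoWith rule; not the generated SqlBaseParser classes.
sealed trait QueryNoWith
// insertInto is optional, matching `insertInto? queryTerm queryOrganization`.
final case class SingleInsertQuery(insertInto: Option[String], queryTerm: String)
  extends QueryNoWith
// Matches `fromClause multiInsertQueryBody+`.
final case class MultiInsertQuery(from: String, bodies: List[String]) extends QueryNoWith

object ViewBodyCheck {
  private val NotAllowed = "Operation not allowed: CREATE VIEW ... AS INSERT INTO"

  // Left(error) when the view body contains an insert, Right(()) otherwise.
  def check(q: QueryNoWith): Either[String, Unit] = q match {
    case SingleInsertQuery(Some(_), _) => Left(NotAllowed)
    case _: MultiInsertQuery => Left(NotAllowed)
    case _ => Right(())
  }
}
```

Under this model `SingleInsertQuery(None, "SELECT * FROM jt")` passes the check, while the same query term with `Some("jt")` as `insertInto`, or any `MultiInsertQuery`, is rejected.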

  def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
    case i @ InsertIntoTable(u: UnresolvedRelation, parts, child, _, _) if child.resolved =>
-     i.copy(table = EliminateSubqueryAliases(lookupTableFromCatalog(u)))
+     i.copy(table = resolveRelation(EliminateSubqueryAliases(lookupTableFromCatalog(u))))
Contributor

why this change?

Contributor Author

@jiangxb1987 jiangxb1987 Mar 4, 2017

When we try to insert into a view, the logical plan that lookupTableFromCatalog() returns is not resolved, so we have to perform resolveRelation() over the node.

Contributor

so we will resolve something to View by doing this?

Contributor Author

I think this resolves the child of the view.

Contributor

Why do we need to resolve the child of the view? Once we see the pattern Insert(View, ...) we throw an exception; we don't care whether the child of the view is resolved or not.

@cloud-fan
Contributor

LGTM except 2 questions

SparkQA commented Mar 4, 2017

Test build #73894 has finished for PR 17125 at commit 68cee40.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 6, 2017

Test build #73958 has finished for PR 17125 at commit 9af2d7e.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

// Inserting into a view is not allowed, we should throw an AnalysisException.
if (newTable.isInstanceOf[View]) {
  u.failAnalysis(s"${newTable.asInstanceOf[View].desc.identifier} is a view, inserting " +
    s"into a view is not allowed")
Member

Nit: s"into a view is not allowed" -> "into a view is not allowed"

  u.failAnalysis(s"${newTable.asInstanceOf[View].desc.identifier} is a view, inserting " +
    s"into a view is not allowed")
}
i.copy(table = newTable)
Member

How about?

        lookupTableFromCatalog(u).canonicalized match {
          case v: View =>
            u.failAnalysis(s"Inserting into a view is not allowed. View: ${v.desc.identifier}.")
          case other => i.copy(table = other)
        }

SparkQA commented Mar 6, 2017

Test build #73978 has started for PR 17125 at commit 8d4be05.

@cloud-fan
Contributor

LGTM, pending tests

@cloud-fan
Contributor

retest this please

SparkQA commented Mar 6, 2017

Test build #73981 has finished for PR 17125 at commit 8d4be05.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

LGTM too.

@gatorsmile
Member

Thanks! Merging to master.

@asfgit asfgit closed this in 9991c2d Mar 6, 2017
@jiangxb1987 jiangxb1987 deleted the insert-with-view branch March 7, 2017 02:05