@gatorsmile
Member

What changes were proposed in this pull request?

In CREATE TABLE AS SELECT, if the SELECT query fails, the table should not exist afterwards. For example,

CREATE TABLE tab
STORED AS TEXTFILE
SELECT 1 AS a, (SELECT a FROM (SELECT 1 AS a UNION ALL SELECT 2 AS a) t) AS b

The above query fails as expected (the scalar subquery returns more than one row), but an empty table tab is still created.

This PR drops the created table when any non-fatal exception is hit.
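
The cleanup described above can be sketched roughly as follows. This is a minimal illustration, not the code in this patch: `ToyCatalog` and `CtasSketch.createTableAsSelect` are hypothetical stand-ins, and the real command works against the session catalog rather than an in-memory set.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

// Hypothetical in-memory stand-in for a session catalog; not Spark's API.
final class ToyCatalog {
  private val tables = mutable.Set.empty[String]
  def createTable(name: String): Unit = tables += name
  def dropTable(name: String): Unit = tables -= name
  def tableExists(name: String): Boolean = tables.contains(name)
}

object CtasSketch {
  // Creates the table first, then runs the insert step.
  // If the insert throws a non-fatal exception, the table is dropped
  // and the original exception is rethrown to the caller.
  def createTableAsSelect(catalog: ToyCatalog, name: String)(insert: () => Unit): Unit = {
    catalog.createTable(name)
    try {
      insert()
    } catch {
      case NonFatal(e) =>
        catalog.dropTable(name)
        throw e
    }
  }
}
```

`NonFatal` deliberately does not match errors such as `OutOfMemoryError`, so in this sketch the cleanup only runs for failures the process can reasonably recover from.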

How was this patch tested?

Added a test case to verify the behavior

@hvanhovell
Contributor

Looks pretty good. Are there any other places where this can happen?

@gatorsmile
Member Author

Thanks! @hvanhovell

After going over the other functions that call sparkSession.sessionState.catalog.createTable, this is the only one that has this issue.

@gatorsmile
Member Author

BTW, @rxin @yhuai @marmbrus I am still investigating the potential issues for Hive As A Data Source. Thanks!

@SparkQA

SparkQA commented Jun 27, 2016

Test build #61312 has finished for PR 13926 at commit c0f08a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

CreateDataSourceTableAsSelectCommand and CreateHiveTableAsSelectCommand should be combined in the next release. metastoreRelation is just a BaseRelation.

@gatorsmile
Member Author

ping @hvanhovell Could you please take a look at this again? : )

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Jun 30, 2016

Test build #61527 has finished for PR 13926 at commit c0f08a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

retest this please

@SparkQA

SparkQA commented Jul 4, 2016

Test build #61732 has finished for PR 13926 at commit c0f08a5.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member Author

@hvanhovell @cloud-fan @liancheng Any comments on this PR?

@cloud-fan
Contributor

Does CreateDataSourceTableAsSelectCommand have this problem?

@gatorsmile
Member Author

@cloud-fan CreateDataSourceTableAsSelectCommand creates a DataFrame first, and then creates the data source table. The order is the reverse of CreateHiveTableAsSelectCommand. If we hit any issue when creating the DataFrame, we stop before trying to create the data source table. Thus, it should be fine based on my understanding.
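
To make the ordering difference concrete, here is a rough sketch under simplifying assumptions. `OrderingSketch`, `queryThenCreate`, and `createThenQuery` are hypothetical names, the catalog is just a map from table name to rows, and neither method reflects the actual command implementations.

```scala
import scala.collection.mutable

object OrderingSketch {
  // Hypothetical catalog: table name -> rows written so far.
  val tables = mutable.Map.empty[String, Seq[Int]]

  // Data-source style: evaluate the query first, register the table afterwards.
  // A failure in the query leaves the catalog untouched.
  def queryThenCreate(name: String)(query: () => Seq[Int]): Unit = {
    val rows = query()
    tables(name) = rows
  }

  // Hive style: register an empty table first, then evaluate the query.
  // A failure in the query leaves the empty table behind unless it is dropped.
  def createThenQuery(name: String)(query: () => Seq[Int]): Unit = {
    tables(name) = Seq.empty
    tables(name) = query()
  }
}
```

With a query that throws, `queryThenCreate` registers nothing, while `createThenQuery` leaves an empty entry in `tables`, which mirrors the leftover table this PR cleans up.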

@cloud-fan
Contributor

Is it possible to use the same order in CreateHiveTableAsSelectCommand as well?

@liancheng
Contributor

LGTM

@liancheng
Contributor

@cloud-fan Probably not? CreateHiveTableAsSelectCommand uses InsertIntoTable, which is translated into InsertIntoHiveTable, which requires a MetastoreRelation.

@asfgit asfgit closed this in 21eadd1 Jul 6, 2016
asfgit pushed a commit that referenced this pull request Jul 6, 2016
#### What changes were proposed in this pull request?
In `CREATE TABLE AS SELECT`, if the `SELECT` query fails, the table should not exist afterwards. For example,

```SQL
CREATE TABLE tab
STORED AS TEXTFILE
SELECT 1 AS a, (SELECT a FROM (SELECT 1 AS a UNION ALL SELECT 2 AS a) t) AS b
```
The above query fails as expected (the scalar subquery returns more than one row), but an empty table `tab` is still created.

This PR drops the created table when any non-fatal exception is hit.

#### How was this patch tested?
Added a test case to verify the behavior

Author: gatorsmile <gatorsmile@gmail.com>

Closes #13926 from gatorsmile/dropTableAfterException.

(cherry picked from commit 21eadd1)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@cloud-fan
Contributor

Thanks, merging to master and 2.0!

@gatorsmile
Member Author

Thanks! @cloud-fan @liancheng @hvanhovell
