@MaxGekk MaxGekk commented Dec 11, 2020

What changes were proposed in this pull request?

Throw PartitionsAlreadyExistException from createPartitions() in the Hive external catalog when a partition already exists. Currently, HiveExternalCatalog.createPartitions() throws AlreadyExistsException wrapped in AnalysisException.

In this PR, I propose to catch AlreadyExistsException in HiveClientImpl and replace it with PartitionsAlreadyExistException.
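
For readers without the diff at hand, the catch-and-replace idea can be sketched roughly as below. This is not the code from the patch: both exception classes are simplified stand-ins defined locally (the real ones live in Hive and Spark, and their constructors may differ), and `metastoreCreate`, `CreatePartitionsSketch` and the `Spec` alias are hypothetical names used only for illustration.

```scala
// Simplified local stand-ins (assumptions): the real classes live in Hive and
// Spark, and their constructors are not reproduced here.
final class AlreadyExistsException(msg: String) extends Exception(msg)

final class PartitionsAlreadyExistException(
    val db: String,
    val table: String,
    val specs: Seq[Map[String, String]])
  extends Exception(s"Partitions already exist in $db.$table: ${specs.mkString(", ")}")

object CreatePartitionsSketch {
  // A partition spec maps partition column names to values.
  type Spec = Map[String, String]

  // Hypothetical stand-in for the metastore call: it fails with the low-level
  // exception if any requested partition is already present.
  def metastoreCreate(existing: Set[Spec], specs: Seq[Spec]): Set[Spec] = {
    specs.find(existing.contains).foreach { s =>
      throw new AlreadyExistsException(s"Partition already exists: $s")
    }
    existing ++ specs
  }

  // The idea described above: catch the low-level exception at the client
  // boundary and rethrow the catalog-level one.
  def createPartitions(
      db: String,
      table: String,
      existing: Set[Spec],
      specs: Seq[Spec]): Set[Spec] = {
    try {
      metastoreCreate(existing, specs)
    } catch {
      case _: AlreadyExistsException =>
        throw new PartitionsAlreadyExistException(db, table, specs)
    }
  }
}
```

With this shape, a caller only needs to match on PartitionsAlreadyExistException, whichever catalog is behind the call.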

Why are the changes needed?

The Hive external catalog deviates from the V1/V2 in-memory catalogs, which throw PartitionsAlreadyExistException. Throwing the same exception gives Spark SQL users consistent behaviour across catalogs.

Does this PR introduce any user-facing change?

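Yes. When a partition already exists, HiveExternalCatalog.createPartitions() throws PartitionsAlreadyExistException instead of AlreadyExistsException wrapped in AnalysisException.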

How was this patch tested?

By running existing test suites:

$ build/sbt -Phive -Phive-thriftserver "hive/test:testOnly org.apache.spark.sql.hive.client.VersionsSuite"
$ build/sbt -Phive -Phive-thriftserver "hive/test:testOnly org.apache.spark.sql.hive.execution.HiveDDLSuite"

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit fab2995)
Signed-off-by: Max Gekk <max.gekk@gmail.com>

…ernalCatalog.createPartitions()

Throw `PartitionsAlreadyExistException` from `createPartitions()` in Hive external catalog when a partition exists. Currently, `HiveExternalCatalog.createPartitions()` throws `AlreadyExistsException` wrapped by `AnalysisException`.

In the PR, I propose to catch `AlreadyExistsException` in `HiveClientImpl` and replace it by `PartitionsAlreadyExistException`.

The behaviour of Hive external catalog deviates from V1/V2 in-memory catalogs that throw `PartitionsAlreadyExistException`. To improve user experience with Spark SQL, it would be better to throw the same exception.

Yes

By running existing test suites:
```
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *AlterTableAddPartitionSuite"
```

Closes apache#30711 from MaxGekk/hive-partition-exception.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit fab2995)
Signed-off-by: Max Gekk <max.gekk@gmail.com>
@github-actions github-actions bot added the SQL label Dec 11, 2020

SparkQA commented Dec 11, 2020

Test build #132647 has finished for PR 30729 at commit b284ea3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 11, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37250/

MaxGekk commented Dec 11, 2020

jenkins, retest this, please

SparkQA commented Dec 11, 2020

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37250/

SparkQA commented Dec 11, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37259/

SparkQA commented Dec 11, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37259/

SparkQA commented Dec 11, 2020

Test build #132653 has finished for PR 30729 at commit b284ea3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thanks, @MaxGekk.
Merged to branch-3.1.

dongjoon-hyun pushed a commit that referenced this pull request Dec 11, 2020
…veExternalCatalog.createPartitions()

### What changes were proposed in this pull request?
Throw `PartitionsAlreadyExistException` from `createPartitions()` in Hive external catalog when a partition exists. Currently, `HiveExternalCatalog.createPartitions()` throws `AlreadyExistsException` wrapped by `AnalysisException`.

In the PR, I propose to catch `AlreadyExistsException` in `HiveClientImpl` and replace it by `PartitionsAlreadyExistException`.

### Why are the changes needed?
The behaviour of Hive external catalog deviates from V1/V2 in-memory catalogs that throw `PartitionsAlreadyExistException`. To improve user experience with Spark SQL, it would be better to throw the same exception.

### Does this PR introduce _any_ user-facing change?
Yes

### How was this patch tested?
By running existing test suites:
```
$ build/sbt -Phive -Phive-thriftserver "hive/test:testOnly org.apache.spark.sql.hive.client.VersionsSuite"
$ build/sbt -Phive -Phive-thriftserver "hive/test:testOnly org.apache.spark.sql.hive.execution.HiveDDLSuite"
```

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit fab2995)
Signed-off-by: Max Gekk <max.gekk@gmail.com>

Closes #30729 from MaxGekk/hive-partition-exception-3.1.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@MaxGekk MaxGekk deleted the hive-partition-exception-3.1 branch December 11, 2020 20:29