Skip to content

Conversation

@MaxGekk
Copy link
Member

@MaxGekk MaxGekk commented Oct 8, 2022

What changes were proposed in this pull request?

In the PR, I propose to remove PartitionAlreadyExistsException and use PartitionsAlreadyExistException instead of it.

Why are the changes needed?

  1. To simplify user apps. After the changes, users don't need to catch both exceptions PartitionsAlreadyExistException as well as PartitionAlreadyExistsException .
  2. To improve code maintenance since don't need to support almost the same code.
  3. To avoid errors like the PR [SPARK-40702][SQL] Fix partition specs in PartitionsAlreadyExistException #38152 fixed PartitionsAlreadyExistException but not PartitionAlreadyExistsException.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

By running the affected test suites:

$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *SupportsPartitionManagementSuite"
$ build/sbt -Phive-2.3 -Phive-thriftserver "test:testOnly *.AlterTableAddPartitionSuite"

@github-actions github-actions bot added the SQL label Oct 8, 2022
@github-actions github-actions bot added the DOCS label Oct 9, 2022
@MaxGekk MaxGekk changed the title [WIP][SQL] Remove PartitionAlreadyExistsException [SPARK-40714][SQL] Remove PartitionAlreadyExistsException Oct 9, 2022
@MaxGekk MaxGekk marked this pull request as ready for review October 9, 2022 07:52
Copy link
Contributor

@LuciferYang LuciferYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1,LGTM

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although these interfaces are Experimental, this looks like a big change on SupportsAtomicPartitionManagement and SupportsPartitionManagement.

InternalRow ident,
Map<String, String> properties)
throws PartitionAlreadyExistsException, UnsupportedOperationException {
throws PartitionsAlreadyExistException, UnsupportedOperationException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does Spark try-catch the errors thrown by these 2 APIs? If yes then we may have bugs as the existing data source implementations may still throw PartitionAlreadyExistsException

Copy link
Member Author

@MaxGekk MaxGekk Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the current implementation, Spark invokes createPartition()/createPartitions() from AddPartitionExec (ALTER TABLE .. ADD PARTITION) where all exceptions from the methods are propagated to users directly. But it doesn't matter in our case because AddPartitionExec.run() checks partitions exist BEFORE the invokes, see:

val (existsParts, notExistsParts) =
partSpecs.partition(p => table.partitionExists(p.ident))
if (existsParts.nonEmpty && !ignoreIfExists) {
throw new PartitionsAlreadyExistException(
table.name(), existsParts.map(_.ident), table.partitionSchema())
}

So, this PR doesn't affect ALTER TABLE .. ADD PARTITION but it affects v2 ALTER TABLE .. RENAME PARTITION which propagates the exception directly to users.

Copy link
Contributor

@cloud-fan cloud-fan Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in this case it's OK to change the type of the exception that is propagated to users.

@MaxGekk
Copy link
Member Author

MaxGekk commented Oct 10, 2022

this looks like a big change on SupportsAtomicPartitionManagement and SupportsPartitionManagement.

I think it is right time to change the APIs since they are not broadly used so far. I guess, it will be difficult to predict which of PartitionAlreadyExistsException and PartitionsAlreadyExistException to expect from particular Spark DDL, and user apps will be written in the way to handle both exceptions. I think we should avoid this from the beginning, and don't introduce unnecessary complexity in the user code.

@MaxGekk
Copy link
Member Author

MaxGekk commented Oct 10, 2022

Merging to master. Thank you, @LuciferYang @cloud-fan @dongjoon-hyun for review.

@MaxGekk MaxGekk closed this in 288bdd2 Oct 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants