@viirya
Member

@viirya viirya commented Dec 4, 2019

What changes were proposed in this pull request?

This patch allows INSERT OVERWRITE into the same table that is being read from when dynamic partition overwrite is used.

Why are the changes needed?

Currently, INSERT OVERWRITE cannot write to the same table it reads from, even when dynamic partition overwrite is used. With dynamic partition overwrite, however, we do not delete partition directories ahead of time. We write to staging directories and then move the data to the final partition directories. We should therefore be able to INSERT OVERWRITE into the same table under dynamic partition overwrite.

This lets users read data from a table and INSERT OVERWRITE into the same table by using dynamic partition overwrite. Because this is not allowed today, users have to write to a temporary location and then move the data back into the table.
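To illustrate why staging makes this safe, here is a toy sketch in plain Scala (no Spark involved). It is not Spark's commit protocol: the object `StagedOverwriteSketch`, its helpers, the `.staging` directory name, and the one-integer-per-line file format are all hypothetical, chosen only to show that the source data stays intact until the staged result is moved into place.

```scala
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path, StandardCopyOption}

// Toy model only: one text file per partition, one integer per line.
object StagedOverwriteSketch {
  // Read all integers from a partition file.
  def readRows(file: Path): Seq[Int] =
    new String(Files.readAllBytes(file), UTF_8)
      .split("\n").filter(_.nonEmpty).map(_.toInt).toSeq

  // Write integers to a file, creating parent directories if needed.
  def writeRows(file: Path, rows: Seq[Int]): Unit = {
    Files.createDirectories(file.getParent)
    Files.write(file, rows.mkString("\n").getBytes(UTF_8))
  }

  // Overwrite a partition with a transformation of its own contents.
  def overwriteFromSelf(tableDir: Path, partition: String)(f: Int => Int): Unit = {
    val finalFile = tableDir.resolve(partition).resolve("part-00000")
    val stagedFile =
      tableDir.resolve(".staging").resolve(partition).resolve("part-00000")

    // 1. Read the existing rows; the partition has not been touched yet.
    val newRows = readRows(finalFile).map(f)
    // 2. Write the result to the staging location, not to the partition.
    writeRows(stagedFile, newRows)
    // 3. Only after the write has finished, move the staged file into place.
    Files.move(stagedFile, finalFile, StandardCopyOption.REPLACE_EXISTING)
  }

  def main(args: Array[String]): Unit = {
    val table = Files.createTempDirectory("insertTable")
    val partition = "part1=1/part2=1"
    val file = table.resolve(partition).resolve("part-00000")
    writeRows(file, Seq(1, 2, 3))
    overwriteFromSelf(table, partition)(_ + 1)
    println(readRows(file).mkString(",")) // prints 2,3,4
  }
}
```

If step 3 instead deleted the partition before step 1 ran, as a static overwrite effectively does, the read would find no data, which is the reason the existing check rejects the query.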

Does this PR introduce any user-facing change?

Yes. Users can INSERT OVERWRITE into the same table they read from when using dynamic partition overwrite.
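A minimal sketch of what this looks like from the user side, assuming a Spark build that includes this patch. The table and SQL statements mirror the unit test in this PR; the object name, the local master setting, and the initial `INSERT INTO` row are assumptions added for illustration.

```scala
import org.apache.spark.sql.SparkSession

object DynamicOverwriteSameTable {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("dynamic-overwrite-same-table")
      .getOrCreate()

    // Switch this session from the default static mode to dynamic mode.
    spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")

    spark.sql(
      """
        |CREATE TABLE insertTable(i int, part1 int, part2 int) USING PARQUET
        |PARTITIONED BY (part1, part2)
      """.stripMargin)

    // Seed one partition so the self-referencing query has data to read.
    spark.sql("INSERT INTO TABLE insertTable PARTITION(part1=1, part2=1) SELECT 1")

    // Read from insertTable and overwrite it in the same statement.
    // part2 is left dynamic, so only the partitions produced by the
    // query are replaced.
    spark.sql(
      """
        |INSERT OVERWRITE TABLE insertTable PARTITION(part1=1, part2)
        |SELECT i + 1, part2 FROM insertTable
      """.stripMargin)

    spark.table("insertTable").show()
    spark.stop()
  }
}
```

With the mode left at `static`, the same `INSERT OVERWRITE` statement is expected to still be rejected, since the check for overwriting a table that is being read remains in place for that mode.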

How was this patch tested?

Unit test.

@viirya
Member Author

viirya commented Dec 4, 2019

cc @cloud-fan

@SparkQA

SparkQA commented Dec 4, 2019

Test build #114806 has finished for PR 26752 at commit e5f4122.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya viirya force-pushed the dynamic-overwrite-same-table branch from e5f4122 to 378fbc4 Compare December 4, 2019 04:47
@SparkQA

SparkQA commented Dec 4, 2019

Test build #114827 has finished for PR 26752 at commit 378fbc4.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Dec 4, 2019

retest this please.

@SparkQA

SparkQA commented Dec 4, 2019

Test build #114840 has finished for PR 26752 at commit 378fbc4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Retest this please.

@dongjoon-hyun
Member

Thank you for making a PR for this issue.


test("it is allowed to write to a table while querying it for dynamic partition overwrite.") {
Seq(PartitionOverwriteMode.DYNAMIC.toString,
PartitionOverwriteMode.STATIC.toString).foreach { usedMode =>
Member

usedMode -> mode.

"""
|CREATE TABLE insertTable(i int, part1 int, part2 int) USING PARQUET
|PARTITIONED BY (part1, part2)
""".stripMargin)
Member

indentation?

Member Author

oops! thanks!

"""
|INSERT OVERWRITE TABLE insertTable PARTITION(part1=1, part2)
|SELECT i + 1, part2 FROM insertTable
""".stripMargin)
Member

indentation?

@SparkQA

SparkQA commented Dec 6, 2019

Test build #114920 has finished for PR 26752 at commit 378fbc4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 6, 2019

Test build #114921 has finished for PR 26752 at commit b1fbb41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

cc @cloud-fan


private lazy val parameters = CaseInsensitiveMap(options)

private[sql] lazy val dynamicPartitionOverwriteEnabled: Boolean = {
Contributor

Shall we just name it dynamicPartitionOverwrite? Then we can remove https://github.com/apache/spark/pull/26752/files#diff-f9858c2d9d1a3c3e48753ef675bc865aR109

"INSERT OVERWRITE to a table while querying it should not be allowed.")
}

test("it is allowed to write to a table while querying it for dynamic partition overwrite.") {
Contributor

Shall we put the JIRA id in the test name?

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Merged to master.
Thank you so much, @viirya and @cloud-fan .
I also verified the last commit manually.

@SparkQA

SparkQA commented Dec 6, 2019

Test build #114956 has finished for PR 26752 at commit 42606bc.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya
Member Author

viirya commented Dec 6, 2019

thanks! @cloud-fan @dongjoon-hyun

@viirya viirya deleted the dynamic-overwrite-same-table branch December 27, 2023 18:23