[SPARK-31684][SQL] Overwrite partition failed with 'WRONG FS' when the target partition does not belong to the same filesystem as the table #28511
Conversation
cc @cloud-fan @maropu @dongjoon-hyun @HyukjinKwon, thanks. Marked this ticket as a blocker for 3.0; you may lower the priority if necessary.

Test build #122536 has finished for PR 28511 at commit

retest this please

also cc: @viirya
tblAuthority.equalsIgnoreCase(partAuthority)
} else {
  val defaultUri = FileSystem.getDefaultUri(hadoopConf)
  if (tblAuthority != null) {
nit: for readability, could we replace the current nested ifs with flattened pattern matching?
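A minimal sketch of what that flattened match could look like; the helper name `sameAuthority` and the fallback-to-default-filesystem behavior are assumptions based on the surrounding diff, not the merged code:

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem

// Hypothetical helper mirroring the variables in the diff above
// (tblAuthority, partAuthority, hadoopConf).
def sameAuthority(tblUri: URI, partUri: URI, hadoopConf: Configuration): Boolean = {
  // Authority of the default filesystem, used when one URI omits its authority.
  lazy val defaultAuthority = FileSystem.getDefaultUri(hadoopConf).getAuthority
  (tblUri.getAuthority, partUri.getAuthority) match {
    // Both sides carry an authority: compare them directly.
    case (tbl, part) if tbl != null && part != null => tbl.equalsIgnoreCase(part)
    // Neither side has an authority: both resolve to the default filesystem.
    case (null, null) => true
    // One side is missing: assume it resolves against the default filesystem.
    case (tbl, null) => tbl.equalsIgnoreCase(defaultAuthority)
    case (null, part) => part.equalsIgnoreCase(defaultAuthority)
  }
}
```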
.version
// https://issues.apache.org/jira/browse/SPARK-31684,
// For Hive 2.0.0 and onwards, as https://issues.apache.org/jira/browse/HIVE-11940
// has been fixed, there is no performance issue anymore.
Just to check: have you confirmed that Spark (w/ this fix) + Hive 2.0.0+ has the same performance as the current Spark (w/o this fix)?
I haven't done the perf tests. Based on the discussion in #15726, I choose to trust that for the moment.
This code block has been there for a long time; it would be better to re-run the benchmark now to make sure we don't introduce a regression.
Is there an existing benchmark? Or I can try to add one.
yea please add one.
No problem.
I have just added a benchmark for InsertIntoHiveTable and updated the results.
It uses INSERT INTO as the control group and INSERT OVERWRITE as the experimental group.
With built-in Hive 2.3.7, the results of the two groups are close.
With built-in Hive 1.2.1.xxx, the results of the experimental group reveal a huge performance degradation when a dynamic partition column exists (the reason is that #15726 is not merged).
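For reference, a rough sketch of that control/experimental setup, assuming the usual shape of Spark's benchmark suites (`Benchmark` from `org.apache.spark.benchmark`; `numRows`, `output`, `spark`, and the table/view names are assumed to be in scope and are not the exact benchmark added in this PR):

```scala
import org.apache.spark.benchmark.Benchmark

// Sketch only: compares the plain insert path against the overwrite path under test.
val benchmark = new Benchmark("insert hive table benchmark", numRows, output = output)
// Control group: plain INSERT INTO.
benchmark.addCase("INSERT INTO DYNAMIC") { _ =>
  spark.sql(s"INSERT INTO TABLE $insertIntoTable SELECT * FROM $tempView")
}
// Experimental group: INSERT OVERWRITE, which exercises the overwrite logic.
benchmark.addCase("INSERT OVERWRITE DYNAMIC") { _ =>
  spark.sql(s"INSERT OVERWRITE TABLE $overwriteTable SELECT * FROM $tempView")
}
benchmark.run()
```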
viirya left a comment:

I'm a bit confused by the description; maybe you can explain more.

- For Hive versions before 2.0.0, we need to leave the replace work to Hive when the table and partition locations do not belong to the same FileSystem.
- NOTE THAT for Hive 2.2.0 and earlier, if the table and partition locations do not belong to the same FileSystem, we will still get the same error thrown by the Hive encryption check due to HIVE-14380.

For 2, when the table and partition locations are not on the same FileSystem, we leave the replace to Hive to avoid an exception. But if the table and partition locations are not on the same FileSystem, you will get the same exception from HIVE-14380, so what is the difference between leaving it to Hive or not? You still get the same exception.
Thanks. I missed the ticket number in the description: https://issues.apache.org/jira/browse/SPARK-31675. They are different exceptions, although they look similar.

Test build #122537 has finished for PR 28511 at commit

how are you going to fix https://issues.apache.org/jira/browse/SPARK-31675 ?

It's a good question. Besides the encryption check issue in SPARK-31675, I find that Hive 1.x does this kind of improper filesystem handling in many places. To fix that, a hacky way would be re-implementing the relevant Hive methods on the Spark side. For now, as we have upgraded the built-in Hive to 2.3.7, it's not that urgent to fix those long-standing issues in the quite old Hive 1.x versions. So maybe warning messages are enough if users are trying to modify a table that contains data crossing clusters with Hive 1.x?
viirya left a comment:

Even if you show a warning message, the users still get an exception later. If we are certain that in this case the exception will be thrown on the Hive side, we may directly throw an exception on the Spark side with a clearer message telling what is going on and how to work around it.
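A minimal sketch of what such an eager Spark-side check might look like; the function name, the same-filesystem test, the `hiveAtLeast2` flag, and the message wording are all illustrative assumptions, not code from this PR (SparkException is used here simply because it has a public string constructor):

```scala
import java.net.URI

import org.apache.spark.SparkException

// Hypothetical fail-fast check for the Hive 1.x cross-filesystem overwrite case.
def assertSameFileSystem(tableLocation: URI, partitionLocation: URI, hiveAtLeast2: Boolean): Unit = {
  // Two locations are "the same filesystem" if scheme and authority match.
  val sameFs = tableLocation.getScheme == partitionLocation.getScheme &&
    tableLocation.getAuthority == partitionLocation.getAuthority
  if (!sameFs && !hiveAtLeast2) {
    throw new SparkException(
      s"Cannot overwrite partition at '$partitionLocation': it is on a different " +
        s"file system than the table at '$tableLocation', and Hive 1.x cannot copy " +
        "files across file systems (HIVE-14380). Move the data to the table's file " +
        "system or upgrade the Hive metastore client to 2.0 or later.")
  }
}
```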
@viirya thanks, your suggestion SGTM. I will take that approach when fixing SPARK-31675 if there are no better solutions. Now let us get back to this PR: shall we just use the Hive version to determine whether to leave the overwrite to Hive or not? At least we can fix the problem for the default version and for the cases with Hive 2.x.
// scalastyle:off line.size.limit
/**
 * If the table location and partition location do not belong to the same [[FileSystem]], we
 * should not disable hive overwrite. Otherwise, hive will use the [[FileSystem]] instance belonging
can we treat this as a known limitation, with the faith that most users will use Hive 2.x in the long term? Then we just need to check the version.
Test build #122683 has finished for PR 28511 at commit

Test build #122682 has finished for PR 28511 at commit

Test build #122688 has finished for PR 28511 at commit

retest this please

Test build #122717 has finished for PR 28511 at commit

Test build #122777 has finished for PR 28511 at commit
  }
}

override def suffix: String = if (HiveUtils.isHive23) "" else "hive1.2"
can we put hive2.3 as suffix as well?
OK
                            Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
INSERT INTO STATIC                    344           367         48        0.0      33585.1     21.4X
INSERT OVERWRITE DYNAMIC             7656          7714         82        0.0     747622.7      1.0X
INSERT OVERWRITE HYBRID              1179          1183          6        0.0     115163.3      6.2X
INSERT OVERWRITE STATIC               400           408         10        0.0      39014.2     18.4X
do you have the numbers before your PR?
Let me run this benchmark on the master branch and update the result later in the PR description.
                            Best Time(ms)  Avg Time(ms)  Stdev(ms)  Rate(M/s)  Per Row(ns)  Relative
- INSERT INTO DYNAMIC                7742          7918        248        0.0     756044.0      1.0X
- INSERT INTO HYBRID                 1289          1307         26        0.0     125866.3      6.0X
- INSERT INTO STATIC                  371           393         38        0.0      36219.4     20.9X
- INSERT OVERWRITE DYNAMIC           8456          8554        138        0.0     825790.3      0.9X
- INSERT OVERWRITE HYBRID            1303          1311         12        0.0     127198.4      5.9X
- INSERT OVERWRITE STATIC             434           447         13        0.0      42373.8     17.8X
+ INSERT INTO DYNAMIC                7382          7456        105        0.0     720904.8      1.0X
+ INSERT INTO HYBRID                 1128          1129          1        0.0     110169.4      6.5X
+ INSERT INTO STATIC                  349           370         39        0.0      34095.4     21.1X
+ INSERT OVERWRITE DYNAMIC           8149          8362        301        0.0     795821.8      0.9X
+ INSERT OVERWRITE HYBRID            1317          1318          2        0.0     128616.7      5.6X
+ INSERT OVERWRITE STATIC             387           408         37        0.0      37804.1     19.1X

(+ for master, - for this PR; both using hive 2.3.7)
Hmm, what's wrong with the dependency tests on Jenkins? They passed locally.

retest this please

retest this please
// check. See https://issues.apache.org/jira/browse/HIVE-14380.
// So we still disable Hive overwrite for Hive 1.x for better performance, because
// the partition and table are on the same cluster in most cases.
val hiveVersDoHiveOverwrite: Set[HiveVersion] = Set(v2_0, v2_1, v2_2, v2_3, v3_0, v3_1)
Are we able to write hiveVersion < v2_0? Otherwise we need to update this list every time we support a new Hive version.
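A hedged sketch of an ordering-based check; it assumes the version object exposes something like a `fullVersion: String` (e.g. "2.3.7"), which may not match the real HiveVersion API:

```scala
// Illustrative only: compare the major.minor prefix of a Hive version string so the
// check reads "version >= 2.0" instead of membership in a hand-maintained Set.
def isAtLeast(fullVersion: String, major: Int, minor: Int): Boolean = {
  // Keep only the leading numeric parts, tolerating suffixes like "1.2.1.spark2".
  val nums = fullVersion.split("\\.").map(_.takeWhile(_.isDigit)).filter(_.nonEmpty).map(_.toInt)
  val ma = if (nums.length > 0) nums(0) else 0
  val mi = if (nums.length > 1) nums(1) else 0
  ma > major || (ma == major && mi >= minor)
}

// Would replace the Set membership test above (the fullVersion field is assumed):
// val doHiveOverwrite = isAtLeast(hiveVersion.fullVersion, 2, 0)
assert(isAtLeast("2.3.7", 2, 0) && !isAtLeast("1.2.1", 2, 0))
```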
hadoopConf.set("hive.exec.dynamic.partition.mode", "nonstrict")
hadoopConf.set("hive.exec.max.dynamic.partitions", numRows.toString)

def withTempTable(tableNames: String*)(f: => Unit): Unit = {
withTempViews
Actually we can inline it, as we only have one temp view.
Test build #122806 has finished for PR 28511 at commit

Test build #122807 has finished for PR 28511 at commit

sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala (resolved)
override def getSparkSession: SparkSession = TestHive.sparkSession

val tempTable = "temp"
tempView
def insertOverwriteDynamic(table: String, benchmark: Benchmark): Unit = {
  benchmark.addCase("INSERT OVERWRITE DYNAMIC") { _ =>
    sql(s"INSERT OVERWRITE TABLE $table SELECT CAST(id AS INT) AS a," +
      s" CAST(id % 10 AS INT) AS b, CAST(id % 100 AS INT) AS c FROM $tempTable DISTRIBUTE BY a")
how does DISTRIBUTE BY a help here?
It creates numRows files in total.
Test build #122803 has finished for PR 28511 at commit

Test build #122811 has finished for PR 28511 at commit
def insertOverwriteDynamic(table: String, benchmark: Benchmark): Unit = {
  benchmark.addCase("INSERT OVERWRITE DYNAMIC") { _ =>
    sql(s"INSERT OVERWRITE TABLE $table SELECT CAST(id AS INT) AS a," +
      s" CAST(id % 10 AS INT) AS b, CAST(id % 100 AS INT) AS c FROM $tempView DISTRIBUTE BY a")
Why does the number of files matter? DISTRIBUTE BY a will add a shuffle AFAIK, so the perf is not purely about table insertion and may be less accurate.
The cause of the Hive regression is copying files one by one, so I used DISTRIBUTE BY to create a certain number of files to feed it. But after rerunning the test w/o DISTRIBUTE BY, the result still seems to illustrate the problem, so I will remove that clause.
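To make the trade-off concrete, a small sketch of the two variants; the table name and row count are illustrative, assuming a Hive table `t` partitioned by `(b, c)` as in the benchmark above:

```scala
// Source view with an `id` column, as in the benchmark.
spark.range(100000).createOrReplaceTempView("temp")

// Variant kept in the final benchmark: no extra shuffle, so the measured time is
// dominated by the write and the overwrite logic itself.
spark.sql(
  """INSERT OVERWRITE TABLE t
    |SELECT CAST(id AS INT) AS a, CAST(id % 10 AS INT) AS b, CAST(id % 100 AS INT) AS c
    |FROM temp""".stripMargin)

// Variant with DISTRIBUTE BY: the added shuffle spreads rows so writer tasks emit
// more files, amplifying Hive's one-by-one file copy cost, but the shuffle time
// itself also ends up in the measurement.
spark.sql(
  """INSERT OVERWRITE TABLE t
    |SELECT CAST(id AS INT) AS a, CAST(id % 10 AS INT) AS b, CAST(id % 100 AS INT) AS c
    |FROM temp DISTRIBUTE BY a""".stripMargin)
```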
LGTM

LGTM

Test build #122830 has finished for PR 28511 at commit

retest this please

Test build #122839 has finished for PR 28511 at commit

retest this please

Test build #122836 has finished for PR 28511 at commit
This fixes a bug in a corner case, when the table and partition locations are on different file systems. I'm merging it to master only, to reduce risk. Thanks!

Test build #122847 has finished for PR 28511 at commit

thanks for reviewing and merging!
What changes were proposed in this pull request?
With SPARK-18107, we disable the underlying replace (overwrite) and instead do the delete on the Spark side and only the copy on the Hive side, to bypass the performance issue HIVE-11940.
Conditionally, if the table location and partition location do not belong to the same FileSystem, we should not disable Hive overwrite. Otherwise, Hive will use the FileSystem instance belonging to the table location to copy files, which will fail in FileSystem#checkPath: https://github.com/apache/hive/blob/rel/release-2.3.7/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L1657
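To illustrate the failure mode (the cluster URIs below are made up): qualifying a path against a FileSystem bound to a different scheme/authority trips the same check that Hive's copy path hits, so a minimal sketch looks like this:

```scala
import java.net.URI

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

val conf = new Configuration()
// FileSystem instance bound to the table's cluster.
val tableFs = FileSystem.get(new URI("hdfs://clusterA:8020/"), conf)
// Qualifying a path on another cluster invokes FileSystem#checkPath, which throws
// IllegalArgumentException("Wrong FS: hdfs://clusterB:8020/..., expected: hdfs://clusterA:8020")
// because checkPath compares the scheme and authority of the two URIs.
tableFs.makeQualified(new Path("hdfs://clusterB:8020/warehouse/t/p=1"))
```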
In this PR, for Hive 2.0.0 and onwards, as HIVE-11940 has been fixed and there is no performance issue anymore, we leave the overwrite logic to Hive to avoid the failure in FileSystem#checkPath. NOTE THAT for Hive 2.2.0 and earlier, if the table and partition locations do not belong to the same FileSystem, we will still get the same error thrown by the Hive encryption check due to HIVE-14380, which needs to be fixed in another ticket, SPARK-31675.
Why are the changes needed?
Bugfix. A logical table can be decoupled from the storage layer and may contain data from remote storage systems.
Does this PR introduce any user-facing change?
no
How was this patch tested?
Verified manually, and added benchmark tests (both using Hive 2.3.7).