[SPARK-19617][SS]Fix the race condition when starting and stopping a query quickly (branch-2.1) #16979

zsxwing · 2017-02-18T03:12:27Z

What changes were proposed in this pull request?

Backport #16947 to branch 2.1. Note: we still need to support old Hadoop versions in 2.1.*.

How was this patch tested?

Jenkins

zsxwing · 2017-02-18T03:13:00Z

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala

This file is almost the same as in #16947, except for this comment.

zsxwing · 2017-02-18T03:13:41Z

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala

Also run mkdirs inside runUninterruptiblyIfLocal, because it calls Shell.runCommand.

zsxwing · 2017-02-18T03:15:48Z

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala

Fixed the comment. I added it in 88c43f4, but it was wrong. We don't need to use runUninterruptibly to work around HADOOP-14084. The root cause is HADOOP-10622.

SparkQA · 2017-02-18T04:54:17Z

Test build #73091 has finished for PR 16979 at commit 7a0b199.

This patch fails Spark unit tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2017-02-18T06:47:40Z

Test build #73101 has started for PR 16979 at commit 1d776c3.

zsxwing · 2017-02-18T22:11:16Z

retest this please

SparkQA · 2017-02-19T00:24:56Z

Test build #73117 has finished for PR 16979 at commit 1d776c3.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

zsxwing · 2017-02-21T06:51:39Z

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala

+  }
+
+  private def runUninterruptiblyIfLocal[T](body: => T): T = {
+    if (fileManager.isLocalFileSystem && Thread.currentThread.isInstanceOf[UninterruptibleThread]) {


I have to change the condition here because StreamExecution creates an HDFSMetadataLog in a thread that is not an UninterruptibleThread, and that constructor calls mkdirs.

tdas · 2017-02-22

So, for a local file system, we are changing this to a best-effort attempt rather than a try-and-explicitly-fail attempt, right?
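The best-effort behaviour discussed in this thread can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in the patch: ToyUninterruptibleThread is a hypothetical, simplified stand-in for Spark's internal UninterruptibleThread, the BestEffort object is invented for the example, and the isLocalFileSystem flag is passed in as a plain Boolean instead of being read from fileManager.

```scala
// Hypothetical, simplified stand-in for Spark's internal UninterruptibleThread.
// Interrupts that arrive while runUninterruptibly is executing are deferred
// and delivered once the protected body has finished.
class ToyUninterruptibleThread(body: () => Unit) extends Thread {
  private val lock = new Object
  private var uninterruptible = false
  private var pendingInterrupt = false

  override def run(): Unit = body()

  def runUninterruptibly[T](f: => T): T = {
    require(Thread.currentThread() eq this, "must be called on this thread")
    lock.synchronized { uninterruptible = true }
    try f finally {
      val deferred = lock.synchronized {
        uninterruptible = false
        val pending = pendingInterrupt
        pendingInterrupt = false
        pending
      }
      // Deliver the interrupt that was held back while the body ran.
      if (deferred) super.interrupt()
    }
  }

  override def interrupt(): Unit = {
    val deliverNow = lock.synchronized {
      if (uninterruptible) { pendingInterrupt = true; false } else true
    }
    if (deliverNow) super.interrupt()
  }
}

object BestEffort {
  // Best effort: protect the body only when the file system is local AND the
  // current thread supports it; otherwise run the body as-is instead of failing.
  def runUninterruptiblyIfLocal[T](isLocalFileSystem: Boolean)(body: => T): T =
    Thread.currentThread() match {
      case t: ToyUninterruptibleThread if isLocalFileSystem => t.runUninterruptibly(body)
      case _ => body
    }
}
```

Called from an ordinary thread, the helper simply runs the body, which is the situation described above for the mkdirs call made while constructing the log. Called from the toy uninterruptible thread, an interrupt raised during the body is only observed after the body returns.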

zsxwing · 2017-02-21T22:26:45Z

sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala

+
+  private def runUninterruptiblyIfLocal[T](body: => T): T = {
+    if (fileManager.isLocalFileSystem && Thread.currentThread.isInstanceOf[UninterruptibleThread]) {
+      // When using a local file system, some file system APIs like "create" or "mkdirs" must be


I fixed the comments to point to the root cause: HADOOP-10622.

tdas · 2017-02-22T03:42:14Z

LGTM.

zsxwing · 2017-02-22T04:15:29Z

Thanks. Merging to 2.1.

… query quickly (branch-2.1)

## What changes were proposed in this pull request?

Backport #16947 to branch 2.1. Note: we still need to support old Hadoop versions in 2.1.*.

## How was this patch tested?

Jenkins

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #16979 from zsxwing/SPARK-19617-branch-2.1.

zsxwing mentioned this pull request Feb 18, 2017

[SPARK-19617][SS]Fix the race condition when starting and stopping a query quickly #16947

Closed

[SPARK-19617][SS]Fix the race condition when starting and stopping a …

1d776c3

…query quickly (branch-2.1)

zsxwing closed this Feb 22, 2017

zsxwing deleted the SPARK-19617-branch-2.1 branch February 22, 2017 04:16
