Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-36242][CORE] Ensure spill file closed before set success = true in ExternalSorter.spillMemoryIteratorToDisk method #33460

Closed

Conversation

LuciferYang
Copy link
Contributor

@LuciferYang LuciferYang commented Jul 21, 2021

What changes were proposed in this pull request?

The main change of this pr is move writer.close() before success = true to ensure spill file closed before set success = true in ExternalSorter.spillMemoryIteratorToDisk method.

Why are the changes needed?

Avoid setting success = true first and then failure of close spill file

Does this PR introduce any user-facing change?

No

How was this patch tested?

  • Pass the Jenkins or GitHub Action
  • Add a new Test case to check The spill file should not exists if writer close fails

@github-actions github-actions bot added the CORE label Jul 21, 2021
@LuciferYang LuciferYang marked this pull request as draft July 21, 2021 09:49
@LuciferYang LuciferYang marked this pull request as ready for review July 21, 2021 10:05
@SparkQA
Copy link

SparkQA commented Jul 21, 2021

Test build #141402 has finished for PR 33460 at commit 1e970f3.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jul 21, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45920/

@SparkQA
Copy link

SparkQA commented Jul 21, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45920/

@SparkQA
Copy link

SparkQA commented Jul 21, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45926/

@SparkQA
Copy link

SparkQA commented Jul 21, 2021

Test build #141409 has finished for PR 33460 at commit f38d735.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

@mridulm mridulm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExternalAppendOnlyMap does the right thing.
Looks like we missed it in ExternalSorter.

+CC @Ngone51

I would prefer if we had a test for this btw @LuciferYang.

@LuciferYang
Copy link
Contributor Author

LuciferYang commented Jul 22, 2021

I would prefer if we had a test for this btw @LuciferYang.

How can I create a successful flush but fail to writer.close() case when calling this method?

writer is a local variable in ExternalSorter#spillMemoryIteratorToDisk method and spillMemoryIteratorToDisk is private method, so it's hard to mock writer`'s behavior.

Do you have any suggestions for this? @mridulm @HyukjinKwon @Ngone51

Or do we need more refactoring work to make this method easy to test?

@mridulm
Copy link
Contributor

mridulm commented Jul 22, 2021

Mock blockManager.getDiskWriter ? See UnsafeShuffleWriterSuite for an example

@LuciferYang
Copy link
Contributor Author

Mock blockManager.getDiskWriter ? See UnsafeShuffleWriterSuite for an example

Thx ~

ioe.getMessage.equals(errorMessage)
// The `TempShuffleBlock` create by diskBlockManager
// will remain before SPARK-36242
assert(!spillFilesCreated(0).exists())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mridulm @Ngone51 @HyukjinKwon before this pr, assert(!spillFilesCreated(0).exists()) will failed, I tested it manually.

ExternalSorterSpillSuite.this.spillFilesCreated.apply(0).exists() was true
ScalaTestFailureLocation: org.apache.spark.util.collection.ExternalSorterSpillSuite at (ExternalSorterSpillSuite.scala:137)
org.scalatest.exceptions.TestFailedException: ExternalSorterSpillSuite.this.spillFilesCreated.apply(0).exists() was true
	at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:472)
	at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:471)
	at org.scalatest.Assertions$.newAssertionFailedException(Assertions.scala:1231)
	at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:1295)
	at org.apache.spark.util.collection.ExternalSorterSpillSuite.$anonfun$new$1(ExternalSorterSpillSuite.scala:137)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check this case with Scala 2.13

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me check this case with Scala 2.13

Manual test passed

ExternalSorterSuite relies on 'LocalSparkContext', so I added a new test file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work!

@SparkQA
Copy link

SparkQA commented Jul 22, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46001/

@SparkQA
Copy link

SparkQA commented Jul 22, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46001/

@SparkQA
Copy link

SparkQA commented Jul 22, 2021

Test build #141483 has finished for PR 33460 at commit 65e6987.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

private var taskContext: TaskContext = _

override protected def beforeEach(): Unit = {
tempDir = util.Utils.createTempDir(null, "test")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

import Utils directly?

Copy link
Contributor Author

@LuciferYang LuciferYang Jul 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is another Utils in org.apache.spark.util.collection package, so util.Utils is used here.

a57e76a change to import Utils directly and rename it to UUtils

Do you have any other suggestions for the naming of UUtils?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Looks fine either way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok~

Comment on lines 78 to 79
import org.apache.spark.storage.TempShuffleBlockId
import java.util.UUID
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to the imports group?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok ~

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@SparkQA
Copy link

SparkQA commented Jul 23, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46048/

@SparkQA
Copy link

SparkQA commented Jul 23, 2021

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46048/

@SparkQA
Copy link

SparkQA commented Jul 23, 2021

Test build #141530 has finished for PR 33460 at commit a57e76a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@Ngone51 Ngone51 closed this in f61d599 Jul 23, 2021
Ngone51 pushed a commit that referenced this pull request Jul 23, 2021
…ue` in `ExternalSorter.spillMemoryIteratorToDisk` method

### What changes were proposed in this pull request?
The main change of this pr is move `writer.close()` before `success = true` to ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method.

### Why are the changes needed?
Avoid setting `success = true` first and then failure of close spill file

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Pass the Jenkins or GitHub Action
- Add a new Test case to check `The spill file should not exists if writer close fails`

Closes #33460 from LuciferYang/external-sorter-spill-close.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yi.wu <yi.wu@databricks.com>
(cherry picked from commit f61d599)
Signed-off-by: yi.wu <yi.wu@databricks.com>
@Ngone51
Copy link
Member

Ngone51 commented Jul 23, 2021

Thanks, merged to master/3.2.

@dongjoon-hyun
Copy link
Member

+1, LGTM. Thank you, @LuciferYang and all!

@dongjoon-hyun
Copy link
Member

BTW, this looks like a bug fix. Could you make a backport to branch-3.1/3.0 please?

@LuciferYang
Copy link
Contributor Author

BTW, this looks like a bug fix. Could you make a backport to branch-3.1/3.0 please?

@dongjoon-hyun OK, I'll do it as soon as possible

LuciferYang added a commit to LuciferYang/spark that referenced this pull request Jul 26, 2021
…ue` in `ExternalSorter.spillMemoryIteratorToDisk` method

### What changes were proposed in this pull request?
The main change of this pr is move `writer.close()` before `success = true` to ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method.

### Why are the changes needed?
Avoid setting `success = true` first and then failure of close spill file

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Pass the Jenkins or GitHub Action
- Add a new Test case to check `The spill file should not exists if writer close fails`

Closes apache#33460 from LuciferYang/external-sorter-spill-close.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yi.wu <yi.wu@databricks.com>
LuciferYang added a commit to LuciferYang/spark that referenced this pull request Jul 26, 2021
…ue` in `ExternalSorter.spillMemoryIteratorToDisk` method

### What changes were proposed in this pull request?
The main change of this pr is move `writer.close()` before `success = true` to ensure spill file closed before set `success = true` in `ExternalSorter.spillMemoryIteratorToDisk` method.

### Why are the changes needed?
Avoid setting `success = true` first and then failure of close spill file

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?

- Pass the Jenkins or GitHub Action
- Add a new Test case to check `The spill file should not exists if writer close fails`

Closes apache#33460 from LuciferYang/external-sorter-spill-close.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: yi.wu <yi.wu@databricks.com>
@LuciferYang
Copy link
Contributor Author

@dongjoon-hyun
branch-3.1: #33513
branch-3.0: #33514

@LuciferYang
Copy link
Contributor Author

thx ~ @Ngone51 @mridulm @dongjoon-hyun

@LuciferYang LuciferYang deleted the external-sorter-spill-close branch October 22, 2023 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants