@harishreedharan
Contributor

...s.

This happens when the driver fails while an RDD is only partially written. When the app
restarts, the directory that the RDD was being written to gets reused.
This can cause the write to fail, since the underlying MR API does not allow writing
to a directory that already exists.

This PR introduces a new method in RDD.scala that allows overwriting any existing data
in a directory. This method is private[spark] and is used by Spark Streaming to overwrite
any data present in the directory.
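
The overwrite behaviour described above can be illustrated with a minimal sketch. It is not the method this PR adds to RDD.scala: it works on the local filesystem through `java.io.File` and `java.nio.file.Files` instead of the Hadoop API, and the names `OverwriteSketch`, `deleteRecursively`, and `writeOverwriting` are hypothetical. The `part-00000` file name only mimics Hadoop-style output naming.

```scala
import java.io.File
import java.nio.file.{Files, Path}

// Illustrative only: local-filesystem stand-in for "overwrite whatever is in the output directory".
object OverwriteSketch {
  // Hypothetical helper: delete a file, or a directory together with everything under it.
  def deleteRecursively(f: File): Unit = {
    if (f.isDirectory) {
      // listFiles() may return null on I/O errors, so wrap it in Option.
      Option(f.listFiles()).getOrElse(Array.empty[File]).foreach(deleteRecursively)
    }
    f.delete()
  }

  // Hypothetical helper: drop any partially written output left by an earlier run,
  // then write `lines` into a fresh directory.
  def writeOverwriting(dir: Path, lines: Seq[String]): Path = {
    deleteRecursively(dir.toFile)
    Files.createDirectories(dir)
    Files.write(dir.resolve("part-00000"), lines.mkString("\n").getBytes("UTF-8"))
  }
}
```

Calling `writeOverwriting` a second time on the same directory replaces the earlier contents instead of failing because the directory already exists, which is the situation a restarted driver runs into.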

@SparkQA

SparkQA commented Feb 3, 2015

Test build #26581 has started for PR 4322 at commit dd73fe0.

  • This patch merges cleanly.

@SparkQA

SparkQA commented Feb 3, 2015

Test build #26581 has finished for PR 4322 at commit dd73fe0.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26581/

@SparkQA

SparkQA commented Feb 3, 2015

Test build #26582 has started for PR 4322 at commit ea2d490.

  • This patch merges cleanly.

@JoshRosen
Contributor

This seems similar to #3832 / https://issues.apache.org/jira/browse/SPARK-4835.

@SparkQA

SparkQA commented Feb 3, 2015

Test build #26582 has finished for PR 4322 at commit ea2d490.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26582/

@harishreedharan
Contributor Author

Jenkins, test this please

@SparkQA

SparkQA commented Feb 3, 2015

Test build #26606 has started for PR 4322 at commit ea2d490.

  • This patch merges cleanly.

@harishreedharan
Contributor Author

This was already fixed by #3832.

@SparkQA

SparkQA commented Feb 3, 2015

Test build #26606 has finished for PR 4322 at commit ea2d490.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/26606/
