@liancheng
Contributor

When inserting data into a `HadoopFsRelation`, if `commitTask()` of the writer container fails, `abortTask()` is invoked. However, both `commitTask()` and `abortTask()` try to close the output writer(s), so a writer can be closed twice. This is a problem because closing the underlying writers is not necessarily an idempotent operation. For example, `ParquetRecordWriter.close()` throws an NPE when called twice.
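To illustrate the failure mode, here is a minimal, self-contained Java sketch. It is not the change made in this PR: `FragileWriter`, `GuardedContainer`, and `closeWriterOnce` are hypothetical names invented for the example, and the writer only imitates a `close()` that fails on its second call. One possible guard, assumed here, is to drop the writer reference before closing it, so that a later `abortTask()` finds nothing left to close.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

class DoubleCloseSketch {

    // Hypothetical stand-in for a writer whose close() is not idempotent:
    // the second call dereferences a field that the first call set to null.
    static class FragileWriter implements Closeable {
        private StringBuilder buffer = new StringBuilder("rows");
        int closeCalls = 0;

        @Override
        public void close() {
            closeCalls++;
            buffer.setLength(0); // throws NullPointerException on a second call
            buffer = null;
        }
    }

    // Hypothetical container; only the method names mirror the description.
    static class GuardedContainer {
        private Closeable writer;

        GuardedContainer(Closeable writer) {
            this.writer = writer;
        }

        void commitTask(boolean simulateFailure) throws IOException {
            closeWriterOnce();
            if (simulateFailure) {
                throw new IOException("simulated commit failure");
            }
        }

        void abortTask() throws IOException {
            closeWriterOnce();
        }

        // Drop the reference before closing so that no later call can reach
        // the same writer again, even if close() itself throws.
        private void closeWriterOnce() throws IOException {
            Closeable w = writer;
            writer = null;
            if (w != null) {
                w.close();
            }
        }
    }

    // Runs a failed commit followed by an abort; returns how often close() ran.
    static int closeCallsAfterFailedCommit() {
        FragileWriter fragile = new FragileWriter();
        GuardedContainer container = new GuardedContainer(fragile);
        try {
            try {
                container.commitTask(true);
            } catch (IOException e) {
                container.abortTask(); // would hit the NPE without the guard
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return fragile.closeCalls;
    }

    public static void main(String[] args) {
        System.out.println(closeCallsAfterFailedCommit());
    }
}
```

With the guard in place, the failed commit followed by the abort closes the writer exactly once. If `closeWriterOnce()` called `writer.close()` without clearing the reference, the abort path would reach the second `close()` and hit the `NullPointerException`.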

@liancheng
Contributor Author

cc @yhuai

@yhuai
Contributor

yhuai commented Aug 17, 2015

LGTM

@SparkQA

SparkQA commented Aug 17, 2015

Test build #1628 has finished for PR 8236 at commit 9b668c3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request Aug 17, 2015
…sk() fails

When inserting data into a `HadoopFsRelation`, if `commitTask()` of the writer container fails, `abortTask()` is invoked. However, both `commitTask()` and `abortTask()` try to close the output writer(s), so a writer can be closed twice. This is a problem because closing the underlying writers is not necessarily an idempotent operation. For example, `ParquetRecordWriter.close()` throws an NPE when called twice.

Author: Cheng Lian <lian@databricks.com>

Closes #8236 from liancheng/spark-7837/double-closing.

(cherry picked from commit 76c155d)
Signed-off-by: Cheng Lian <lian@databricks.com>

asfgit closed this in 76c155d on Aug 17, 2015

@liancheng
Contributor Author

Merged to master and branch-1.5.

liancheng deleted the spark-7837/double-closing branch on August 17, 2015, 17:01

@SparkQA

SparkQA commented Aug 17, 2015

Test build #41014 has finished for PR 8236 at commit 9b668c3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.
