This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

Spark executor lost while DatasetFileWriter failed with speculation #891

Closed
jackylee-ch opened this issue May 7, 2022 · 0 comments
Labels
bug Something isn't working

Comments


jackylee-ch commented May 7, 2022

Describe the bug
When Spark speculation is enabled (spark.speculation=true), the FinalTask may be killed because another attempt of the same task has already succeeded. In this case, DatasetFileWriter may fail and throw an exception, which causes the executor to fail.

java.lang.RuntimeException: java.lang.RuntimeException: Opening HDFS file '/tmp/_temporary/0/_temporary/attempt_202205071013574190200715410661668_0055_m_000273_180429/part-00273-3dd5b446-1003-4e56-99b1-cd4754b0b97b-c000.parquet' failed
	at org.apache.arrow.dataset.file.DatasetFileWriter.write(DatasetFileWriter.java:52)
	at com.intel.oap.spark.sql.ArrowWriteQueue.$anonfun$writeThread$1(ArrowWriteQueue.scala:54)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Opening HDFS file '/tmp/_temporary/0/_temporary/attempt_202205071013574190200715410661668_0055_m_000273_180429/part-00273-3dd5b446-1003-4e56-99b1-cd4754b0b97b-c000.parquet' failed
	at org.apache.arrow.dataset.file.JniWrapper.writeFromScannerToFile(Native Method)
	at org.apache.arrow.dataset.file.DatasetFileWriter.write(DatasetFileWriter.java:49)
	... 2 more

To Reproduce
Write Arrow output with Spark speculation enabled and ArrowWriteExtension in use.

Expected behavior
Only a warning should be logged; the executor should not be lost.
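
A minimal sketch of the behavior described above, not the project's actual fix: the write is wrapped so that a failure raised after the task has been killed is logged as a warning instead of escaping the writer thread. The class SpeculativeWriteGuard, the method runGuarded, and the taskKilled flag are hypothetical names made up for illustration; in a real executor the flag would have to come from Spark's task state, which is not shown here.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, for illustration only; not part of Arrow or this project.
final class SpeculativeWriteGuard {
    private static final Logger LOG =
        Logger.getLogger(SpeculativeWriteGuard.class.getName());

    /** Outcome of one guarded write attempt. */
    enum Outcome { SUCCEEDED, ABANDONED }

    private SpeculativeWriteGuard() {}

    /**
     * Runs the write. If it throws while taskKilled is set, the failure is
     * logged at WARNING level and ABANDONED is returned. If the task was not
     * killed, the exception is rethrown so genuine failures stay visible.
     */
    static Outcome runGuarded(Runnable write, AtomicBoolean taskKilled) {
        try {
            write.run();
            return Outcome.SUCCEEDED;
        } catch (RuntimeException e) {
            if (taskKilled.get()) {
                // Assumption: another attempt already succeeded, so this
                // attempt's output is no longer needed.
                LOG.log(Level.WARNING,
                    "Write failed after the task was killed; ignoring", e);
                return Outcome.ABANDONED;
            }
            throw e;
        }
    }
}
```

With a guard of this kind, a killed speculative attempt would report ABANDONED and leave only a warning in the log, while a write that fails in a task that is still live would propagate its exception as before.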

Additional context
None.
