Native Shuffle failed on SF3000 TPCH Q10 w/ 25K batch size #567

zhouyuan · 2021-11-15T06:44:31Z

Describe the bug
The same code runs fine with a 20k batch size, so it looks like a wrong data structure is being used somewhere.

Job aborted due to stage failure: Task 250 in stage 573.0 failed 4 times, most recent failure: Lost task 250.3 in stage 573.0 (TID 215824) (sr260 executor 71): java.lang.RuntimeException: Native split: splitter split failed - Error during calling Java code from native code
	at com.intel.oap.vectorized.ShuffleSplitterJniWrapper.split(Native Method)
	at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:162)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)

To Reproduce
Run TPC-H Q10 at SF3000 with a 25k batch size.

Expected behavior
The query runs successfully with a batch size of 25k or larger.

Additional context
N/A


zhouyuan · 2021-11-18T03:59:14Z

@PHILO-HE

zhouyuan · 2021-12-06T13:26:01Z

Closed by #594.

zhouyuan added the bug label Nov 15, 2021

zhouyuan assigned zhouyuan and unassigned zhouyuan Nov 18, 2021

zhouyuan pinned this issue Nov 18, 2021

zhouyuan mentioned this issue Nov 18, 2021

Cannot successfully run TPC-H 3TB scale without failing tasks when use 20k as batchsize #564

Closed

zhouyuan closed this as completed Dec 6, 2021

zhouyuan unpinned this issue Dec 6, 2021
