This repository has been archived by the owner on Sep 18, 2023. It is now read-only.

Native Shuffle failed on SF3000 TPCH Q10 w/ 25K batch size #567

Closed
zhouyuan opened this issue Nov 15, 2021 · 2 comments
Labels
bug Something isn't working

Comments

@zhouyuan
Collaborator

Describe the bug
The same code runs OK with a 20K batch size, so it looks like a wrong data structure is being used somewhere in the native shuffle split path.

Job aborted due to stage failure: Task 250 in stage 573.0 failed 4 times, most recent failure: Lost task 250.3 in stage 573.0 (TID 215824) (sr260 executor 71): java.lang.RuntimeException: Native split: splitter split failed - Error during calling Java code from native code
	at com.intel.oap.vectorized.ShuffleSplitterJniWrapper.split(Native Method)
	at org.apache.spark.shuffle.ColumnarShuffleWriter.write(ColumnarShuffleWriter.scala:162)
	at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)

To Reproduce
Run TPC-H Q10 at SF3000 with a 25K batch size.
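
For anyone trying to reproduce this, a minimal sketch of how the batch size might be set is below. It assumes the batch size is controlled by Spark's `spark.sql.execution.arrow.maxRecordsPerBatch` setting; the issue does not name the setting that was actually changed, and the application name is made up for illustration. The plugin and columnar shuffle settings needed to hit the native splitter are not shown.

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: the issue does not say which setting was used for the batch size.
// Assumption: spark.sql.execution.arrow.maxRecordsPerBatch controls it.
val spark = SparkSession.builder()
  .appName("tpch-q10-batch-size-repro") // hypothetical name
  // 25000 is the failing value from the report; 20000 reportedly runs OK.
  .config("spark.sql.execution.arrow.maxRecordsPerBatch", "25000")
  .getOrCreate()
```

Switching the value between 20000 and 25000 on the same SF3000 data set should separate the passing and failing cases described above, if that assumption holds.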

Expected behavior
The query runs OK with a batch size of 25K or larger.

Additional context
N/A

zhouyuan added the bug (Something isn't working) label Nov 15, 2021
zhouyuan assigned zhouyuan and unassigned zhouyuan Nov 18, 2021
zhouyuan pinned this issue Nov 18, 2021
@zhouyuan
Collaborator Author

@PHILO-HE

@zhouyuan
Collaborator Author

zhouyuan commented Dec 6, 2021

Closed by #594.

zhouyuan closed this as completed Dec 6, 2021
zhouyuan unpinned this issue Dec 6, 2021