
The vmem-cache and guava-cache functionality should not depend on arrow. #190

Closed
haojinIntel opened this issue Aug 2, 2021 · 1 comment
Labels
bug Something isn't working

Comments

@haojinIntel
Contributor

Since commit 6bc51cea2690cef9ee5f2d6ac15741492bc474af, arrow-java*.jar must be added to the classpath even when using vmem-cache or guava-cache. The error below is thrown when arrow-java*.jar is not included:


2021-08-01 11:17:20,382 WARN scheduler.TaskSetManager: Lost task 190.0 in stage 0.0 (TID 190) (vsr420 executor 6): java.lang.NoClassDefFoundError: org/apache/arrow/plasma/exceptions/PlasmaClientException
        at org.apache.spark.sql.execution.datasources.oap.filecache.FiberCacheManager.<init>(FiberCacheManager.scala:96)
        at org.apache.spark.sql.oap.OapExecutorRuntime.<init>(OapRuntime.scala:108)
        at org.apache.spark.sql.oap.OapRuntime$.init(OapRuntime.scala:153)
        at org.apache.spark.sql.oap.OapRuntime$.init(OapRuntime.scala:141)
        at org.apache.spark.sql.oap.OapRuntime$.getOrCreate(OapRuntime.scala:134)
        at org.apache.spark.sql.execution.datasources.oap.index.BTreeIndexRecordReader.getBTreeFiberCache(BTreeIndexRecordReader.scala:91)
        at org.apache.spark.sql.execution.datasources.oap.index.BTreeIndexRecordReaderV1.readBTreeFooter(BTreeIndexRecordReaderV1.scala:59)
        at org.apache.spark.sql.execution.datasources.oap.index.BTreeIndexRecordReaderV1.initializeReader(BTreeIndexRecordReaderV1.scala:50)
        at org.apache.spark.sql.execution.datasources.oap.index.BTreeIndexRecordReader.analyzeStatistics(BTreeIndexRecordReader.scala:139)
        at org.apache.spark.sql.execution.datasources.oap.index.BPlusTreeScanner.analyzeStatistics(BPlusTreeScanner.scala:57)
        at org.apache.spark.sql.execution.datasources.oap.index.IndexScanner.analysisResByStatistics(IndexScanner.scala:135)
        at org.apache.spark.sql.execution.datasources.oap.index.IndexScanner.analysisResByPolicies(IndexScanner.scala:100)
        at org.apache.spark.sql.execution.datasources.oap.index.IndexScanners.$anonfun$isIndexFileBeneficial$1(IndexScanner.scala:332)
        at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at scala.collection.TraversableLike.map(TraversableLike.scala:238)
        at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
        at scala.collection.AbstractTraversable.map(Traversable.scala:108)
        at org.apache.spark.sql.execution.datasources.oap.index.IndexScanners.isIndexFileBeneficial(IndexScanner.scala:332)
        at org.apache.spark.sql.execution.datasources.oap.io.OapDataReaderV1.initialize(OapDataReaderWriter.scala:94)
        at org.apache.spark.sql.execution.datasources.oap.io.OapDataReaderV1.read(OapDataReaderWriter.scala:150)
        at org.apache.spark.sql.execution.datasources.oap.OptimizedOrcFileFormat.$anonfun$buildReaderWithPartitionValues$5(OptimizedOrcFileFormat.scala:145)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:116)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:169)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
        at org.apache.spark.sql.execution.OapFileSourceScanExec$$anon$1.hasNext(OapFileSourceScanExec.scala:393)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
        at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
        at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:755)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
        at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
        at scala.collection.Iterator.foreach(Iterator.scala:941)
        at scala.collection.Iterator.foreach$(Iterator.scala:941)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
        at org.apache.spark.rdd.RDD.$anonfun$foreach$2(RDD.scala:1012)
        at org.apache.spark.rdd.RDD.$anonfun$foreach$2$adapted(RDD.scala:1012)
        at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2242)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:131)
        at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:497)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:500)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.apache.arrow.plasma.exceptions.PlasmaClientException
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 48 more
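
From the trace, the failure originates in the FiberCacheManager constructor (FiberCacheManager.scala:96), which apparently references a plasma class regardless of which cache backend is configured, so the JVM has to resolve org.apache.arrow.plasma.exceptions.PlasmaClientException even on the guava/vmem code paths. Below is a minimal sketch of that coupling pattern and one possible way to decouple it; every name except PlasmaClientException is hypothetical, not the actual OAP code:

```scala
// Sketch of the coupling; only PlasmaClientException is a real arrow class.
import org.apache.arrow.plasma.exceptions.PlasmaClientException

class FiberCacheManagerSketch(cacheStrategy: String) {
  // The catch clause references PlasmaClientException unconditionally,
  // so the JVM must resolve the arrow class when this constructor is
  // linked -- even if the configured backend is "guava" or "vmem".
  // Without arrow-java*.jar on the classpath this fails with
  // java.lang.NoClassDefFoundError before any backend code runs.
  private val cache: AnyRef =
    try {
      cacheStrategy match {
        case "guava" | "vmem" => new Object() // stand-ins: no arrow needed
        case "plasma"         => new Object() // stand-in for the plasma cache
        case other => throw new IllegalArgumentException(other)
      }
    } catch {
      case e: PlasmaClientException =>
        throw new RuntimeException("plasma store unavailable", e)
    }
}

// One way to decouple the backends: resolve plasma-specific classes
// reflectively, and only on the "plasma" branch, so guava/vmem users
// never trigger loading of arrow classes. The class name is hypothetical.
class DecoupledManagerSketch(cacheStrategy: String) {
  private val cache: AnyRef = cacheStrategy match {
    case "plasma" =>
      Class.forName("org.example.cache.PlasmaBackedCache")
        .getDeclaredConstructor()
        .newInstance()
    case _ => new Object() // guava/vmem stand-in, no arrow dependency
  }
}
```

As an interim workaround, shipping arrow-java*.jar to the executors (for example via --jars or spark.executor.extraClassPath) avoids the error, but the point of this issue is that the guava and vmem backends should not require it in the first place.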
@haojinIntel
Contributor Author

@yma11 @winningsix @zhixingheyi-tian Please help track this issue. Thanks!
