no lmdbjni in java.library.path exception #7
Sorry, it was solved. Please close it. Thanks. |
Hi, I have the same problem in my program. I also use "export LD_LIBRARY_PATH=${CAFFE_ON_SPARK}/caffe-public/distribute/lib:${CAFFE_ON_SPARK}/caffe-distri/distribute/lib", but it does not help. 16/10/09 19:49:00 ERROR ApplicationMaster: User class threw exception: java.lang.UnsatisfiedLinkError: no lmdbjni in java.library.path |
|
I checked that the .so files exist in "${CAFFE_ON_SPARK}/caffe-distri/distribute/lib": they are libcaffedistri.so and liblmdbjni.so. I use YARN to run my job. Of course I had set LD_LIBRARY_PATH for both the executor and the driver as spark-submit options.
It seems LD_LIBRARY_PATH had no effect because it is not reflected in -Djava.library.path. |
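On Linux the JVM seeds java.library.path from the LD_LIBRARY_PATH that is present in the environment of the JVM process itself, so exporting it in the shell that calls spark-submit does not necessarily reach the executor containers. A minimal sketch for checking, on a given node or inside a container, whether the library from the error is actually reachable through the current LD_LIBRARY_PATH (the function name `find_in_lib_path` is made up for illustration; the library name comes from the error above):

```shell
# Walk LD_LIBRARY_PATH directory by directory and report where (if
# anywhere) the named library file exists.
find_in_lib_path() {
  lib="$1"
  old_ifs="$IFS"
  IFS=':'
  for dir in $LD_LIBRARY_PATH; do
    if [ -f "$dir/$lib" ]; then
      IFS="$old_ifs"
      echo "$dir/$lib"
      return 0
    fi
  done
  IFS="$old_ifs"
  echo "$lib not found on LD_LIBRARY_PATH" >&2
  return 1
}

# Example: find_in_lib_path liblmdbjni.so
```

Running this in the same environment where the JVM starts (e.g. from a small wrapper script, or via a YARN container launch command) shows whether the variable ever made it there.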
Do you have CaffeOnSpark directories on the driver node? Andy
On Sun, Oct 9, 2016 at 8:55 PM, yilaguan notifications@github.com wrote:
|
I had CaffeOnSpark directories on the driver node. And I use “--master yarn --deploy-mode=cluster” |
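In yarn-cluster mode the driver itself runs inside a YARN container, so a login-shell export may never reach it. One way to pass the library path through Spark's own configuration is the extraLibraryPath settings; this is a sketch assuming the CaffeOnSpark tree exists at the same path on every node (adjust paths for your cluster):

```shell
LIBS=${CAFFE_ON_SPARK}/caffe-public/distribute/lib:${CAFFE_ON_SPARK}/caffe-distri/distribute/lib

spark-submit --master yarn --deploy-mode cluster \
  --conf spark.driver.extraLibraryPath="${LIBS}" \
  --conf spark.executor.extraLibraryPath="${LIBS}" \
  ...   # rest of the original submit command
```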
I'm experiencing a similar error with the missing lmdbjni while trying to run the mnist example in yarn-cluster mode (like yilaguan), after following the GetStarted_yarn instructions. Just for clarification: do I have to put anything from CaffeOnSpark on node B/C to eliminate this error? The complete directory? I don't think so, because of the use of YARN. Or any configs or exports in the .bash_profiles? Or do any of the dependencies like glog or protobuf need to be installed on B/C? |
The .so files should exist on the executors, and LD_LIBRARY_PATH needs to be set properly. You don't need to copy those files manually. We usually create a tar file containing all the required library files, then use "--archives /path/to/CaffeOnSparkLibrary_archive.tgz" to let YARN ship it, so that the executors know where to find them. |
Thanks for your answer. Okay, that sounds like useful information. Is the --archives option necessary? Then it's missing in the wiki instructions. |
I did not write the tutorial, so I am not sure what its context was. It seems to me the whole cluster is in a single box, i.e. everything is local. In that case, all the executors should know where the .so files are, and you don't have to ship anything. I am not sure why you were getting an error if you were running in "local" YARN mode. I was talking about a case where the executors are distributed, where --archives is needed. "CaffeOnSparkLibrary_archive.tgz" is just a tar.gz file containing all the required library files; you create it yourself with "tar czf CaffeOnSparkLibrary_archive.tgz all_library_files". Basically, you need to copy the files under caffe-public/distribute/lib and caffe-distri/distribute/lib to a temp directory and tar everything in that temp directory into a tar-ball. |
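The copy-and-tar steps above can be sketched as a small script. The function name `build_lib_archive` is made up for illustration; the directory layout and archive name follow the comment, nothing here is CaffeOnSpark API:

```shell
# Copy both lib directories into a temp dir, then tar everything there.
build_lib_archive() {
  src="$1"   # root of the CaffeOnSpark tree
  out="$2"   # e.g. /absolute/path/CaffeOnSparkLibrary_archive.tgz
  tmp=$(mktemp -d)
  cp "$src"/caffe-public/distribute/lib/* "$tmp"/
  cp "$src"/caffe-distri/distribute/lib/* "$tmp"/
  tar czf "$out" -C "$tmp" .
  rm -rf "$tmp"
}

# Example: build_lib_archive "${CAFFE_ON_SPARK}" "$PWD/CaffeOnSparkLibrary_archive.tgz"
```

The resulting file is what you pass to spark-submit via "--archives".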
I think you mean a pseudo-cluster. In my case it's a real cluster. Alright: I added the archive and extended the paths as mentioned, and this error disappeared. But another one appeared (other topic; I mentioned it in the existing #239 (comment)). Other story, and the real reason why I'm writing here again: in this archive context a strange error appeared when executing the cifar example of the same tutorial (where I also added the archive).
I don't even understand the error. How is it possible that the system says this? I think the error above also causes the final exception, with which the app exits as failed:
|
The error message is clear: the program cannot find the library, i.e. libcaffedistri.so. It appears your LD_LIBRARY_PATH was not set properly. |
Okay. Just to give some more information, the
So as you see, everything is as it should be. But the |
|
I was able to resolve the problem; here is my solution for other users having a similar problem. Dumb as I was, I just packed the libs from
Thanks, junshi15, for your help! |
@BlueRayONE Thanks for your feedback. It will help other users who experienced the same issue. |
16/02/26 16:34:34 INFO caffe.DataSource$: Source data layer:0
16/02/26 16:34:34 INFO caffe.LMDB: Batch size:64
Exception in thread "main" java.lang.UnsatisfiedLinkError: no lmdbjni in java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1864)
at java.lang.Runtime.loadLibrary0(Runtime.java:870)
at java.lang.System.loadLibrary(System.java:1122)
at com.yahoo.ml.caffe.LMDB$.makeSequence(LMDB.scala:28)
at com.yahoo.ml.caffe.LMDB.makeRDD(LMDB.scala:94)
at com.yahoo.ml.caffe.CaffeOnSpark.train(CaffeOnSpark.scala:113)
at com.yahoo.ml.caffe.CaffeOnSpark$.main(CaffeOnSpark.scala:44)
at com.yahoo.ml.caffe.CaffeOnSpark.main(CaffeOnSpark.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
16/02/26 16:34:34 INFO spark.SparkContext: Invoking stop() from shutdown hook