This repository has been archived by the owner on Nov 16, 2019. It is now read-only.

org.fusesource.lmdbjni.LMDBException: Permission denied #95

Closed
prateekarora-git opened this issue Jun 22, 2016 · 3 comments

@prateekarora-git

Hi

I am running CaffeOnSpark on YARN. My cluster is running Ubuntu 14.04.
I am using https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_yarn to set up CaffeOnSpark on my cluster.

When I tried to train a DNN with CaffeOnSpark using the command below:

SPARK_WORKER_INSTANCES=2
DEVICES=1
CAFFE_ON_SPARK=/home/ubuntu/CaffeOnSpark/
LD_LIBRARY_PATH=${CAFFE_ON_SPARK}/caffe-public/distribute/lib:${CAFFE_ON_SPARK}/caffe-distri/distribute/lib
LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda-7.5/lib64

spark-submit --master yarn-client \
    --num-executors 2 \
    --files /home/ubuntu/lenet_memory_solver.prototxt,/home/ubuntu/lenet_memory_train_test.prototxt \
    --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
    --conf spark.dynamicAllocation.maxExecutors=2 \
    --conf spark.dynamicAllocation.minExecutors=2 \
    --class com.yahoo.ml.caffe.CaffeOnSpark \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
    -train \
    -features accuracy,loss -label label \
    -conf /home/ubuntu/lenet_memory_solver.prototxt \
    -devices ${DEVICES} \
    -connection ethernet \
    -model hdfs:/user/ubuntu/mnist.model \
    -output hdfs:/user/ubuntu/mnist_features_result

I got the error below:

16/06/19 22:07:28 INFO executor.Executor: Running task 1.0 in stage 2.0 (TID 5)
16/06/19 22:07:28 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 3
16/06/19 22:07:28 INFO storage.MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 2.2 KB, free 12.2 KB)
16/06/19 22:07:28 INFO broadcast.TorrentBroadcast: Reading broadcast variable 3 took 20 ms
16/06/19 22:07:28 INFO storage.MemoryStore: Block broadcast_3 stored as values in memory (estimated size 3.5 KB, free 15.7 KB)
16/06/19 22:07:29 INFO spark.CacheManager: Partition rdd_6_1 not found, computing it
16/06/19 22:07:29 INFO spark.CacheManager: Partition rdd_1_1 not found, computing it
16/06/19 22:07:29 INFO caffe.LmdbRDD: Processing partition 1
16/06/19 22:07:29 INFO caffe.LmdbRDD: local LMDB path:/home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb
16/06/19 22:07:29 ERROR executor.Executor: Exception in task 1.0 in stage 2.0 (TID 5)
org.fusesource.lmdbjni.LMDBException: Permission denied
at org.fusesource.lmdbjni.Util.checkErrorCode(Util.java:44)
at org.fusesource.lmdbjni.Env.open(Env.java:192)
at org.fusesource.lmdbjni.Env.open(Env.java:72)
at org.fusesource.lmdbjni.Env.open(Env.java:65)
at org.fusesource.lmdbjni.Env.<init>(Env.java:47)
at com.yahoo.ml.caffe.LmdbRDD.com$yahoo$ml$caffe$LmdbRDD$$openDB(LmdbRDD.scala:202)
at com.yahoo.ml.caffe.LmdbRDD$$anon$1.<init>(LmdbRDD.scala:102)
at com.yahoo.ml.caffe.LmdbRDD.compute(LmdbRDD.scala:100)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
at org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:69)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:268)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)

@anfeng
Contributor

anfeng commented Jun 22, 2016

Please make sure that the /home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb folder is writable:

  • chmod -R +w /home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb
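
For example (a sketch; this assumes the YARN containers run as the yarn user, which varies by cluster setup):

  # check who owns the LMDB directory and whether the container user can write to it
  ls -ld /home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb
  sudo -u yarn test -w /home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb && echo writable || echo not-writable
  # if it is not writable, open up write permission recursively
  chmod -R +w /home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb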

I'm also not sure about your command line; it may need to be:

spark-submit --master yarn --deploy-mode client
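
Applied to your command, that would look roughly like this (a sketch of just the changed part; the remaining flags stay the same):

  spark-submit --master yarn --deploy-mode client \
      --num-executors 2 \
      --files /home/ubuntu/lenet_memory_solver.prototxt,/home/ubuntu/lenet_memory_train_test.prototxt \
      ... (rest of the options as in the original command)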

Andy


@mriduljain
Contributor

@prateekarora151083 If this has been solved, please share what you did and close the issue.

@prateekarora-git
Author

Yes, my problem was resolved.
I put the mnist_test_lmdb and mnist_train_lmdb data folders into HDFS and changed the source path in lenet_memory_train_test.prototxt.
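
For anyone hitting the same issue, roughly what that looked like (a sketch; the HDFS destination path is an example, and the exact field to edit depends on the data layer in your prototxt):

  # copy the LMDB folders into HDFS so every executor can read them
  hdfs dfs -mkdir -p /user/ubuntu/mnist
  hdfs dfs -put /home/ubuntu/CaffeOnSpark/data/mnist_train_lmdb /user/ubuntu/mnist/
  hdfs dfs -put /home/ubuntu/CaffeOnSpark/data/mnist_test_lmdb /user/ubuntu/mnist/

  # then update lenet_memory_train_test.prototxt so the data layer's source points at HDFS, e.g.
  #   source: "hdfs:///user/ubuntu/mnist/mnist_train_lmdb"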

Regards
Prateek
