java.lang.NullPointerException when running in standalone mode #71
Could you use file:/home/caffe/Caffe/CaffeOnSpark/mnist_train_lmdb instead? Let's see if that works. On Mon, May 30, 2016 at 4:33 AM, abhaymise notifications@github.com wrote:
The following log indicates that you may not have LMDB data files in the file:/home/caffe/Caffe/CaffeOnSpark/mnist_train_lmdb folder. Please follow the instructions at https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_local, especially step 6. Exception in thread "main" java.lang.NullPointerException
Thanks for your timely reply. I was able to run the example; the file URL had to be modified as suggested by Mridul.
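As the thread suggests, the usual cause is that the LMDB directories were never created (step 6 of the GetStarted_local guide). A quick sanity check before submitting — this is a sketch, not part of CaffeOnSpark; the class and helper are made up, and the paths are taken from this issue:

```java
// Sketch (not part of CaffeOnSpark): before submitting, verify that the
// LMDB "source" paths from lenet_memory_train_test.prototxt actually exist.
// An LMDB database is a directory containing a data.mdb file.
import java.io.File;

public class LmdbCheck {
    // Strip the "file:" URI scheme used in the prototxt, then check
    // that the directory holds LMDB's data file.
    static boolean lmdbOk(String source) {
        String dir = source.startsWith("file:") ? source.substring(5) : source;
        return new File(dir, "data.mdb").isFile();
    }

    public static void main(String[] args) {
        // Paths taken from the prototxt in this issue; adjust to your layout.
        System.out.println(lmdbOk("file:/home/caffe/Caffe/CaffeOnSpark/mnist_train_lmdb"));
        System.out.println(lmdbOk("file:/home/caffe/Caffe/CaffeOnSpark/mnist_test_lmdb/"));
    }
}
```

If either prints false, the data layer's source points at nothing, which matches the failure below.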
I am getting a NullPointerException when submitting CaffeOnSpark on the MNIST data.
The command used to submit is:
spark-submit --master ${MASTER_URL} \
  --files ${CAFFE_ON_SPARK}/data/lenet_memory_solver.prototxt,${CAFFE_ON_SPARK}/data/lenet_memory_train_test.prototxt \
  --conf spark.cores.max=${TOTAL_CORES} \
  --conf spark.task.cpus=${CORES_PER_WORKER} \
  --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
  --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
  --class com.yahoo.ml.caffe.CaffeOnSpark \
  ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
  -train -features accuracy,loss -label label \
  -conf lenet_memory_solver.prototxt \
  -clusterSize ${SPARK_WORKER_INSTANCES} \
  -devices 1 -connection ethernet \
  -model file:${CAFFE_ON_SPARK}/mnist_lenet.model \
  -output file:${CAFFE_ON_SPARK}/lenet_features_result
The log generated is:
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/05/30 16:47:42 INFO SparkContext: Running Spark version 1.6.1
16/05/30 16:47:42 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/05/30 16:47:43 WARN SparkConf:
SPARK_WORKER_INSTANCES was detected (set to '1').
This is deprecated in Spark 1.0+.
Please instead use:
16/05/30 16:47:43 WARN Utils: Your hostname, ubuntu-H81M-S resolves to a loopback address: 127.0.0.1; using 192.168.1.29 instead (on interface eth0)
16/05/30 16:47:43 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
16/05/30 16:47:43 INFO SecurityManager: Changing view acls to: caffe
16/05/30 16:47:43 INFO SecurityManager: Changing modify acls to: caffe
16/05/30 16:47:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(caffe); users with modify permissions: Set(caffe)
16/05/30 16:47:43 INFO Utils: Successfully started service 'sparkDriver' on port 44682.
16/05/30 16:47:43 INFO Slf4jLogger: Slf4jLogger started
16/05/30 16:47:43 INFO Remoting: Starting remoting
16/05/30 16:47:43 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.1.29:47591]
16/05/30 16:47:43 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 47591.
16/05/30 16:47:43 INFO SparkEnv: Registering MapOutputTracker
16/05/30 16:47:43 INFO SparkEnv: Registering BlockManagerMaster
16/05/30 16:47:43 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-c94ea90e-4190-4605-b380-f147255c9ac3
16/05/30 16:47:43 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/05/30 16:47:43 INFO SparkEnv: Registering OutputCommitCoordinator
16/05/30 16:47:44 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/05/30 16:47:44 INFO SparkUI: Started SparkUI at http://192.168.1.29:4040
16/05/30 16:47:44 INFO HttpFileServer: HTTP File server directory is /tmp/spark-888c7abf-7cbd-4936-8d25-d3b8f0b875a0/httpd-93aa28a2-68f1-4414-bbc5-816e350a7466
16/05/30 16:47:44 INFO HttpServer: Starting HTTP Server
16/05/30 16:47:44 INFO Utils: Successfully started service 'HTTP file server' on port 46265.
16/05/30 16:47:44 INFO SparkContext: Added JAR file:/home/caffe/Caffe/CaffeOnSpark/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar at http://192.168.1.29:46265/jars/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar with timestamp 1464607064217
16/05/30 16:47:44 INFO Utils: Copying /home/caffe/Caffe/CaffeOnSpark/data/lenet_memory_solver.prototxt to /tmp/spark-888c7abf-7cbd-4936-8d25-d3b8f0b875a0/userFiles-d2e43ca6-f05c-48d2-8dbb-d71223681689/lenet_memory_solver.prototxt
16/05/30 16:47:44 INFO SparkContext: Added file file:/home/caffe/Caffe/CaffeOnSpark/data/lenet_memory_solver.prototxt at http://192.168.1.29:46265/files/lenet_memory_solver.prototxt with timestamp 1464607064313
16/05/30 16:47:44 INFO Utils: Copying /home/caffe/Caffe/CaffeOnSpark/data/lenet_memory_train_test.prototxt to /tmp/spark-888c7abf-7cbd-4936-8d25-d3b8f0b875a0/userFiles-d2e43ca6-f05c-48d2-8dbb-d71223681689/lenet_memory_train_test.prototxt
16/05/30 16:47:44 INFO SparkContext: Added file file:/home/caffe/Caffe/CaffeOnSpark/data/lenet_memory_train_test.prototxt at http://192.168.1.29:46265/files/lenet_memory_train_test.prototxt with timestamp 1464607064319
16/05/30 16:47:44 INFO AppClient$ClientEndpoint: Connecting to master spark://ubuntu-H81M-S:7077...
16/05/30 16:47:44 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20160530164744-0010
16/05/30 16:47:44 INFO AppClient$ClientEndpoint: Executor added: app-20160530164744-0010/0 on worker-20160530160108-192.168.1.29-36155 (192.168.1.29:36155) with 1 cores
16/05/30 16:47:44 INFO SparkDeploySchedulerBackend: Granted executor ID app-20160530164744-0010/0 on hostPort 192.168.1.29:36155 with 1 cores, 1024.0 MB RAM
16/05/30 16:47:44 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 36036.
16/05/30 16:47:44 INFO NettyBlockTransferService: Server created on 36036
16/05/30 16:47:44 INFO BlockManagerMaster: Trying to register BlockManager
16/05/30 16:47:44 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.1.29:36036 with 511.1 MB RAM, BlockManagerId(driver, 192.168.1.29, 36036)
16/05/30 16:47:44 INFO BlockManagerMaster: Registered BlockManager
16/05/30 16:47:44 INFO AppClient$ClientEndpoint: Executor updated: app-20160530164744-0010/0 is now RUNNING
16/05/30 16:47:46 INFO SparkDeploySchedulerBackend: Registered executor NettyRpcEndpointRef(null) (ubuntu-H81M-S:42453) with ID 0
16/05/30 16:47:46 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 1.0
16/05/30 16:47:46 INFO BlockManagerMasterEndpoint: Registering block manager ubuntu-H81M-S:46625 with 511.1 MB RAM, BlockManagerId(0, ubuntu-H81M-S, 46625)
16/05/30 16:47:48 INFO DataSource$: Source data layer:0
16/05/30 16:47:48 INFO LMDB: Batch size:64
16/05/30 16:47:48 INFO SparkContext: Starting job: collect at CaffeOnSpark.scala:127
16/05/30 16:47:48 INFO DAGScheduler: Got job 0 (collect at CaffeOnSpark.scala:127) with 1 output partitions
16/05/30 16:47:48 INFO DAGScheduler: Final stage: ResultStage 0 (collect at CaffeOnSpark.scala:127)
16/05/30 16:47:48 INFO DAGScheduler: Parents of final stage: List()
16/05/30 16:47:48 INFO DAGScheduler: Missing parents: List()
16/05/30 16:47:48 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[3] at map at CaffeOnSpark.scala:116), which has no missing parents
16/05/30 16:47:48 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 3.3 KB, free 3.3 KB)
16/05/30 16:47:48 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.1 KB, free 5.4 KB)
16/05/30 16:47:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.1.29:36036 (size: 2.1 KB, free: 511.1 MB)
16/05/30 16:47:48 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/05/30 16:47:48 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[3] at map at CaffeOnSpark.scala:116)
16/05/30 16:47:48 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
16/05/30 16:47:48 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ubuntu-H81M-S, partition 0,PROCESS_LOCAL, 2200 bytes)
16/05/30 16:47:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on ubuntu-H81M-S:46625 (size: 2.1 KB, free: 511.1 MB)
16/05/30 16:47:49 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 795 ms on ubuntu-H81M-S (1/1)
16/05/30 16:47:49 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/05/30 16:47:49 INFO DAGScheduler: ResultStage 0 (collect at CaffeOnSpark.scala:127) finished in 0.800 s
16/05/30 16:47:49 INFO DAGScheduler: Job 0 finished: collect at CaffeOnSpark.scala:127, took 1.052826 s
16/05/30 16:47:49 INFO CaffeOnSpark: rank = 0, address = null, hostname = ubuntu-H81M-S
16/05/30 16:47:49 INFO CaffeOnSpark: rank 0:ubuntu-H81M-S
16/05/30 16:47:49 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 112.0 B, free 5.5 KB)
16/05/30 16:47:49 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 53.0 B, free 5.5 KB)
16/05/30 16:47:49 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.1.29:36036 (size: 53.0 B, free: 511.1 MB)
16/05/30 16:47:49 INFO SparkContext: Created broadcast 1 from broadcast at CaffeOnSpark.scala:146
16/05/30 16:47:49 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.1.29:36036 in memory (size: 2.1 KB, free: 511.1 MB)
16/05/30 16:47:49 INFO SparkContext: Starting job: collect at CaffeOnSpark.scala:155
16/05/30 16:47:49 INFO BlockManagerInfo: Removed broadcast_0_piece0 on ubuntu-H81M-S:46625 in memory (size: 2.1 KB, free: 511.1 MB)
16/05/30 16:47:49 INFO DAGScheduler: Got job 1 (collect at CaffeOnSpark.scala:155) with 1 output partitions
16/05/30 16:47:49 INFO DAGScheduler: Final stage: ResultStage 1 (collect at CaffeOnSpark.scala:155)
16/05/30 16:47:49 INFO DAGScheduler: Parents of final stage: List()
16/05/30 16:47:49 INFO DAGScheduler: Missing parents: List()
16/05/30 16:47:49 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[5] at map at CaffeOnSpark.scala:149), which has no missing parents
16/05/30 16:47:49 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 2.6 KB, free 2.8 KB)
16/05/30 16:47:49 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1597.0 B, free 4.3 KB)
16/05/30 16:47:49 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.1.29:36036 (size: 1597.0 B, free: 511.1 MB)
16/05/30 16:47:49 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
16/05/30 16:47:49 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[5] at map at CaffeOnSpark.scala:149)
16/05/30 16:47:49 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
16/05/30 16:47:49 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, ubuntu-H81M-S, partition 0,PROCESS_LOCAL, 2200 bytes)
16/05/30 16:47:49 INFO ContextCleaner: Cleaned accumulator 1
16/05/30 16:47:49 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on ubuntu-H81M-S:46625 (size: 1597.0 B, free: 511.1 MB)
16/05/30 16:47:49 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ubuntu-H81M-S:46625 (size: 53.0 B, free: 511.1 MB)
16/05/30 16:47:49 INFO DAGScheduler: ResultStage 1 (collect at CaffeOnSpark.scala:155) finished in 0.084 s
16/05/30 16:47:49 INFO DAGScheduler: Job 1 finished: collect at CaffeOnSpark.scala:155, took 0.103977 s
16/05/30 16:47:49 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 84 ms on ubuntu-H81M-S (1/1)
16/05/30 16:47:49 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool
Exception in thread "main" java.lang.NullPointerException
at scala.collection.mutable.ArrayOps$ofRef$.length$extension(ArrayOps.scala:114)
at scala.collection.mutable.ArrayOps$ofRef.length(ArrayOps.scala:114)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:32)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at com.yahoo.ml.caffe.LmdbRDD.localLMDBFile(LmdbRDD.scala:185)
at com.yahoo.ml.caffe.LmdbRDD.com$yahoo$ml$caffe$LmdbRDD$$openDB(LmdbRDD.scala:202)
at com.yahoo.ml.caffe.LmdbRDD.getPartitions(LmdbRDD.scala:46)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at com.yahoo.ml.caffe.CaffeOnSpark.train(CaffeOnSpark.scala:158)
at com.yahoo.ml.caffe.CaffeOnSpark$.main(CaffeOnSpark.scala:40)
at com.yahoo.ml.caffe.CaffeOnSpark.main(CaffeOnSpark.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
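The top frames of this trace (ArrayOps.length on a null array inside LmdbRDD.localLMDBFile) are consistent with a foreach over a directory listing that came back null. On the JVM, File.listFiles() returns null, not an empty array, when the path does not exist or is not a directory, and iterating over that null throws exactly this kind of bare NullPointerException. A minimal illustration of the failure mode (an analogy only, not the actual LmdbRDD code; the class name is made up):

```java
import java.io.File;

public class NullListingDemo {
    public static void main(String[] args) {
        // On the JVM, File.listFiles() returns null -- not an empty
        // array -- when the path does not exist or is not a directory.
        File[] entries = new File("/no/such/mnist_train_lmdb").listFiles();
        System.out.println(entries == null);  // prints "true"

        // Iterating over that null is what produces a bare NullPointerException.
        try {
            for (File f : entries) {
                System.out.println(f);
            }
        } catch (NullPointerException e) {
            System.out.println("NullPointerException");  // prints "NullPointerException"
        }
    }
}
```

So the NPE here is consistent with the earlier diagnosis: the LMDB directory named in the prototxt's source was missing, rather than a bug in the training setup itself.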
16/05/30 16:47:49 INFO SparkContext: Invoking stop() from shutdown hook
16/05/30 16:47:49 INFO SparkUI: Stopped Spark web UI at http://192.168.1.29:4040
16/05/30 16:47:49 INFO SparkDeploySchedulerBackend: Shutting down all executors
16/05/30 16:47:49 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
16/05/30 16:47:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/30 16:47:49 INFO MemoryStore: MemoryStore cleared
16/05/30 16:47:49 INFO BlockManager: BlockManager stopped
16/05/30 16:47:49 INFO BlockManagerMaster: BlockManagerMaster stopped
16/05/30 16:47:49 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/30 16:47:49 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/30 16:47:49 INFO SparkContext: Successfully stopped SparkContext
16/05/30 16:47:49 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/05/30 16:47:49 INFO ShutdownHookManager: Shutdown hook called
16/05/30 16:47:49 INFO ShutdownHookManager: Deleting directory /tmp/spark-888c7abf-7cbd-4936-8d25-d3b8f0b875a0/httpd-93aa28a2-68f1-4414-bbc5-816e350a7466
16/05/30 16:47:49 INFO ShutdownHookManager: Deleting directory /tmp/spark-888c7abf-7cbd-4936-8d25-d3b8f0b875a0
The lenet_memory_train_test.prototxt is:
name: "LeNet"
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  source_class: "com.yahoo.ml.caffe.LMDB"
  memory_data_param {
    source: "file:/home/caffe/Caffe/CaffeOnSpark/mnist_train_lmdb"
    batch_size: 64
    channels: 1
    height: 28
    width: 28
    share_in_parallel: false
  }
  transform_param {
    scale: 0.00390625
  }
}
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  source_class: "com.yahoo.ml.caffe.LMDB"
  memory_data_param {
    source: "file:/home/caffe/Caffe/CaffeOnSpark/mnist_test_lmdb/"
    batch_size: 100
    channels: 1
    height: 28
    width: 28
    share_in_parallel: false
  }
  transform_param {
    scale: 0.00390625
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 50
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool2"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 10
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
Makefile.config is attached:
Makefile.config.txt
Please help.