
Instructions to create ImageNet 2012 data

Junli Gu edited this page Sep 14, 2015 · 2 revisions

Note: you need about 250 GB of free disk space in total.
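Before starting, it can help to confirm that the target disk really has that much room. The following is a minimal sketch, assuming a POSIX `df`; `has_free_gb` is a hypothetical helper name, not something provided by caffe.

```shell
# has_free_gb DIR GB: succeed only if the filesystem holding DIR
# has at least GB gigabytes available. (Hypothetical helper.)
has_free_gb() {
    dir=$1
    need_gb=$2
    # df -P gives one-line-per-filesystem POSIX output; -k reports KB.
    # Field 4 of the second line is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    need_kb=$((need_gb * 1024 * 1024))
    [ "$avail_kb" -ge "$need_kb" ]
}
```

For example, `has_free_gb /hdd 250 || echo "not enough free space on /hdd"` prints a warning when the disk is too small. The directory passed in must already exist.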

#Step1: create a working directory

mkdir /hdd/ImageNet

cd /hdd/ImageNet

#Step2: Download ImageNet data

Download training images (about 138 GB):

wget -c http://www.image-net.org/challenges/LSVRC/2012/nonpub/ILSVRC2012_img_train.tar &

Download validation images:

wget -c http://www.image-net.org/challenges/LSVRC/2012/nonpub/ILSVRC2012_img_val.tar &
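Both downloads run in the background, so wait for them to finish (for example with the shell's `wait` builtin in the same session) before extracting anything. A large archive is also worth checking against the MD5 checksum published alongside it. The sketch below assumes GNU `md5sum` is installed; `verify_md5` is a hypothetical helper, and the expected checksum must come from the download page, not from this page.

```shell
# verify_md5 FILE EXPECTED: succeed only if the MD5 of FILE equals EXPECTED.
# (Hypothetical helper; EXPECTED is supplied by you.)
verify_md5() {
    file=$1
    expected=$2
    # md5sum prints "<hash>  <filename>"; keep only the hash.
    actual=$(md5sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
```

Usage would look like `verify_md5 ILSVRC2012_img_train.tar <published-md5> || echo "checksum mismatch, re-run wget -c"`.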

#Step3: decompress ImageNet data

To extract training data:

mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train

tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar

find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done

Check that the decompression completed: the train folder should contain 1,281,167 images.
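One way to do that check is to count the extracted files. This is a small sketch that assumes the images keep their original `.JPEG` extension; `count_jpegs` and `check_count` are hypothetical helper names.

```shell
# count_jpegs DIR: print the number of *.JPEG files anywhere under DIR.
count_jpegs() {
    find "$1" -type f -name '*.JPEG' | wc -l | tr -d ' '
}

# check_count DIR EXPECTED: succeed only if DIR holds exactly EXPECTED images.
check_count() {
    [ "$(count_jpegs "$1")" -eq "$2" ]
}
```

For the training set, `check_count /hdd/ImageNet/train 1281167 && echo "train OK"`. The same helper can be used on the validation folder, which should hold 50,000 images once extracted.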

To extract validation data:

cd ../ && mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar

#Step4: preprocess ImageNet data

This step requires that you have built the caffe project (either OpenCL caffe or the original caffe in CPU_ONLY mode), because it uses some of the scripting tools provided by caffe. Run the commands below from the root of the caffe source tree.

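First download the auxiliary label files. (In upstream BVLC caffe the directory is data/ilsvrc12 and the script is get_ilsvrc_aux.sh; use those names if the ones below are not present in your checkout.)

cd data/ilsvrc2012

./get_ilsvrc.sh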

cd ../../

vi examples/imagenet/create_imagenet.sh

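Modify the following variables to point to your ImageNet data directories. Keep the trailing slash, because the root is prepended directly to each relative image path:

TRAIN_DATA_ROOT=/hdd/ImageNet/train/

VAL_DATA_ROOT=/hdd/ImageNet/val/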

Then enable resizing, so that images are resized to 256x256 while the database is built:

RESIZE=true
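If you would rather not edit the script by hand, the same three assignments can be rewritten non-interactively. This is only a sketch: it assumes GNU `sed` (for `-i`) and that each variable is assigned at the start of a line, as in the upstream script; `set_var` is a hypothetical helper.

```shell
# set_var FILE NAME VALUE: rewrite the line "NAME=..." in FILE to "NAME=VALUE".
# (Hypothetical helper; requires GNU sed for in-place editing.)
set_var() {
    file=$1
    name=$2
    value=$3
    # '|' is used as the sed delimiter so slashes in paths need no escaping.
    sed -i "s|^${name}=.*|${name}=${value}|" "$file"
}

# Example calls, run from the caffe root:
#   set_var examples/imagenet/create_imagenet.sh TRAIN_DATA_ROOT /hdd/ImageNet/train/
#   set_var examples/imagenet/create_imagenet.sh VAL_DATA_ROOT /hdd/ImageNet/val/
#   set_var examples/imagenet/create_imagenet.sh RESIZE true
```

Afterwards, `grep -E '^(TRAIN_DATA_ROOT|VAL_DATA_ROOT|RESIZE)=' examples/imagenet/create_imagenet.sh` shows the values that will be used.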

You are now ready to create the LMDB databases of the ImageNet data that training requires:

./examples/imagenet/create_imagenet.sh
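Once the script finishes, a quick sanity check is to confirm that each output database exists and is not empty. The sketch below relies on LMDB storing its data in a file named `data.mdb` inside the database directory; `lmdb_ready` is a hypothetical helper, and the directory names in the usage line are the upstream script's defaults, which is an assumption about your checkout.

```shell
# lmdb_ready DIR: succeed only if DIR contains a non-empty LMDB data file.
# (Hypothetical helper.)
lmdb_ready() {
    [ -s "$1/data.mdb" ]
}
```

For example, `lmdb_ready examples/imagenet/ilsvrc12_train_lmdb && lmdb_ready examples/imagenet/ilsvrc12_val_lmdb && echo "LMDBs created"`.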
