rbgirshick · zeyuanxy · May 7, 2015
diff --git a/README.md b/README.md
@@ -1,254 +1,3 @@
-# *Fast* R-CNN: Fast Region-based Convolutional Networks for object detection
-
-Created by Ross Girshick at Microsoft Research, Redmond.
-
-### Introduction
-
-**Fast R-CNN** is a fast framework for object detection with deep ConvNets. Fast R-CNN
- - trains state-of-the-art models, like VGG16, 9x faster than traditional R-CNN and 3x faster than SPPnet,
- - runs 200x faster than R-CNN and 10x faster than SPPnet at test-time,
- - has a significantly higher mAP on PASCAL VOC than both R-CNN and SPPnet,
- - and is written in Python and C++/Caffe.
-
-Fast R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1504.08083).
-
-### License
-
-Fast R-CNN is released under the MIT License (refer to the LICENSE file for details).
-
-### Citing Fast R-CNN
-
-If you find Fast R-CNN useful in your research, please consider citing:
-
-    @article{girshick15fastrcnn,
-        Author = {Ross Girshick},
-        Title = {Fast R-CNN},
-        Journal = {arXiv preprint arXiv:1504.08083},
-        Year = {2015}
-    }
-
-### Contents
-1. [Requirements: software](#requirements-software)
-2. [Requirements: hardware](#requirements-hardware)
-3. [Basic installation](#installation-sufficient-for-the-demo)
-4. [Demo](#demo)
-5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)
-6. [Usage](#usage)
-7. [Extra downloads](#extra-downloads)
-
-### Requirements: software
-
-1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))
-
-  **Note:** Caffe *must* be built with support for Python layers!
-
-  ```make
-  # In your Makefile.config, make sure to have this line uncommented
-  WITH_PYTHON_LAYER := 1
-  ```
-
-  You can download my [Makefile.config](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/Makefile.config) for reference.
-2. Python packages you might not have: `cython`, `python-opencv`, `easydict`
-3. [optional] MATLAB (required for PASCAL VOC evaluation only)
-
-### Requirements: hardware
-
-1. For training smaller networks (CaffeNet, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
-2. For training with VGG16, you'll need a K40 (~11G of memory)
-
-### Installation (sufficient for the demo)
-
-1. Clone the Fast R-CNN repository
-  ```Shell
-  # Make sure to clone with --recursive
-  git clone --recursive https://github.com/rbgirshick/fast-rcnn.git
-  ```
-
-2. We'll call the directory that you cloned Fast R-CNN into `FRCN_ROOT`
-
-   *Ignore notes 1 and 2 if you followed step 1 above.*
-
-   **Note 1:** If you didn't clone Fast R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:
-    ```Shell
-    git submodule update --init --recursive
-    ```
-    **Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `fast-rcnn` branch (or equivalent detached state). This will happen automatically *if you follow these instructions*.
-
-3. Build the Cython modules
-    ```Shell
-    cd $FRCN_ROOT/lib
-    make
-    ```
-
-4. Build Caffe and pycaffe
-    ```Shell
-    cd $FRCN_ROOT/caffe-fast-rcnn
-    # Now follow the Caffe installation instructions here:
-    #   http://caffe.berkeleyvision.org/installation.html
-
-    # If you're experienced with Caffe and have all of the requirements installed
-    # and your Makefile.config in place, then simply do:
-    make -j8 && make pycaffe
-    ```
-
-5. Download pre-computed Fast R-CNN detectors
-    ```Shell
-    cd $FRCN_ROOT
-    ./data/scripts/fetch_fast_rcnn_models.sh
-    ```
-
-    This will populate the `$FRCN_ROOT/data` folder with `fast_rcnn_models`. See `data/README.md` for details.
-
-### Demo
-
-*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.
-
-**Python**
-
-To run the demo
-```Shell
-cd $FRCN_ROOT
-./tools/demo.py
-```
-The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007. The object proposals are pre-computed in order to reduce installation requirements.
-
-**Note:** If the demo crashes Caffe because your GPU doesn't have enough memory, try running the demo with a small network, e.g., `./tools/demo.py --net caffenet` or with `--net vgg_cnn_m_1024`. Or run in CPU mode `./tools/demo.py --cpu`. Type `./tools/demo.py -h` for usage.
-
-**MATLAB**
-
-There's also a *basic* MATLAB demo, though it's missing some minor bells and whistles compared to the Python version.
-```Shell
-cd $FRCN_ROOT/matlab
-matlab # wait for matlab to start...
-
-# At the matlab prompt, run the script:
->> fast_rcnn_demo
-```
-
-Fast R-CNN training is implemented in Python only, but test-time detection functionality also exists in MATLAB.
-See `matlab/fast_rcnn_demo.m` and `matlab/fast_rcnn_im_detect.m` for details.
-
-**Computing object proposals**
-
-The demo uses pre-computed selective search proposals computed with [this code](https://github.com/rbgirshick/rcnn/blob/master/selective_search/selective_search_boxes.m).
-If you'd like to compute proposals on your own images, there are many options.
-Here are some pointers; if you run into trouble using these resources please direct questions to the respective authors.
-
-1. Selective Search: [original matlab code](http://disi.unitn.it/~uijlings/MyHomepage/index.php#page=projects1), [python wrapper](https://github.com/sergeyk/selective_search_ijcv_with_python)
-2. EdgeBoxes: [matlab code](https://github.com/pdollar/edges)
-3. GOP and LPO: [python code](http://www.philkr.net/)
-4. MCG: [matlab code](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/mcg/)
-5. RIGOR: [matlab code](http://cpl.cc.gatech.edu/projects/RIGOR/)
-
-Apologies if I've left your method off this list. Feel free to contact me and ask for it to be included.
-
-### Beyond the demo: installation for training and testing models
-1. Download the training, validation, test data and VOCdevkit
-
-	```Shell
-	wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
-	wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtest_06-Nov-2007.tar
-	wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
-	```
-
-2. Extract all of these tars into one directory named `VOCdevkit`
-
-	```Shell
-	tar xvf VOCtrainval_06-Nov-2007.tar
-	tar xvf VOCtest_06-Nov-2007.tar
-	tar xvf VOCdevkit_08-Jun-2007.tar
-	```
-
-3. It should have this basic structure
-
-	```Shell
-  	$VOCdevkit/                           # development kit
-  	$VOCdevkit/VOCcode/                   # VOC utility code
-  	$VOCdevkit/VOC2007                    # image sets, annotations, etc.
-  	# ... and several other directories ...
-  	```
-
-4. Create symlinks for the PASCAL VOC dataset
-
-	```Shell
-    cd $FRCN_ROOT/data
-    ln -s $VOCdevkit VOCdevkit2007
-    ```
-    Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
-5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012
-6. Follow the next sections to download pre-computed object proposals and pre-trained ImageNet models
-
-### Download pre-computed Selective Search object proposals
-
-Pre-computed selective search boxes can also be downloaded for VOC2007 and VOC2012.
-
-```Shell
-cd $FRCN_ROOT
-./data/scripts/fetch_selective_search_data.sh
-```
-
-This will populate the `$FRCN_ROOT/data` folder with `selective_selective_data`.
-
-### Download pre-trained ImageNet models
-
-Pre-trained ImageNet models can be downloaded for the three networks described in the paper: CaffeNet (model **S**), VGG_CNN_M_1024 (model **M**), and VGG16 (model **L**).
-
-```Shell
-cd $FRCN_ROOT
-./data/scripts/fetch_imagenet_models.sh
-```
-These models are all available in the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but are provided here for your convenience.
-
-### Usage
-
-**Train** a Fast R-CNN detector. For example, train a VGG16 network on VOC 2007 trainval:
-
-```Shell
-./tools/train_net.py --gpu 0 --solver models/VGG16/solver.prototxt \
-	--weights data/imagenet_models/VGG16.v2.caffemodel
-```
-
-If you see this error
-
-```
-EnvironmentError: MATLAB command 'matlab' not found. Please add 'matlab' to your PATH.
-```
-
-then you need to make sure the `matlab` binary is in your `$PATH`. MATLAB is currently required for PASCAL VOC evaluation.
-
-**Test** a Fast R-CNN detector. For example, test the VGG 16 network on VOC 2007 test:
-
-```Shell
-./tools/test_net.py --gpu 1 --def models/VGG16/test.prototxt \
-	--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000.caffemodel
-```
-
-Test output is written underneath `$FRCN_ROOT/output`.
-
-**Compress** a Fast R-CNN model using truncated SVD on the fully-connected layers:
-
-```Shell
-./tools/compress_net.py --def models/VGG16/test.prototxt \
-	--def-svd models/VGG16/compressed/test.prototxt \
-    --net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000.caffemodel
-# Test the model you just compressed
-./tools/test_net.py --gpu 0 --def models/VGG16/compressed/test.prototxt \
-	--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000_svd_fc6_1024_fc7_256.caffemodel
-```
-
-### Experiment scripts
-Scripts to reproduce the experiments in the paper (*up to stochastic variation*) are provided in `$FRCN_ROOT/experiments/scripts`. Log files for experiments are located in `experiments/logs`.
-
-**Note:** Until recently (commit a566e39), the RNG seed for Caffe was not fixed during training. Now it's fixed, unless `train_net.py` is called with the `--rand` flag.
-Results generated before this commit will have some stochastic variation.
-
-### Extra downloads
-
-- [Experiment logs](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/fast_rcnn_experiments.tgz)
-- PASCAL VOC test set detections
-    - [voc_2007_test_results_fast_rcnn_caffenet_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_caffenet_trained_on_2007_trainval.tgz)
-    - [voc_2007_test_results_fast_rcnn_vgg16_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_vgg16_trained_on_2007_trainval.tgz)
-    - [voc_2007_test_results_fast_rcnn_vgg_cnn_m_1024_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_vgg_cnn_m_1024_trained_on_2007_trainval.tgz)
-    - [voc_2012_test_results_fast_rcnn_vgg16_trained_on_2007_trainvaltest_2012_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2012_test_results_fast_rcnn_vgg16_trained_on_2007_trainvaltest_2012_trainval.tgz)
-    - [voc_2012_test_results_fast_rcnn_vgg16_trained_on_2012_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2012_test_results_fast_rcnn_vgg16_trained_on_2012_trainval.tgz)
-- [Fast R-CNN VGG16 model](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc12_submission.tgz) trained on VOC07 train,val,test union with VOC12 train,val
+### Train and Test on Another Dataset
+- [Train](https://github.com/zeyuanxy/fast-rcnn/blob/master/help/train/README.md)
+- [Test](https://github.com/zeyuanxy/fast-rcnn/blob/master/help/test/README.md)
diff --git a/commands.txt b/commands.txt
@@ -0,0 +1,5 @@
+./tools/train_net.py --gpu 0 --solver models/VGG_CNN_M_1024/solver.prototxt \
+    --weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb inria_train
+
+./tools/test_net.py --gpu 1 --def models/VGG_CNN_M_1024/test.prototxt \
+    --net output/default/train/vgg_cnn_m_1024_fast_rcnn_iter_40000.caffemodel --imdb inria_test
diff --git a/help/INRIA/VOCcode/PASemptyobject.m b/help/INRIA/VOCcode/PASemptyobject.m
@@ -0,0 +1,11 @@
+function object=PASemptyobject
+  object.label='';
+  object.orglabel='';
+  object.bbox=[];
+  object.polygon=[];
+  object.mask='';
+  object.class='';
+  object.view='';
+  object.truncated=false;
+  object.difficult=false;
+return
diff --git a/help/INRIA/VOCcode/PASemptyrecord.m b/help/INRIA/VOCcode/PASemptyrecord.m
@@ -0,0 +1,6 @@
+function record=PASemptyrecord
+  record.imgname='';
+  record.imgsize=[];
+  record.database='';
+  record.objects=PASemptyobject;
+return
diff --git a/help/INRIA/VOCcode/PASerrmsg.m b/help/INRIA/VOCcode/PASerrmsg.m
@@ -0,0 +1,7 @@
+function PASerrmsg(PASerr,SYSerr)
+  fprintf('Pascal Error Message: %s\n',PASerr);
+  fprintf('System Error Message: %s\n',SYSerr);
+  k=input('Enter K for keyboard, any other key to continue or ^C to quit ...','s');
+  if strcmpi(k,'k'), keyboard; end;
+  fprintf('\n');
+return
diff --git a/help/INRIA/VOCcode/PASreadrecord.m b/help/INRIA/VOCcode/PASreadrecord.m
@@ -0,0 +1,99 @@
+function record=PASreadrecord(filename)
+  [fd,syserrmsg]=fopen(filename,'rt');
+  if (fd==-1),
+    PASmsg=sprintf('Could not open %s for reading',filename);
+    PASerrmsg(PASmsg,syserrmsg); 
+  end;
+
+  matchstrs=initstrings;
+  record=PASemptyrecord;
+  notEOF=1;
+  while (notEOF),
+    line=fgetl(fd);
+    notEOF=ischar(line);
+    if (notEOF),
+      matchnum=match(line,matchstrs);
+      switch matchnum,
+        case 1, [imgname]=strread(line,matchstrs(matchnum).str);
+                record.imgname=char(imgname);
+        case 2, [x,y,c]=strread(line,matchstrs(matchnum).str);
+                record.imgsize=[x y c];
+        case 3, [database]=strread(line,matchstrs(matchnum).str);
+                record.database=char(database);
+        case 4, [obj,lbl,xmin,ymin,xmax,ymax]=strread(line,matchstrs(matchnum).str);
+                record.objects(obj).label=char(lbl);
+                record.objects(obj).bbox=[min(xmin,xmax),min(ymin,ymax),max(xmin,xmax),max(ymin,ymax)];
+        case 5, tmp=findstr(line,' : ');
+                [obj,lbl]=strread(line(1:tmp),matchstrs(matchnum).str);
+                record.objects(obj).label=char(lbl);
+                record.objects(obj).polygon=sscanf(line(tmp+3:end),'(%d, %d) ')';
+        case 6, [obj,lbl,mask]=strread(line,matchstrs(matchnum).str);
+                record.objects(obj).label=char(lbl);
+                record.objects(obj).mask=char(mask);
+        case 7, [obj,lbl,orglbl]=strread(line,matchstrs(matchnum).str);
+                lbl=char(lbl);
+                record.objects(obj).label=lbl;
+                record.objects(obj).orglabel=char(orglbl);
+                if strcmp(lbl(max(end-8,1):end),'Difficult')
+                    record.objects(obj).difficult=true;
+                    lbl(end-8:end)=[];
+                else
+                    record.objects(obj).difficult=false;
+                end
+                if strcmp(lbl(max(end-4,1):end),'Trunc')
+                    record.objects(obj).truncated=true;
+                    lbl(end-4:end)=[];
+                else
+                    record.objects(obj).truncated=false;
+                end
+                t=find(lbl>='A'&lbl<='Z');
+                t=t(t>=4);
+                if ~isempty(t)
+                    record.objects(obj).view=lbl(t(1):end);
+                    lbl(t(1):end)=[];
+                else
+                    record.objects(obj).view='';
+                end
+                record.objects(obj).class=lbl(4:end);
+
+        otherwise, %fprintf('Skipping: %s\n',line);
+      end;
+    end;
+  end;
+  fclose(fd);
+return
+
+function matchnum=match(line,matchstrs)
+  for i=1:length(matchstrs),
+    matched(i)=strncmp(line,matchstrs(i).str,matchstrs(i).matchlen);
+  end;
+  matchnum=find(matched);
+  if isempty(matchnum), matchnum=0; end;
+  if (length(matchnum)~=1), 
+    PASerrmsg('Multiple matches while parsing','');
+  end;
+return
+
+function s=initstrings
+  s(1).matchlen=14;
+  s(1).str='Image filename : %q';
+
+  s(2).matchlen=10;
+  s(2).str='Image size (X x Y x C) : %d x %d x %d';
+
+  s(3).matchlen=8;
+  s(3).str='Database : %q';
+
+  s(4).matchlen=8;
+  s(4).str='Bounding box for object %d %q (Xmin, Ymin) - (Xmax, Ymax) : (%d, %d) - (%d, %d)';
+
+  s(5).matchlen=7;
+  s(5).str='Polygon for object %d %q (X, Y)';
+
+  s(6).matchlen=5;
+  s(6).str='Pixel mask for object %d %q : %q';
+
+  s(7).matchlen=8;
+  s(7).str='Original label for object %d %q : %q';
+
+return
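
Reviewer note: the training and testing tools in this repository are written in Python, so it may help to see how one case of `PASreadrecord.m` could be expressed there. The following is a minimal sketch, not part of this commit, that parses only the "Bounding box" line using a regular expression modelled on the `s(4).str` format string. The names `BBOX_RE` and `parse_bbox_line` are hypothetical, and the sketch assumes the label is always double-quoted (which is how `%q` fields usually appear in these annotation files).

```python
import re

# Hypothetical regex modelled on s(4).str in PASreadrecord.m.
# %d becomes a (possibly negative) integer group; %q becomes a double-quoted group.
BBOX_RE = re.compile(
    r'Bounding box for object (\d+) "([^"]*)" '
    r'\(Xmin, Ymin\) - \(Xmax, Ymax\) : '
    r'\((-?\d+), (-?\d+)\) - \((-?\d+), (-?\d+)\)'
)


def parse_bbox_line(line):
    """Return (object_index, label, [xmin, ymin, xmax, ymax]) or None if no match."""
    m = BBOX_RE.match(line)
    if m is None:
        return None
    obj = int(m.group(1))
    label = m.group(2)
    x1, y1, x2, y2 = (int(v) for v in m.group(3, 4, 5, 6))
    # Same normalisation as the MATLAB code: corners may be given in either order.
    bbox = [min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2)]
    return obj, label, bbox
```

As in the MATLAB version, the corner coordinates are reordered so that the first pair is always the minimum; lines that do not match the pattern are skipped by returning `None`.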