Train and Test R-CNN on Another Dataset #21

Open
wants to merge 26 commits into
base: master
257 changes: 3 additions & 254 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,254 +1,3 @@
# *Fast* R-CNN: Fast Region-based Convolutional Networks for object detection

Created by Ross Girshick at Microsoft Research, Redmond.

### Introduction

**Fast R-CNN** is a fast framework for object detection with deep ConvNets. Fast R-CNN
- trains state-of-the-art models, like VGG16, 9x faster than traditional R-CNN and 3x faster than SPPnet,
- runs 200x faster than R-CNN and 10x faster than SPPnet at test-time,
- has a significantly higher mAP on PASCAL VOC than both R-CNN and SPPnet,
- and is written in Python and C++/Caffe.

Fast R-CNN was initially described in an [arXiv tech report](http://arxiv.org/abs/1504.08083).

### License

Fast R-CNN is released under the MIT License (refer to the LICENSE file for details).

### Citing Fast R-CNN

If you find Fast R-CNN useful in your research, please consider citing:

@article{girshick15fastrcnn,
Author = {Ross Girshick},
Title = {Fast R-CNN},
Journal = {arXiv preprint arXiv:1504.08083},
Year = {2015}
}

### Contents
1. [Requirements: software](#requirements-software)
2. [Requirements: hardware](#requirements-hardware)
3. [Basic installation](#installation-sufficient-for-the-demo)
4. [Demo](#demo)
5. [Beyond the demo: training and testing](#beyond-the-demo-installation-for-training-and-testing-models)
6. [Usage](#usage)
7. [Extra downloads](#extra-downloads)

### Requirements: software

1. Requirements for `Caffe` and `pycaffe` (see: [Caffe installation instructions](http://caffe.berkeleyvision.org/installation.html))

**Note:** Caffe *must* be built with support for Python layers!

```make
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
```

You can download my [Makefile.config](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/Makefile.config) for reference.
2. Python packages you might not have: `cython`, `python-opencv`, `easydict`
3. [optional] MATLAB (required for PASCAL VOC evaluation only)
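
Before building, it can help to confirm that the Python packages from item 2 are importable. The snippet below is a minimal sketch, not part of the repository: it assumes a Python 3 interpreter (for `importlib.util.find_spec`) and assumes the import names `Cython`, `cv2` (for `python-opencv`), and `easydict`. The helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names: cython -> Cython, python-opencv -> cv2.
    required = ["Cython", "cv2", "easydict"]
    missing = missing_modules(required)
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only means the modules can be located; it does not verify that Caffe itself was built with Python layer support.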

### Requirements: hardware

1. For training smaller networks (CaffeNet, VGG_CNN_M_1024), a good GPU (e.g., Titan, K20, K40, ...) with at least 3 GB of memory suffices
2. For training with VGG16, you'll need a K40 (~11 GB of memory)

### Installation (sufficient for the demo)

1. Clone the Fast R-CNN repository
```Shell
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/fast-rcnn.git
```

2. We'll call the directory that you cloned Fast R-CNN into `FRCN_ROOT`

*Ignore notes 1 and 2 if you followed step 1 above.*

**Note 1:** If you didn't clone Fast R-CNN with the `--recursive` flag, then you'll need to manually clone the `caffe-fast-rcnn` submodule:
```Shell
git submodule update --init --recursive
```
**Note 2:** The `caffe-fast-rcnn` submodule needs to be on the `fast-rcnn` branch (or equivalent detached state). This will happen automatically *if you follow these instructions*.

3. Build the Cython modules
```Shell
cd $FRCN_ROOT/lib
make
```

4. Build Caffe and pycaffe
```Shell
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
```

5. Download pre-computed Fast R-CNN detectors
```Shell
cd $FRCN_ROOT
./data/scripts/fetch_fast_rcnn_models.sh
```

This will populate the `$FRCN_ROOT/data` folder with `fast_rcnn_models`. See `data/README.md` for details.

### Demo

*After successfully completing [basic installation](#installation-sufficient-for-the-demo)*, you'll be ready to run the demo.

**Python**

To run the demo
```Shell
cd $FRCN_ROOT
./tools/demo.py
```
The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007. The object proposals are pre-computed in order to reduce installation requirements.

**Note:** If the demo crashes Caffe because your GPU doesn't have enough memory, try running the demo with a small network, e.g., `./tools/demo.py --net caffenet` or with `--net vgg_cnn_m_1024`. Or run in CPU mode `./tools/demo.py --cpu`. Type `./tools/demo.py -h` for usage.

**MATLAB**

There's also a *basic* MATLAB demo, though it's missing some minor bells and whistles compared to the Python version.
```Shell
cd $FRCN_ROOT/matlab
matlab # wait for matlab to start...

# At the matlab prompt, run the script:
>> fast_rcnn_demo
```

Fast R-CNN training is implemented in Python only, but test-time detection functionality also exists in MATLAB.
See `matlab/fast_rcnn_demo.m` and `matlab/fast_rcnn_im_detect.m` for details.

**Computing object proposals**

The demo uses selective search proposals that were pre-computed with [this code](https://github.com/rbgirshick/rcnn/blob/master/selective_search/selective_search_boxes.m).
If you'd like to compute proposals on your own images, there are many options.
Here are some pointers; if you run into trouble using these resources please direct questions to the respective authors.

1. Selective Search: [original matlab code](http://disi.unitn.it/~uijlings/MyHomepage/index.php#page=projects1), [python wrapper](https://github.com/sergeyk/selective_search_ijcv_with_python)
2. EdgeBoxes: [matlab code](https://github.com/pdollar/edges)
3. GOP and LPO: [python code](http://www.philkr.net/)
4. MCG: [matlab code](http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/mcg/)
5. RIGOR: [matlab code](http://cpl.cc.gatech.edu/projects/RIGOR/)

Apologies if I've left your method off this list. Feel free to contact me and ask for it to be included.

### Beyond the demo: installation for training and testing models
1. Download the training, validation, test data and VOCdevkit

```Shell
wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://pascallin.ecs.soton.ac.uk/challenges/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
```

2. Extract all of these tars into one directory named `VOCdevkit`

```Shell
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
```

3. It should have this basic structure

```Shell
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
```

4. Create symlinks for the PASCAL VOC dataset

```Shell
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
5. [Optional] follow similar steps to get PASCAL VOC 2010 and 2012
6. Follow the next sections to download pre-computed object proposals and pre-trained ImageNet models

### Download pre-computed Selective Search object proposals

Pre-computed selective search boxes can also be downloaded for VOC2007 and VOC2012.

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_selective_search_data.sh
```

This will populate the `$FRCN_ROOT/data` folder with `selective_search_data`.

### Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the three networks described in the paper: CaffeNet (model **S**), VGG_CNN_M_1024 (model **M**), and VGG16 (model **L**).

```Shell
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
```
These models are all available in the [Caffe Model Zoo](https://github.com/BVLC/caffe/wiki/Model-Zoo), but are provided here for your convenience.

### Usage

**Train** a Fast R-CNN detector. For example, train a VGG16 network on VOC 2007 trainval:

```Shell
./tools/train_net.py --gpu 0 --solver models/VGG16/solver.prototxt \
--weights data/imagenet_models/VGG16.v2.caffemodel
```

If you see this error

```
EnvironmentError: MATLAB command 'matlab' not found. Please add 'matlab' to your PATH.
```

then you need to make sure the `matlab` binary is in your `$PATH`. MATLAB is currently required for PASCAL VOC evaluation.

**Test** a Fast R-CNN detector. For example, test the VGG16 network on VOC 2007 test:

```Shell
./tools/test_net.py --gpu 1 --def models/VGG16/test.prototxt \
--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000.caffemodel
```

Test output is written underneath `$FRCN_ROOT/output`.

**Compress** a Fast R-CNN model using truncated SVD on the fully-connected layers:

```Shell
./tools/compress_net.py --def models/VGG16/test.prototxt \
--def-svd models/VGG16/compressed/test.prototxt \
--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000.caffemodel
# Test the model you just compressed
./tools/test_net.py --gpu 0 --def models/VGG16/compressed/test.prototxt \
--net output/default/voc_2007_trainval/vgg16_fast_rcnn_iter_40000_svd_fc6_1024_fc7_256.caffemodel
```
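
To get a feel for what the compression buys, the parameter counts can be worked out by hand. Factoring a `u x v` fully-connected weight matrix with a rank-`t` truncated SVD replaces `u * v` weights with `t * (u + v)`. The sketch below assumes that the `svd_fc6_1024_fc7_256` suffix in the filename above means rank 1024 for fc6 and rank 256 for fc7, and uses the standard VGG16 layer sizes (fc6: 25088 inputs to 4096 outputs, fc7: 4096 to 4096). Biases are ignored, and the function names are illustrative only.

```python
def dense_params(u, v):
    """Weights in a single u x v fully-connected layer (biases ignored)."""
    return u * v


def svd_params(u, v, t):
    """Weights after replacing the layer with two layers of rank t."""
    return t * (u + v)


if __name__ == "__main__":
    # (name, inputs, outputs, assumed rank taken from the model filename)
    layers = [("fc6", 25088, 4096, 1024), ("fc7", 4096, 4096, 256)]
    for name, u, v, t in layers:
        before = dense_params(u, v)
        after = svd_params(u, v, t)
        print(f"{name}: {before} -> {after} weights ({before / after:.1f}x fewer)")
```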

### Experiment scripts
Scripts to reproduce the experiments in the paper (*up to stochastic variation*) are provided in `$FRCN_ROOT/experiments/scripts`. Log files for experiments are located in `experiments/logs`.

**Note:** Until recently (commit a566e39), the RNG seed for Caffe was not fixed during training. Now it's fixed, unless `train_net.py` is called with the `--rand` flag.
Results generated before this commit will have some stochastic variation.

### Extra downloads

- [Experiment logs](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/fast_rcnn_experiments.tgz)
- PASCAL VOC test set detections
    - [voc_2007_test_results_fast_rcnn_caffenet_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_caffenet_trained_on_2007_trainval.tgz)
    - [voc_2007_test_results_fast_rcnn_vgg16_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_vgg16_trained_on_2007_trainval.tgz)
    - [voc_2007_test_results_fast_rcnn_vgg_cnn_m_1024_trained_on_2007_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2007_test_results_fast_rcnn_vgg_cnn_m_1024_trained_on_2007_trainval.tgz)
    - [voc_2012_test_results_fast_rcnn_vgg16_trained_on_2007_trainvaltest_2012_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2012_test_results_fast_rcnn_vgg16_trained_on_2007_trainvaltest_2012_trainval.tgz)
    - [voc_2012_test_results_fast_rcnn_vgg16_trained_on_2012_trainval.tgz](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc_2012_test_results_fast_rcnn_vgg16_trained_on_2012_trainval.tgz)
- [Fast R-CNN VGG16 model](http://www.cs.berkeley.edu/~rbg/fast-rcnn-data/voc12_submission.tgz) trained on VOC07 train,val,test union with VOC12 train,val
### Train and Test on Another Dataset
- [Train](https://github.com/zeyuanxy/fast-rcnn/blob/master/help/train/README.md)
- [Test](https://github.com/zeyuanxy/fast-rcnn/blob/master/help/test/README.md)
5 changes: 5 additions & 0 deletions commands.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
./tools/train_net.py --gpu 0 --solver models/VGG_CNN_M_1024/solver.prototxt \
--weights data/imagenet_models/VGG_CNN_M_1024.v2.caffemodel --imdb inria_train

./tools/test_net.py --gpu 1 --def models/VGG_CNN_M_1024/test.prototxt \
--net output/default/train/vgg_cnn_m_1024_fast_rcnn_iter_40000.caffemodel --imdb inria_test
11 changes: 11 additions & 0 deletions help/INRIA/VOCcode/PASemptyobject.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
function object=PASemptyobject
object.label='';
object.orglabel='';
object.bbox=[];
object.polygon=[];
object.mask='';
object.class='';
object.view='';
object.truncated=false;
object.difficult=false;
return
6 changes: 6 additions & 0 deletions help/INRIA/VOCcode/PASemptyrecord.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
function record=PASemptyrecord
record.imgname='';
record.imgsize=[];
record.database='';
record.objects=PASemptyobject;
return
7 changes: 7 additions & 0 deletions help/INRIA/VOCcode/PASerrmsg.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
function PASerrmsg(PASerr,SYSerr)
fprintf('Pascal Error Message: %s\n',PASerr);
fprintf('System Error Message: %s\n',SYSerr);
k=input('Enter K for keyboard, any other key to continue or ^C to quit ...','s');
% strcmpi gives a scalar, case-insensitive test and is false for empty input
if strcmpi(k,'k'), keyboard; end;
fprintf('\n');
return
99 changes: 99 additions & 0 deletions help/INRIA/VOCcode/PASreadrecord.m
Original file line number Diff line number Diff line change
@@ -0,0 +1,99 @@
function record=PASreadrecord(filename)
[fd,syserrmsg]=fopen(filename,'rt');
if (fd==-1),
  PASmsg=sprintf('Could not open %s for reading',filename);
  PASerrmsg(PASmsg,syserrmsg);
end;

matchstrs=initstrings;
record=PASemptyrecord;
notEOF=1;
while (notEOF),
  line=fgetl(fd);
  notEOF=ischar(line);
  if (notEOF),
    matchnum=match(line,matchstrs);
    switch matchnum,
      case 1, [imgname]=strread(line,matchstrs(matchnum).str);
              record.imgname=char(imgname);
      case 2, [x,y,c]=strread(line,matchstrs(matchnum).str);
              record.imgsize=[x y c];
      case 3, [database]=strread(line,matchstrs(matchnum).str);
              record.database=char(database);
      case 4, [obj,lbl,xmin,ymin,xmax,ymax]=strread(line,matchstrs(matchnum).str);
              record.objects(obj).label=char(lbl);
              record.objects(obj).bbox=[min(xmin,xmax),min(ymin,ymax),max(xmin,xmax),max(ymin,ymax)];
      case 5, tmp=findstr(line,' : ');
              [obj,lbl]=strread(line(1:tmp),matchstrs(matchnum).str);
              record.objects(obj).label=char(lbl);
              record.objects(obj).polygon=sscanf(line(tmp+3:end),'(%d, %d) ')';
      case 6, [obj,lbl,mask]=strread(line,matchstrs(matchnum).str);
              record.objects(obj).label=char(lbl);
              record.objects(obj).mask=char(mask);
      case 7, [obj,lbl,orglbl]=strread(line,matchstrs(matchnum).str);
              lbl=char(lbl);
              record.objects(obj).label=lbl;
              record.objects(obj).orglabel=char(orglbl);
              if strcmp(lbl(max(end-8,1):end),'Difficult')
                record.objects(obj).difficult=true;
                lbl(end-8:end)=[];
              else
                record.objects(obj).difficult=false;
              end
              if strcmp(lbl(max(end-4,1):end),'Trunc')
                record.objects(obj).truncated=true;
                lbl(end-4:end)=[];
              else
                record.objects(obj).truncated=false;
              end
              t=find(lbl>='A'&lbl<='Z');
              t=t(t>=4);
              if ~isempty(t)
                record.objects(obj).view=lbl(t(1):end);
                lbl(t(1):end)=[];
              else
                record.objects(obj).view='';
              end
              record.objects(obj).class=lbl(4:end);

      otherwise, %fprintf('Skipping: %s\n',line);
    end;
  end;
end;
fclose(fd);
return

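function matchnum=match(line,matchstrs)
% Return the index of the single pattern whose prefix matches the line, or 0 if none does.
matched=false(1,length(matchstrs));  % preallocate so the result is defined even for an empty pattern list
for i=1:length(matchstrs),
  matched(i)=strncmp(line,matchstrs(i).str,matchstrs(i).matchlen);
end;
matchnum=find(matched);
if isempty(matchnum), matchnum=0; end;
if (length(matchnum)~=1),
  PASerrmsg('Multiple matches while parsing','');
end;
return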

function s=initstrings
s(1).matchlen=14;
s(1).str='Image filename : %q';

s(2).matchlen=10;
s(2).str='Image size (X x Y x C) : %d x %d x %d';

s(3).matchlen=8;
s(3).str='Database : %q';

s(4).matchlen=8;
s(4).str='Bounding box for object %d %q (Xmin, Ymin) - (Xmax, Ymax) : (%d, %d) - (%d, %d)';

s(5).matchlen=7;
s(5).str='Polygon for object %d %q (X, Y)';

s(6).matchlen=5;
s(6).str='Pixel mask for object %d %q : %q';

s(7).matchlen=8;
s(7).str='Original label for object %d %q : %q';

return
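
Since the training code in this repository is Python, the bounding-box case of the MATLAB reader above can be mirrored in a few lines. This is a rough sketch, not part of the PR: it handles only the "Bounding box for object" line, takes the line format from the `s(4).str` pattern in `initstrings`, and assumes the label is wrapped in double quotes. The names `BBOX_RE` and `parse_bbox_line` are hypothetical.

```python
import re

# Mirrors s(4).str: 'Bounding box for object %d %q (Xmin, Ymin) - (Xmax, Ymax) : (%d, %d) - (%d, %d)'
BBOX_RE = re.compile(
    r'Bounding box for object (\d+) "([^"]*)" '
    r'\(Xmin, Ymin\) - \(Xmax, Ymax\) : '
    r'\((\d+), (\d+)\) - \((\d+), (\d+)\)'
)


def parse_bbox_line(line):
    """Return (object index, label, (xmin, ymin, xmax, ymax)), or None if the line does not match."""
    m = BBOX_RE.match(line.strip())
    if m is None:
        return None
    obj = int(m.group(1))
    label = m.group(2)
    x1, y1, x2, y2 = (int(g) for g in m.groups()[2:])
    # Same normalisation as the MATLAB code: order the corners with min/max.
    return obj, label, (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


if __name__ == "__main__":
    sample = ('Bounding box for object 1 "PASperson" '
              '(Xmin, Ymin) - (Xmax, Ymax) : (194, 127) - (413, 647)')
    print(parse_bbox_line(sample))
```

The sample line is made up for illustration; real annotation files should be checked against the pattern before relying on it.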