import from upstream
ijkguo committed Jul 13, 2018
1 parent a02850b commit e005b38
Showing 23 changed files with 82 additions and 1,806 deletions.
74 changes: 24 additions & 50 deletions README.md
@@ -2,72 +2,46 @@

![example detections](https://cloud.githubusercontent.com/assets/13162287/22101032/92085dc0-de6c-11e6-9228-67e72606ddbc.png)

-Region Proposal Network solves object detection as a regression problem
-from the objectness perspective. Bounding boxes are predicted by applying
-learned bounding box deltas to base boxes, namely anchor boxes across
-different positions in feature maps. The training process directly learns a
-mapping from raw image intensities to bounding box transformation targets.
+### Set up environment
+* Requires the latest MXNet. Set the environment variable with `export MXNET_CUDNN_AUTOTUNE_DEFAULT=0`.
+* Install Python package `mxnet` (CPU inference only) or `mxnet-cu90` (GPU training), `cython` then `opencv-python matplotlib pycocotools tqdm`.

-Fast R-CNN treats general object detection as a classification problem and
-bounding box prediction as a regression problem. Classifying cropped region
-feature maps and predicting bounding box displacements together yields
-detection results. Cropping feature maps instead of image input accelerates
-computation by sharing convolution maps. Bounding box displacements
-are simultaneously learned in the training process.
+### Out-of-box inference models
+Download any of the following models to the current directory and run `python3 demo.py --dataset $Dataset$ --network $Network$ --params $MODEL_FILE$ --image $YOUR_IMAGE$` to run inference on a single image.
+For example, `python3 demo.py --dataset voc --network vgg16 --params vgg16_voc0712.params --image myimage.jpg`; optionally add `--gpu 0` to use a GPU.
+Each network has its own configuration and each dataset has its own object class names, so you must pass them explicitly as command-line arguments.

-Faster R-CNN utilizes an alternating optimization training process between RPN
-and Fast R-CNN. Fast R-CNN weights are used to initialize RPN for training.
-The approximate joint training scheme does not backpropagate rcnn training
-error to rpn training.
+| Network | Dataset | Imageset | Reference | Result | Link |
+| :------ | :------------ | :----------- | :-------: | :----: | :---: |
+| vgg16 | voc | 07/07 | 69.9 | 70.23 | [Dropbox](https://www.dropbox.com/s/gfxnf1qzzc0lzw2/vgg_voc07-0010.params?dl=0) |
+| vgg16 | voc | 07++12/07 | 73.2 | 75.97 | [Dropbox](https://www.dropbox.com/s/rvktx65s48cuyb9/vgg_voc0712-0010.params?dl=0) |
+| resnet101 | voc | 07++12/07 | 76.4 | 79.35 | [Dropbox](https://www.dropbox.com/s/ge2wl0tn47xezdf/resnet_voc0712-0010.params?dl=0) |
+| vgg16 | coco | train2017/val2017 | 21.2 | 22.8 | [Dropbox](https://www.dropbox.com/s/e0ivvrc4pku3vj7/vgg_coco-0010.params?dl=0) |
+| resnet101 | coco | train2017/val2017 | 27.2 | 26.1 | [Dropbox](https://www.dropbox.com/s/bfuy2uo1q1nwqjr/resnet_coco-0010.params?dl=0) |

-## Experiments
-
-| Indicator | py-faster-rcnn (caffe resp.) | mx-rcnn (this reproduction) |
-| :-------- | :--------------------------- | :-------------------------- |
-| Training speed [1] | 2.5 img/s training, 5 img/s testing | 3.8 img/s in training, 12.5 img/s testing |
-| Valset performance [2] | mAP 73.2 | mAP 75.97 |
-| Memory usage [3] | 11G for Fast R-CNN | 4.6G for Fast R-CNN |
-| Parallelization [4] | None | 3.8 img/s to 6 img/s for 2 GPUs |
-| Extensibility [5] | Old framework and base networks | ResNet |
-
-[1] On Ubuntu 14.04.5 with device Titan X, cuDNN enabled.
-The experiment is VGG-16 end-to-end training.
-[2] VGG network. Trained end-to-end on VOC07trainval+12trainval, tested on VOC07 test.
-[3] VGG network. Fast R-CNN is the most memory expensive process.
-[4] VGG network (parallelization limited by bandwidth).
-ResNet-101 speeds up from 2 img/s to 3.5 img/s.
-[5] py-faster-rcnn does not support ResNet or recent caffe version.
-
-| Method | Network | Training Data | Testing Data | Reference | Result | Link |
-| :----- | :------ | :------------ | :----------- | :-------: | :----: | :---: |
-| Faster R-CNN end-to-end | VGG16 | VOC07 | VOC07test | 69.9 | 70.23 | [Dropbox](https://www.dropbox.com/s/gfxnf1qzzc0lzw2/vgg_voc07-0010.params?dl=0) |
-| Faster R-CNN end-to-end | VGG16 | VOC07+12 | VOC07test | 73.2 | 75.97 | [Dropbox](https://www.dropbox.com/s/rvktx65s48cuyb9/vgg_voc0712-0010.params?dl=0) |
-| Faster R-CNN end-to-end | ResNet-101 | VOC07+12 | VOC07test | 76.4 | 79.35 | [Dropbox](https://www.dropbox.com/s/ge2wl0tn47xezdf/resnet_voc0712-0010.params?dl=0) |
-| Faster R-CNN end-to-end | VGG16 | COCO train | COCO val | 21.2 | 22.8 | [Dropbox](https://www.dropbox.com/s/e0ivvrc4pku3vj7/vgg_coco-0010.params?dl=0) |
-| Faster R-CNN end-to-end | ResNet-101 | COCO train | COCO val | 27.2 | 26.1 | [Dropbox](https://www.dropbox.com/s/bfuy2uo1q1nwqjr/resnet_coco-0010.params?dl=0) |
-
-The above experiments were conducted on [this version of repository](https://github.com/precedenceguo/mx-rcnn/tree/6a1ab0eec5035a10a1efb5fc8c9d6c54e101b4d0)
-with [a modified MXNet based on 0.9.1 nnvm pre-release](https://github.com/precedenceguo/mxnet/tree/simple).
-
-## Set up environment
-* Install Python package `mxnet` or `mxnet-cu90`, `cython` and `opencv-python matplotlib pycocotools tqdm`.
-
-## Download data and label
-Follow `py-faster-rcnn` for data preparation instructions.
+### Download data and label
+Make a directory `data` and follow `py-faster-rcnn` for data preparation instructions.
* [Pascal VOC](http://host.robots.ox.ac.uk/pascal/VOC/) should be in `data/VOCdevkit` containing `VOC2007`, `VOC2012` and `annotations`.
* [MSCOCO](http://mscoco.org/dataset/) should be in `data/coco` containing `train2017`, `val2017` and `annotations/instances_train2017.json`, `annotations/instances_val2017.json`.

-## Download pretrained ImageNet models
+### Download pretrained ImageNet models
* [VGG16](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) should be at `model/vgg16-0000.params` from [MXNet model zoo](http://data.dmlc.ml/models/imagenet/vgg/).
* [ResNet](https://github.com/tornadomeet/ResNet) should be at `model/resnet-101-0000.params` from [MXNet model zoo](http://data.dmlc.ml/models/imagenet/resnet/).

+### Training and evaluation
+Use `python3 train.py --dataset $Dataset$ --network $Network$ --pretrained $IMAGENET_MODEL_FILE$ --gpus $GPUS$` to train,
+for example, `python3 train.py --dataset voc --network vgg16 --pretrained model/vgg16-0000.params --gpus 0,1`.
+Use `python3 test.py --dataset $Dataset$ --network $Network$ --params $MODEL_FILE$ --gpu $GPU$` to evaluate,
+for example, `python3 test.py --dataset voc --network vgg16 --params model/vgg16-0010.params --gpu 0`.
+
### History
* May 25, 2016: We released Fast R-CNN implementation.
* July 6, 2016: We released Faster R-CNN implementation.
* July 23, 2016: We updated to MXNet module solver.
* Oct 10, 2016: tornadomeet released approximate end-to-end training.
* Oct 30, 2016: We updated to MXNet module inference.
* Jan 19, 2017: We accelerated our pipeline and supported ResNet training.
+* Jun 22, 2018: We simplified code.

### Disclaimer
This repository used code from [MXNet](https://github.com/dmlc/mxnet),
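The README text removed above describes boxes as "learned bounding box deltas" applied to anchor boxes. As a rough illustration, the sketch below applies one `(dx, dy, dw, dh)` tuple to one anchor using the usual R-CNN parameterization (center shift scaled by anchor size, log-space width and height). The function name `apply_deltas` and the continuous-coordinate convention (`width = x2 - x1`, with no `+ 1` pixel offset) are assumptions made for this example; this is not the repository's own `bbox_pred` code.

```python
import math


def apply_deltas(anchor, deltas):
    """Apply (dx, dy, dw, dh) to an (x1, y1, x2, y2) anchor box.

    Hypothetical helper for illustration. Assumes continuous coordinates,
    i.e. width = x2 - x1 with no +1 pixel offset.
    """
    x1, y1, x2, y2 = anchor
    dx, dy, dw, dh = deltas

    # Anchor size and center.
    w = x2 - x1
    h = y2 - y1
    cx = x1 + 0.5 * w
    cy = y1 + 0.5 * h

    # Shift the center proportionally to the anchor size,
    # and scale width/height in log space.
    pred_cx = cx + dx * w
    pred_cy = cy + dy * h
    pred_w = w * math.exp(dw)
    pred_h = h * math.exp(dh)

    # Convert back to corner coordinates.
    return (
        pred_cx - 0.5 * pred_w,
        pred_cy - 0.5 * pred_h,
        pred_cx + 0.5 * pred_w,
        pred_cy + 0.5 * pred_h,
    )
```

With all-zero deltas the anchor comes back unchanged; a positive `dx` moves the box right by that fraction of the anchor width, and `dw = log(2)` doubles the width around the same center.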
33 changes: 24 additions & 9 deletions sym_demo.py → demo.py
@@ -3,11 +3,12 @@
 import pprint

 import mxnet as mx
+from mxnet.module import Module

 from symdata.bbox import im_detect
 from symdata.loader import load_test, generate_batch
 from symdata.vis import vis_detection
-from symnet.model import get_net
+from symnet.model import load_param, check_shape


 def demo_net(sym, class_names, args):
@@ -27,14 +28,29 @@ def demo_net(sym, class_names, args):
     # generate data batch
     data_batch = generate_batch(im_tensor, im_info)

-    # assemble executor
-    predictor = get_net(sym, args.params, ctx, short=args.img_short_side, max_size=args.img_long_side)
+    # load params
+    arg_params, aux_params = load_param(args.params, ctx=ctx)
+
+    # produce shape max possible
+    data_names = ['data', 'im_info']
+    label_names = None
+    data_shapes = [('data', (1, 3, args.img_long_side, args.img_long_side)), ('im_info', (1, 3))]
+    label_shapes = None
+
+    # check shapes
+    check_shape(sym, data_shapes, arg_params, aux_params)
+
+    # create and bind module
+    mod = Module(sym, data_names, label_names, context=ctx)
+    mod.bind(data_shapes, label_shapes, for_training=False)
+    mod.init_params(arg_params=arg_params, aux_params=aux_params)

     # forward
-    output = predictor.predict(data_batch)
-    rois = output['rois_output'][:, 1:]
-    scores = output['cls_prob_reshape_output'][0]
-    bbox_deltas = output['bbox_pred_reshape_output'][0]
+    mod.forward(data_batch)
+    rois, scores, bbox_deltas = mod.get_outputs()
+    rois = rois[:, 1:]
+    scores = scores[0]
+    bbox_deltas = bbox_deltas[0]
     im_info = im_info[0]

     # decode detection
@@ -55,7 +71,7 @@ def demo_net(sym, class_names, args):
 def parse_args():
     parser = argparse.ArgumentParser(description='Demonstrate a Faster R-CNN network',
                                      formatter_class=argparse.ArgumentDefaultsHelpFormatter)
-    parser.add_argument('--network', type=str, default='resnet50', help='base network')
+    parser.add_argument('--network', type=str, default='vgg16', help='base network')
     parser.add_argument('--params', type=str, default='', help='path to trained model')
     parser.add_argument('--dataset', type=str, default='voc', help='training dataset')
     parser.add_argument('--image', type=str, default='', help='path to test image')
@@ -99,7 +115,6 @@ def get_voc_names(args):

 def get_coco_names(args):
     from symimdb.coco import coco
-    args.rpn_anchor_scales = (2, 4, 8, 16, 32)
     args.rcnn_num_classes = len(coco.classes)
     return coco.classes

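The new `demo.py` binds the module with a square `(1, 3, img_long_side, img_long_side)` data shape, described in the diff as the "shape max possible". A plausible reading is that this bound is safe whenever the loader resizes images so that neither side exceeds `img_long_side`. The sketch below illustrates that reasoning with the short-side/long-side rule commonly used in Faster R-CNN pipelines. The helper names and the 600/1000 sizes are assumptions for this example and are not taken from `symdata.loader`.

```python
def resize_scale(height, width, short_side, long_side):
    """Scale factor under the assumed short-side/long-side rule.

    The short side is scaled to `short_side` unless that would push the
    long side past `long_side`, in which case the long side is capped.
    """
    scale = short_side / min(height, width)
    if scale * max(height, width) > long_side:
        scale = long_side / max(height, width)
    return scale


def resized_shape(height, width, short_side, long_side):
    """Image (height, width) after applying the scale above."""
    scale = resize_scale(height, width, short_side, long_side)
    return round(height * scale), round(width * scale)


def fits_bound_shape(height, width, short_side, long_side):
    """True if the resized image fits a (long_side, long_side) canvas."""
    new_h, new_w = resized_shape(height, width, short_side, long_side)
    return new_h <= long_side and new_w <= long_side
```

For instance, a 375x500 image with the assumed sizes `short_side=600` and `long_side=1000` is scaled by 1.6 to 600x800, while a 200x1000 panorama keeps scale 1.0 because the long side is already at the cap. Both fit inside a 1000x1000 bound, provided `short_side <= long_side`.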
150 changes: 0 additions & 150 deletions gluon_demo.py

This file was deleted.
