Our work is based on MMDetection, an open-source object detection toolbox built on PyTorch. It is part of the open-mmlab project developed by the Multimedia Laboratory, CUHK.
- PyTorch compatible GPU
- Python 3.7
- PyTorch >= 1.2.0
- libjpeg-turbo 2.0.3
- MMDetection
- jpeg2dct
- Please refer to INSTALL.md for installation and dataset preparation.
- Download the pretrained models and extract them to `work_dirs`. The folder structure should look like this:
work_dirs
├── mask_rcnn_r50_fpn_1x_dct_24_wofreeze
│   ├── 20191029_145538.log
│   └── latest.pth
└── mask_rcnn_r50_fpn_1x_dct_64_wofreeze
    ├── 20191029_151515.log
    └── latest.pth
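Before testing, it can help to confirm that the extracted checkpoints are where the commands expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_checkpoints` is hypothetical, and it only assumes the directory names shown in the folder structure above.

```python
from pathlib import Path

# Run directories taken from the folder structure above.
EXPECTED_RUNS = (
    "mask_rcnn_r50_fpn_1x_dct_24_wofreeze",
    "mask_rcnn_r50_fpn_1x_dct_64_wofreeze",
)


def missing_checkpoints(work_dirs="work_dirs", runs=EXPECTED_RUNS):
    """Return the expected latest.pth paths that do not exist on disk.

    Hypothetical helper for illustration; it is not shipped with the repo.
    """
    root = Path(work_dirs)
    expected = [root / run / "latest.pth" for run in runs]
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    if missing:
        for path in missing:
            print(f"missing checkpoint: {path}")
    else:
        print("all checkpoints found")
```

Run it from the repository root; any path it prints still needs to be downloaded or extracted.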
Run `tools/test.py` to test the Mask R-CNN models:
python tools/test.py configs/mask_rcnn_r50_rpn_1x_DCT_static_24_wofreeze.py work_dirs/mask_rcnn_r50_fpn_1x_dct_24_wofreeze/latest.pth --out results.pkl --eval bbox segm
python tools/test.py configs/mask_rcnn_r50_rpn_1x_DCT_static_64_wofreeze.py work_dirs/mask_rcnn_r50_fpn_1x_dct_64_wofreeze/latest.pth --out results.pkl --eval bbox segm
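Both commands above write to the same `results.pkl`, so the second run overwrites the output of the first. One possible workaround, sketched below, is to build each command with an output path inside its own run directory. The helper `build_test_command` is hypothetical; the config paths, checkpoint paths, and flags are copied from the commands above, and nothing else about `tools/test.py` is assumed.

```python
# (config, run directory) pairs copied from the commands above.
RUNS = [
    ("configs/mask_rcnn_r50_rpn_1x_DCT_static_24_wofreeze.py",
     "work_dirs/mask_rcnn_r50_fpn_1x_dct_24_wofreeze"),
    ("configs/mask_rcnn_r50_rpn_1x_DCT_static_64_wofreeze.py",
     "work_dirs/mask_rcnn_r50_fpn_1x_dct_64_wofreeze"),
]


def build_test_command(config, run_dir, eval_types=("bbox", "segm")):
    """Build the tools/test.py argument list, writing results inside run_dir.

    Hypothetical helper: it only rearranges the arguments shown above so
    that each run gets its own results.pkl.
    """
    return [
        "python", "tools/test.py", config, f"{run_dir}/latest.pth",
        "--out", f"{run_dir}/results.pkl",
        "--eval", *eval_types,
    ]


if __name__ == "__main__":
    for config, run_dir in RUNS:
        # Print only; pass the list to subprocess.run(cmd, check=True) to execute.
        print(" ".join(build_test_command(config, run_dir)))
```

The sketch prints the commands instead of running them, so they can be reviewed before launching a full evaluation.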
Backbone | #Channels | Size Per Channel | bbox AP | bbox AP@0.5 | bbox AP@0.75 | bbox AP<sub>S</sub> | bbox AP<sub>M</sub> | bbox AP<sub>L</sub> |
---|---|---|---|---|---|---|---|---|
ResNet-50-FPN (RGB) | 3 | 800x1333 | 37.3 | 59.0 | 40.2 | 21.9 | 40.9 | 48.1 |
DCT-24 (ours) | 24 | 200x334 | 37.7 | 59.2 | 40.9 | 21.7 | 41.4 | 49.1 |
DCT-64 (ours) | 64 | 200x334 | 38.1 | 59.6 | 41.1 | 22.5 | 41.6 | 49.7 |
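To put the "#Channels" and "Size Per Channel" columns in perspective, the total number of input values per image can be worked out directly from the table. The sketch below does only that arithmetic; the function names are hypothetical, and it says nothing about runtime or memory use, which depend on the rest of the network.

```python
# (channels, size, size) from the "#Channels" and "Size Per Channel" columns.
INPUTS = {
    "ResNet-50-FPN (RGB)": (3, 800, 1333),
    "DCT-24": (24, 200, 334),
    "DCT-64": (64, 200, 334),
}


def num_values(shape):
    """Total number of input values: channels times both spatial sizes."""
    channels, size_a, size_b = shape
    return channels * size_a * size_b


def relative_size(name, baseline="ResNet-50-FPN (RGB)"):
    """Input size of `name` as a fraction of the baseline input size."""
    return num_values(INPUTS[name]) / num_values(INPUTS[baseline])


if __name__ == "__main__":
    for name, shape in INPUTS.items():
        print(f"{name}: {num_values(shape):,} values, "
              f"{relative_size(name):.2f}x the RGB input")
```

By this count the RGB input holds 3,199,200 values, DCT-24 holds 1,603,200 (about half), and DCT-64 holds 4,275,200 (about 1.34 times the RGB input).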
Backbone | #Channels | Size Per Channel | mask AP | mask AP@0.5 | mask AP@0.75 | mask AP<sub>S</sub> | mask AP<sub>M</sub> | mask AP<sub>L</sub> |
---|---|---|---|---|---|---|---|---|
ResNet-50-FPN (RGB) | 3 | 800x1333 | 34.2 | 55.9 | 36.2 | 15.8 | 36.9 | 50.1 |
DCT-24 (ours) | 24 | 200x334 | 34.6 | 56.1 | 36.9 | 16.1 | 37.4 | 50.7 |
DCT-64 (ours) | 64 | 200x334 | 35.0 | 56.5 | 37.4 | 16.9 | 37.6 | 51.6 |
Run `tools/test.py` to test the Faster R-CNN models:
python tools/test.py configs/faster_rcnn_r50_fpn_1x_static_24_wofreeze.py work_dirs/faster_rcnn_r50_fpn_1x_dct_24_wofreeze/latest.pth --out results.pkl --eval bbox
python tools/test.py configs/faster_rcnn_r50_fpn_1x_static_64_wofreeze.py work_dirs/faster_rcnn_r50_fpn_1x_dct_64_wofreeze/latest.pth --out results.pkl --eval bbox
Backbone | #Channels | Size Per Channel | bbox AP | bbox AP@0.5 | bbox AP@0.75 | bbox AP<sub>S</sub> | bbox AP<sub>M</sub> | bbox AP<sub>L</sub> |
---|---|---|---|---|---|---|---|---|
ResNet-50-FPN (RGB) | 3 | 800x1333 | 36.4 | 58.4 | 39.1 | 21.5 | 40.0 | 46.6 |
DCT-24 (ours) | 24 | 200x334 | 37.2 | 58.8 | 39.9 | 21.9 | 40.7 | 48.9 |
DCT-64 (ours) | 64 | 200x334 | 37.2 | 58.5 | 40.6 | 21.9 | 40.9 | 48.3 |