This repository contains the code for training and evaluating the models from our CVPR 2022 paper, Interactive Multi-Class Tiny-Object Detection. We propose a novel interactive annotation method for multiple instances of tiny objects from multiple classes, based on a few point-based user inputs. Our approach, C3Det, relates the full image context with annotator inputs in a local and global manner via late fusion and feature correlation, respectively.
Note: This work was developed as part of work for Lunit Inc.
This codebase is based on AerialDetection. The master branch works with PyTorch 1.1 or higher. If you would like to use PyTorch 0.4.1, please checkout the pytorch-0.4.1 branch.
This codebase provides:

- Tiny-DOTA dataset preparation
- A training-data synthesis and evaluation procedure
- Late fusion module (LF)
- Class-wise collated correlation module (C3); see the sketch after this list
- User-input enforcing loss (UEL)
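The core idea behind the C3 module is a class-wise correlation between the image feature map and features gathered from user-clicked locations. The function below is only a minimal sketch of that idea; the names, shapes, and cosine-similarity formulation are our illustrative assumptions, and the repository's actual implementation lives in `mmdet/models/C3/correlation.py`.

```python
import torch
import torch.nn.functional as F

def class_wise_correlation(feat, class_protos):
    """Illustrative sketch: correlate image features with per-class
    user-input prototypes.

    feat:         (B, C, H, W) backbone feature map
    class_protos: (B, K, C)    one C-dim prototype per class, e.g., pooled
                               from features at user-clicked locations
    Returns (B, K, H, W) per-class correlation maps that can be
    concatenated onto the image features.
    """
    feat_n = F.normalize(feat, dim=1)            # L2-normalize channel dim
    protos_n = F.normalize(class_protos, dim=2)  # L2-normalize each prototype
    # Cosine similarity between every spatial location and every class prototype
    return torch.einsum('bchw,bkc->bkhw', feat_n, protos_n)
```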
```bash
cd docker
bash build.sh ubuntu18.04-cuda9.2-cudnn7.6-python3.7-pt1.4.0
bash docker_run.sh
```
If you need to map a local folder into the Docker environment, use the `-v` option of `docker run` to request the mapping (e.g., `-v /system/dir:/in/docker/dir`).
```bash
docker exec -it [DOCKER CONTAINER ID] bash
bash install.sh
```
You can download the DOTA-v2.0 dataset here. You must first download the DOTA-v1.0 images, and then download the extra images and annotations for DOTA-v2.0.
```bash
python DOTA_devkit/split_dataset_DOTA_Tiny.py --datapath ${original_datapath}
```

- `${original_datapath}`: path to the original DOTA-v2.0 dataset downloaded in the previous step.
We only use the `train` and `val` splits of the original DOTA dataset, which include both images and labels. The original `test` split is not used because it lacks labels, which are required for training-time user-input synthesis as well as evaluation-time user-input sampling.
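To see why labels are needed, training-time user-input synthesis can be pictured as sampling a few ground-truth objects and clicking their centers. The function below is a rough, hypothetical illustration of that idea only, not the repository's actual synthesis code.

```python
import random

def synthesize_user_inputs(gt_boxes, gt_labels, max_points=20):
    """Hypothetical sketch: simulate point-based user inputs from labels.

    gt_boxes:  list of (x1, y1, x2, y2) ground-truth boxes
    gt_labels: list of class indices, same length as gt_boxes
    Returns up to max_points simulated clicks as (x, y, class) tuples.
    """
    n = random.randint(0, min(max_points, len(gt_boxes)))
    chosen = random.sample(range(len(gt_boxes)), n)
    clicks = []
    for i in chosen:
        x1, y1, x2, y2 = gt_boxes[i]
        clicks.append(((x1 + x2) / 2.0, (y1 + y2) / 2.0, gt_labels[i]))
    return clicks
```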
The script renames the original `train` and `val` folders to `train_old` and `val_old`.
It then divides the original dataset into new `train`, `val`, and `test` splits (70%, 10%, and 20% of the data, respectively).
For details, see Section 5.1 (Datasets) of the paper: https://arxiv.org/abs/2203.15266.
The dataset layout changes as follows:

| Original DOTA-v2.0 | Split DOTA-v2.0 | Tiny-DOTA |
| --- | --- | --- |
| train | train_old | train (70%) |
| val | val_old | val (10%) |
| | | test (20%) |
Note that the original `test` folder must still be present in `${original_datapath}`.
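Conceptually, the 70/10/20 split amounts to shuffling the image IDs with a fixed seed and slicing the list. The sketch below is illustrative only; `DOTA_devkit/split_dataset_DOTA_Tiny.py` is the authoritative implementation.

```python
import random

def split_70_10_20(image_ids, seed=0):
    """Illustrative 70/10/20 train/val/test split over image IDs."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for reproducibility
    n_train = int(0.7 * len(ids))
    n_val = int(0.1 * len(ids))
    return (ids[:n_train],                 # train (70%)
            ids[n_train:n_train + n_val],  # val   (10%)
            ids[n_train + n_val:])         # test  (20%)
```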
```bash
python DOTA_devkit/prepare_dota2.py --srcpath ${original_datapath} --dstpath ${patch_datapath}
```

- `${original_datapath}`: path to the split DOTA-Tiny dataset
- `${patch_datapath}`: output path for the split DOTA-Tiny dataset as 1k x 1k patches
```bash
python DOTA_devkit/parse_tiny_objects.py --datapath ${patch_datapath}
```

- `${patch_datapath}`: path to the split DOTA-Tiny dataset as 1k x 1k patches
This script parses the DOTA objects and generates the tiny-objects dataset (e.g., `ship`, `small-vehicle`, etc.). The output files are named `DOTA2_{train1024, val1024, test1024}_tiny.json`.
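As a rough picture of what this parsing step does, assuming the patch annotations are COCO-style JSON: keep only the categories and annotations that belong to tiny-object classes. The class set shown is an illustrative subset, and `DOTA_devkit/parse_tiny_objects.py` is the authoritative implementation.

```python
import json

# Illustrative subset of tiny-object class names; the real list is
# defined by DOTA_devkit/parse_tiny_objects.py.
TINY_CLASSES = {"ship", "small-vehicle"}

def filter_tiny_objects(src_json, dst_json):
    """Keep only categories/annotations of tiny-object classes,
    assuming a COCO-style annotation file."""
    with open(src_json) as f:
        data = json.load(f)
    keep = {c["id"] for c in data["categories"] if c["name"] in TINY_CLASSES}
    data["categories"] = [c for c in data["categories"] if c["id"] in keep]
    data["annotations"] = [a for a in data["annotations"]
                           if a["category_id"] in keep]
    with open(dst_json, "w") as f:
        json.dump(data, f)
```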
Configuration files for Tiny-DOTA are provided in the `configs/DOTA2_Tiny` folder.
```
# Configuration
configs/DOTA2_Tiny

# C3 module
mmdet/models/C3/correlation.py

# Faster R-CNN HBB and OBB models
mmdet/models/detectors/faster_rcnn.py
mmdet/models/detectors/faster_rcnn_obb.py

# RetinaNet HBB and OBB models
mmdet/models/detectors/retina.py
mmdet/models/detectors/retina_obb.py

# UEL (user-input enforcing loss)
mmdet/models/rbbox_heads/convfc_rbbox_head.py (SharedFCBBoxHeadRbboxUserinput)
mmdet/models/losses/uel.py
```
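For intuition, UEL can be thought of as an extra classification term on proposals matched to user clicks: a clicked location should be classified as the clicked class. The snippet below is a loose, hypothetical sketch of that idea only; the actual loss is implemented in `mmdet/models/losses/uel.py`.

```python
import torch.nn.functional as F

def user_input_enforcing_loss(matched_cls_logits, click_labels, loss_weight=1.0):
    """Hypothetical UEL sketch: cross-entropy on proposals matched to clicks.

    matched_cls_logits: (N, num_classes) logits of proposals containing a click
    click_labels:       (N,) class index of the corresponding user click
    """
    return loss_weight * F.cross_entropy(matched_cls_logits, click_labels)
```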
You can obtain a copy of the pre-trained weights for C3Det and baseline methods here.
Make sure the configuration file matches the model you are evaluating, in particular the `model_name` and `user_input_loss_enable` settings:
- C3Det with UEL (`model_name: FasterRCNNOBBC3Det` and `user_input_loss_enable: True`)

```bash
bash tools/dist_test_noc.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py checkpoints/Tiny_DOTA_C3Det/Tiny_DOTA_C3Det.pth 1 --out checkpoints/Tiny_DOTA_C3Det/results.pkl --eval bbox
```
- EarlyFusion (`model_name: FasterRCNNOBBEarlyFusion` and `user_input_loss_enable: False`)

```bash
bash tools/dist_test_noc.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py checkpoints/Tiny_DOTA_Early_Fusion/Tiny_DOTA_Early_Fusion.pth 1 --out checkpoints/Tiny_DOTA_Early_Fusion/results.pkl --eval bbox
```
- LateFusion (`model_name: FasterRCNNOBBLateFusion` and `user_input_loss_enable: False`)

```bash
bash tools/dist_test_noc.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py checkpoints/Tiny_DOTA_Late_Fusion/Tiny_DOTA_Late_Fusion.pth 1 --out checkpoints/Tiny_DOTA_Late_Fusion/results.pkl --eval bbox
```
Before training a model, update the relevant lines of the configuration file (e.g., the model name).
```bash
bash tools/dist_train.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py [NUM_OF_GPUS]
```

(e.g., `bash tools/dist_train.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py 8`)
```bash
bash tools/dist_test_noc.sh [CONFIGURATION_FILE_PATH] [CHECKPOINT_FILE_PATH] [NUM_OF_GPUS] --out [OUTPUT_PATH] --eval bbox
```

(e.g., `bash tools/dist_test_noc.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py work_dirs/faster_rcnn_obb_r50_fpn_1x_dota2_tiny_FasterRCNNOBBC3Det_CrossEntropyLoss_0.01_0.0001/best.pth 8 --out work_dirs/faster_rcnn_obb_r50_fpn_1x_dota2_tiny_FasterRCNNOBBC3Det_CrossEntropyLoss_0.01_0.0001/results.pkl --eval bbox`)

This simulates user inputs of up to 20 points per image to evaluate your model and draw the NoC curve.
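The NoC protocol can be summarized as: add one simulated click at a time, re-run inference, and record the score at each click count. The loop below is a hypothetical sketch; `infer`, `score`, and `next_click` are caller-supplied stand-ins for the repository's real inference, evaluation, and click-sampling routines.

```python
def simulate_noc(infer, score, next_click, max_points=20):
    """Hypothetical NoC loop: evaluate model quality as clicks accumulate.

    infer(points)     -> detections for the current set of user points
    score(detections) -> scalar metric (e.g., mAP) at this click count
    next_click(dets)  -> the next simulated (x, y, class) user click
    """
    points, curve = [], []
    for _ in range(max_points + 1):
        dets = infer(points)             # inference conditioned on clicks so far
        curve.append(score(dets))        # record metric vs. number of clicks
        points.append(next_click(dets))  # simulate the annotator's next click
    return curve
```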
```bash
bash tools/dist_test.sh [CONFIGURATION_FILE_PATH] [CHECKPOINT_FILE_PATH] [NUM_OF_GPUS] --out [OUTPUT_PATH] --eval bbox
```

(e.g., `bash tools/dist_test.sh configs/DOTA2_Tiny/faster_rcnn_obb_r50_fpn_1x_dota2_tiny.py work_dirs/faster_rcnn_obb_r50_fpn_1x_dota2_tiny_FasterRCNNOBBC3Det_CrossEntropyLoss_0.01_0.0001/best.pth 8 --out work_dirs/faster_rcnn_obb_r50_fpn_1x_dota2_tiny_FasterRCNNOBBC3Det_CrossEntropyLoss_0.01_0.0001/results.pkl --eval bbox`)

This simulates a random number of user inputs (from 0 to 20) and evaluates only once.
See RESULTS.md.
```bibtex
@InProceedings{lee2022interactive,
  title     = {Interactive Multi-Class Tiny-Object Detection},
  author    = {Lee, Chunggi and Park, Seonwook and Song, Heon and Ryu, Jeongun and Kim, Sanghoon and Kim, Haejoon and Pereira, S{\'e}rgio and Yoo, Donggeun},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2022}
}
```
This project is released under the Apache 2.0 license.