[TIP2024] CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation

CalibNet

Official Implementation of "CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation"

Jialun Pei, Tao Jiang, He Tang, Nian Liu, Yueming Jin, Deng-Ping Fan✉, and Pheng-Ann Heng

👀 [Paper]; [Chinese Version]; [Official Version]

Contact: dengpfan@gmail.com, peijialun@gmail.com

🔧 Environment Preparation

Requirements

  • Linux with Python ≥ 3.8
  • PyTorch ≥ 1.9 and a torchvision version that matches the PyTorch installation.
  • Detectron2: follow Detectron2 installation instructions.
  • OpenCV is optional but needed by demo and visualization.
  • pip install -r requirements.txt
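The version requirements above can be checked before installing anything else. The following is a minimal sketch, not part of the repository; the helper names `parse_version` and `meets_minimum` are hypothetical, and the PyTorch check only runs if `torch` is already importable.

```python
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.9.1+cu111' into (1, 9, 1).

    The local build suffix after '+' (for example 'cu111') is ignored.
    """
    numbers = re.findall(r"\d+", text.split("+")[0])
    return tuple(int(n) for n in numbers)


def meets_minimum(installed, minimum):
    """Return True if the installed version string is at least `minimum`."""
    return parse_version(installed) >= tuple(minimum)


if __name__ == "__main__":
    print("Python >= 3.8:", sys.version_info[:2] >= (3, 8))
    try:
        import torch  # only needed for this check

        print("PyTorch >= 1.9:", meets_minimum(torch.__version__, (1, 9)))
    except ImportError:
        print("PyTorch is not installed yet")
```

Comparing integer tuples rather than raw strings avoids ranking "1.10" below "1.9".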

CUDA Kernel for MSDeformAttn

After preparing the required environment, run the following commands to compile the CUDA kernel for MSDeformAttn. CUDA_HOME must be defined and point to the directory of the installed CUDA toolkit.

cd calibnet/trans_encoder/ops
sh make.sh
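Since the build depends on CUDA_HOME, it can help to verify the variable before running `make.sh`. The sketch below is an illustration only: `check_cuda_home` is a hypothetical helper, and it assumes a Linux toolkit layout in which the `nvcc` compiler sits under `bin/`.

```python
import os
from pathlib import Path


def check_cuda_home(environ=None):
    """Return (ok, message) describing whether CUDA_HOME looks usable."""
    environ = os.environ if environ is None else environ
    cuda_home = environ.get("CUDA_HOME")
    if not cuda_home:
        return False, "CUDA_HOME is not set"
    root = Path(cuda_home)
    if not root.is_dir():
        return False, f"CUDA_HOME points to a missing directory: {root}"
    # Assumption: a toolkit install ships the nvcc compiler under bin/.
    if not (root / "bin" / "nvcc").exists():
        return False, f"no nvcc found under {root / 'bin'}"
    return True, f"CUDA toolkit found at {root}"


if __name__ == "__main__":
    ok, message = check_cuda_home()
    print(("OK: " if ok else "Problem: ") + message)
```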

Conda Environment Setup

Our project is built upon detectron2. To accommodate the RGB-D SIS task, we have made some modifications to the framework so that it handles dual-modality inputs. Replace the c2_model_loading.py in your detectron2 installation with the one we provide at calibnet/c2_model_loading.py.

The following example sets up the environment. The commands were verified with CUDA 11.1, PyTorch 1.9.1, and detectron2 0.6.0.

# an example
# create virtual environment
conda create -n calibnet python=3.8 -y
conda activate calibnet

# install pytorch
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html

# install detectron2, use the pre-built detectron2
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.9/index.html

# install required packages
pip install -r requirements.txt

# build CUDA kernel for MSDeformAttn
cd calibnet/trans_encoder/ops
sh make.sh

# replace the model loading code in detectron2. You should specify your own detectron2 path.
cp -i calibnet/c2_model_loading.py detectron2/checkpoint/c2_model_loading.py
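When detectron2 is installed from a pre-built wheel, it lives inside the environment's site-packages, so the destination of the copy is not a local `detectron2/` folder. The sketch below shows one way to find it using only the standard library; `find_package_dir` and `loader_target` are hypothetical helper names, not part of this repository or of detectron2.

```python
import importlib.util
from pathlib import Path


def find_package_dir(name="detectron2"):
    """Return the installed package directory, or None if it is not found.

    find_spec locates a top-level package without importing it.
    """
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    return Path(list(spec.submodule_search_locations)[0])


def loader_target(package_dir):
    """Path of the file that the cp command above overwrites."""
    return Path(package_dir) / "checkpoint" / "c2_model_loading.py"


if __name__ == "__main__":
    package_dir = find_package_dir()
    if package_dir is None:
        print("detectron2 is not installed in this environment")
    else:
        print("copy calibnet/c2_model_loading.py to", loader_target(package_dir))
```

Backing up the original file before overwriting it makes the change easy to revert.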

📈 Dataset Preparation

Download and Unzip Datasets and Annotation Files

Register Datasets

  1. Download the datasets and put them in the same folder. To match the folder names used in the dataset mappers, keep the folder names unchanged. The expected structure is:
    DATASET_ROOT/
    ├── COME15K
       ├── train
          ├── imgs_right
          ├── depths
          ├── ...
       ├── COME-E
          ├── RGB
          ├── depth
          ├── ...
       ├── COME-H
          ├── RGB
          ├── depth
          ├── ...
       ├── annotations
       ├── ...
    ├── DSIS
       ├── RGB
       ├── depth
       ├── DSIS.json
       ├── ...
    ├── SIP
       ├── RGB
       ├── depth
       ├── SIP.json
       ├── ...
  2. Change the dataset root in calibnet/register_rgbdsis_datasets.py:
# calibnet/register_rgbdsis_datasets.py   line 28
_root = os.getenv("DETECTRON2_DATASETS", "path/to/dataset/root")
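Because the root is read with `os.getenv`, exporting `DETECTRON2_DATASETS` overrides the hard-coded default without editing the file. The following sketch resolves the root the same way and reports which entries of the folder tree above are missing. It is an illustration only: `resolve_root` and `missing_entries` are hypothetical helpers, and the list covers just the entries named in the tree (the `...` entries are not checked).

```python
import os
from pathlib import Path

# Entries copied from the folder tree above; "..." entries are skipped.
EXPECTED = [
    "COME15K/train/imgs_right",
    "COME15K/train/depths",
    "COME15K/COME-E/RGB",
    "COME15K/COME-E/depth",
    "COME15K/COME-H/RGB",
    "COME15K/COME-H/depth",
    "COME15K/annotations",
    "DSIS/RGB",
    "DSIS/depth",
    "DSIS/DSIS.json",
    "SIP/RGB",
    "SIP/depth",
    "SIP/SIP.json",
]


def resolve_root(default="path/to/dataset/root", environ=None):
    """Mirror os.getenv("DETECTRON2_DATASETS", default) and return a Path."""
    environ = os.environ if environ is None else environ
    return Path(environ.get("DETECTRON2_DATASETS", default))


def missing_entries(root):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


if __name__ == "__main__":
    root = resolve_root()
    missing = missing_entries(root)
    print(f"dataset root: {root}")
    print("layout looks complete" if not missing else f"missing: {missing}")
```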

🚀 Pre-trained Models

Model weights: Google Drive

| Model | Config | COME15K-E-test AP | COME15K-H-test AP |
|---|---|---|---|
| ResNet-50 | config | 58.0 | 50.7 |
| ResNet-101 | config | 58.5 | 51.5 |
| Swin-T | config | 60.0 | 52.6 |
| PVT-v2 | config | 60.7 | 53.7 |
| P2T-Large | config | 61.8 | 54.4 |

⚙️ Usage

Train

To train CalibNet on a single GPU, specify the config file <CONFIG>.

python tools/train_net.py --config-file <CONFIG> --num-gpus 1
# example:
python tools/train_net.py --config-file configs/CalibNet_R50_50e_50q_320size.yaml --num-gpus 1 OUTPUT_DIR output/train

Evaluation

Before evaluating, specify the config file <CONFIG> and the model weights <WEIGHT_PATH>. The input size is set to 320 by default.

python tools/train_net.py --config-file <CONFIG> --num-gpus 1 --eval-only MODEL.WEIGHTS <WEIGHT_PATH>
# example:
python tools/train_net.py --config-file configs/CalibNet_R50_50e_50q_320size.yaml --num-gpus 1 --eval-only MODEL.WEIGHTS weights/calibnet_r50_50e.pth OUTPUT_DIR output/eval
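In the training and evaluation commands, the trailing `KEY VALUE` pairs such as `MODEL.WEIGHTS <WEIGHT_PATH>` and `OUTPUT_DIR output/eval` are config overrides. The sketch below mimics how such a command line can be split into flags and overrides, assuming `tools/train_net.py` follows the usual detectron2 convention of collecting everything after the known flags into an `opts` list. It uses only `argparse`; `build_parser` and `overrides_to_dict` are hypothetical names and this is not the repository's actual parser.

```python
import argparse


def build_parser():
    """A reduced parser with the flags used in the commands above."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config-file", default="", metavar="FILE")
    parser.add_argument("--num-gpus", type=int, default=1)
    parser.add_argument("--eval-only", action="store_true")
    # Everything after the known flags is kept as raw KEY VALUE overrides.
    parser.add_argument("opts", nargs=argparse.REMAINDER, default=None)
    return parser


def overrides_to_dict(opts):
    """Pair up ['KEY', 'VALUE', ...] into {'KEY': 'VALUE', ...}."""
    opts = opts or []
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    return dict(zip(opts[0::2], opts[1::2]))


if __name__ == "__main__":
    argv = [
        "--config-file", "configs/CalibNet_R50_50e_50q_320size.yaml",
        "--num-gpus", "1",
        "--eval-only",
        "MODEL.WEIGHTS", "weights/calibnet_r50_50e.pth",
        "OUTPUT_DIR", "output/eval",
    ]
    args = build_parser().parse_args(argv)
    print(args.config_file, args.eval_only, overrides_to_dict(args.opts))
```

An odd number of trailing tokens usually means a key was given without a value, which is why the helper rejects it instead of silently dropping the last token.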

Model Statistics

We provide tools to measure the efficiency of CalibNet, including the number of parameters, GFLOPs, and inference FPS. Usage is as follows.

# fps
python tools/count_fps.py --config-file <CONFIG> INPUT.MIN_SIZE_TEST 320 MODEL.WEIGHTS <WEIGHT_PATH>
# Parameters
python tools/get_flops.py --tasks parameter --config-file <CONFIG> MODEL.WEIGHTS <WEIGHT_PATH>
# GFLOPS
python tools/get_flops.py --tasks flop --config-file <CONFIG> MODEL.WEIGHTS <WEIGHT_PATH>
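For readers who want to sanity-check a reported FPS figure, the general measurement pattern is a few warm-up runs followed by a timed loop. The sketch below shows that pattern on an arbitrary callable. It is not the logic of `tools/count_fps.py`, and `measure_fps` is a hypothetical helper.

```python
import time


def measure_fps(run_once, warmup=5, iters=20):
    """Call `run_once` repeatedly and return iterations per second.

    Warm-up calls are excluded from the timing so that one-off costs
    (caches, lazy initialization) do not distort the result.
    """
    if iters <= 0:
        raise ValueError("iters must be positive")
    for _ in range(warmup):
        run_once()
    start = time.perf_counter()
    for _ in range(iters):
        run_once()
    elapsed = time.perf_counter() - start
    return iters / elapsed


if __name__ == "__main__":
    # Stand-in workload; a real measurement would run one model forward pass.
    fps = measure_fps(lambda: sum(range(100_000)))
    print(f"{fps:.1f} iterations per second")
```

When timing a model on a GPU, the work is launched asynchronously, so a synchronization call such as `torch.cuda.synchronize()` is needed before each clock reading for the number to be meaningful.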

Acknowledgement

This work is based on detectron2 and SparseInst. We sincerely thank the authors for their great work and contributions to the community!

📚 Citation

If this helps you, please cite this work:

@article{pei2024calibnet,
  title={CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation},
  author={Pei, Jialun and Jiang, Tao and Tang, He and Liu, Nian and Jin, Yueming and Fan, Deng-Ping and Heng, Pheng-Ann},
  journal={IEEE Transactions on Image Processing},
  volume={33},
  pages={4348--4362},
  year={2024},
  publisher={IEEE}
}