[TIP2024] CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation

Official Implementation of "CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation"

Jialun Pei, Tao Jiang, He Tang, Nian Liu, Yueming Jin, Deng-Ping Fan✉, and Pheng-Ann Heng

👀 [Paper]; [Chinese Version]; [Official Version]

Contact: dengpfan@gmail.com, peijialun@gmail.com

🔧 Environment Preparation

Requirements

Linux with python ≥ 3.8
Pytorch ≥ 1.9 and torchvison that matches the Pytorch installation.
Detectron2: follow Detectron2 installation instructions.
OpenCV is optional but needed by demo and visualization.
pip install -r requirements.txt

CUDA Kernel for MSDeformAttn

After preparing the required environment, run the following command to compile CUDA kernel for MSDeformAttn: CUDA_HOME must be defined and points to the directory of the installed CUDA toolkit.

cd calibnet/trans_encoder/ops
sh make.sh

Conda Environment Setup

Our project is built upon detectron2. In order to accommodate the RGB-D SIS task, we have made some modifacations to the framework to handle its dual modality inputs. You could replace the c2_model_loading.py in the framework with the one we provide calibnet/c2_model_loading.py.

We give an example to setup the environment. The commands are verified on CUDA 11.1, pytorch 1.9.1 and detectron2 0.6.0.

# an example
# create virtual environment
conda create -n calibnet python=3.8 -y
conda activate calibnet

# install pytorch
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html

# install detectron2, use the pre-built detectron2
python -m pip install detectron2 -f \
  https://dl.fbaipublicfiles.com/detectron2/wheels/cu111/torch1.9/index.html

# install requierement packages
pip install -r requirements.txt

# build CUDA kernel for MSDeformAttn
cd calibnet/trans_encoder/ops
sh make.sh

# replace the model loading code in detectron2. You should specify your own detectron2 path.
cp -i calibnet/c2_model_loading.py detectron2/checkpoint/c2_model_loading.py

📈 Dataset Preparation

Download and Unzip Datasets and Annotation Files

COME15K: Google Drive
DSIS (Ours): Google Drive; Xunlei Drive (password: 7dif)
SIP: Google Drive

Register Datasets

Download the datasets and put them in the same folder. To match the folder name in the dataset mappers, you'd better not change the folder names, its structure may be:

    DATASET_ROOT/
    ├── COME15K
       ├── train
          ├── imgs_right
          ├── depths
          ├── ...
       ├── COME-E
          ├── RGB
          ├── depth
          ├── ...
       ├── COME-H
          ├── RGB
          ├── depth
          ├── ...
       ├── annotations
       ├── ...
    ├── DSIS
       ├── RGB
       ├── depth
       ├── DSIS.json
       ├── ...
    ├── SIP
       ├── RGB
       ├── depth
       ├── SIP.json
       ├── ...

Change the dataset root in calibnet/register_rgbdsis_datasets.py

# calibnet/register_rgbdsis_datasets.py   line 28
_root = os.getenv("DETECTRON2_DATASETS", "path/to/dataset/root")

🚀 Pre-trained Models

Model weights: Google Drive

Model	Config	COME15K-E-test AP	COME15K-H-test AP
ResNet-50	config	58.0	50.7
ResNet-101	config	58.5	51.5
Swin-T	config	60.0	52.6
PVT-v2	config	60.7	53.7
P2T-Large	config	61.8	54.4

⚙️ Usage

Train

To train our CalibNet on single GPU, you should specify the config file <CONFIG>.

python tools/train_net.py --config-file <CONFIG> --num-gpus 1
# example:
python tools/train_net.py --config-file configs/CalibNet_R50_50e_50q_320size.yaml --num-gpus 1 OUTPUT_DIR output/train

Evaluation

Before evaluating, you should specify the config file <CONFIG> and the model weights <WEIGHT_PATH>. In addition, the input size is set to 320 by default.

python tools/train_net.py --config-file <CONFIG> --num-gpus 1 --eval-only MODEL.WEIGHTS <WEIGHT_PATH>
# example:
python tools/train_net.py --config-file configs/CalibNet_R50_50e_50q_320size.yaml --num-gpus 1 --eval-only MODEL.WEIGHTS weights/calibnet_r50_50e.pth OUTPUT_DIR output/eval

Model Statistics

We provide tools to validate the efficiency of CalibNet, including Parameters, GFLOPS and inference fps. The usages are as follow.

# fps
python tools/count_fps.py --config-file <CONFIG> INPUT.MIN_SIZE_TEST 320 MODEL.WEIGHTS <WEIGHT_PATH>
# Parameters
python tools/get_flops.py --tasks parameter --config-file <CONFIG> MODEL.WEIGHTS <WEIGHTS_PATH>
# GFLOPS
python tools/get_flops.py --tasks flop --config-file <CONFIG> MODEL.WEIGHTS <WEIGHT_PATH>

Acknowledgement

This work is based on detectron2 and SparseInst. We sincerely thanks for their great work and contributions to the community!

📚 Citation

If this helps you, please cite this work:

@article{pei2024calibnet,
  title={CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation},
  author={Pei, Jialun and Jiang, Tao and Tang, He and Liu, Nian and Jin, Yueming and Fan, Deng-Ping and Heng, Pheng-Ann},
  booktitle={IEEE Transactions on Image Processing},
  volume={33},
  pages={4348-4362},
  year={2024},
  organization={IEEE}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

[TIP2024] CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation

🔧 Environment Preparation

Requirements

CUDA Kernel for MSDeformAttn

Conda Environment Setup

📈 Dataset Preparation

Download and Unzip Datasets and Annotation Files

Register Datasets

🚀 Pre-trained Models

⚙️ Usage

Train

Evaluation

Model Statistics

Acknowledgement

📚 Citation

Files

README.md

Latest commit

History

README.md

File metadata and controls

[TIP2024] CalibNet: Dual-branch Cross-modal Calibration for RGB-D Salient Instance Segmentation

🔧 Environment Preparation

Requirements

CUDA Kernel for MSDeformAttn

Conda Environment Setup

📈 Dataset Preparation

Download and Unzip Datasets and Annotation Files

Register Datasets

🚀 Pre-trained Models

⚙️ Usage

Train

Evaluation

Model Statistics

Acknowledgement

📚 Citation