DSA^2F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion (CVPR 2021, Oral)
This repo is the official implementation of "DSA^2F: Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion"
by Peng Sun, Wenhu Zhang, Huanyu Wang, Songyuan Li, and Xi Li.
Requirements:
- Ubuntu 18
- Python 3.7
- PyTorch 1.7.0
- CUDA 10.1
- cuDNN 7.5.1
- NumPy 1.17.3
Please see `launch_pretrain.sh` and `launch_train.sh` for ImageNet pretraining and SOD training, respectively.
Please see `launch_test.sh` for testing on the SOD benchmarks.
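For orientation, here is a minimal, hypothetical PyTorch inference sketch of the kind of loop these scripts automate. The `DSA2FNet` class, checkpoint path, input resolution, and file names are illustrative placeholders, not this repo's actual API; the commented-out lines show where the real model would plug in.

```python
# Hypothetical inference sketch -- NOT this repo's actual API. It only
# illustrates the shape of the test loop that launch_test.sh automates.
import numpy as np
import torch
from PIL import Image

def load_as_tensor(path, size=(256, 256), grayscale=False):
    """Load an image, resize it, and return a normalized 1xCxHxW float tensor."""
    img = Image.open(path).convert("L" if grayscale else "RGB").resize(size)
    arr = np.asarray(img, dtype=np.float32) / 255.0
    if grayscale:
        arr = arr[..., None]                      # HxW -> HxWx1
    return torch.from_numpy(arr).permute(2, 0, 1).unsqueeze(0)

# model = DSA2FNet()                              # hypothetical model class
# model.load_state_dict(torch.load("ckpt.pth", map_location="cpu"))
# model.eval()

# rgb = load_as_tensor("scene_rgb.jpg")           # placeholder file names
# depth = load_as_tensor("scene_depth.png", grayscale=True)
rgb, depth = torch.rand(1, 3, 256, 256), torch.rand(1, 1, 256, 256)  # stand-ins

with torch.no_grad():
    # logits = model(rgb, depth)                  # 1x1xHxW saliency logits
    logits = torch.zeros(1, 1, 256, 256)          # stand-in so the sketch runs
    saliency = torch.sigmoid(logits)[0, 0]        # HxW map in [0, 1]

Image.fromarray((saliency.numpy() * 255).astype(np.uint8)).save("pred_sal.png")
```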
Dataset | E_r | S_λ (mean) | F_β (mean) | M (MAE) |
---|---|---|---|---|
DUT-RGBD | 0.950 | 0.921 | 0.926 | 0.030 |
NJUD | 0.923 | 0.903 | 0.901 | 0.039 |
NLPR | 0.950 | 0.918 | 0.897 | 0.024 |
SSD | 0.904 | 0.876 | 0.852 | 0.045 |
STEREO | 0.933 | 0.904 | 0.898 | 0.036 |
LFSD | 0.923 | 0.882 | 0.882 | 0.054 |
RGBD135 | 0.962 | 0.920 | 0.896 | 0.021 |
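For reference, M is the mean absolute error between the saliency map and the ground truth, and F_β (mean) is the F-measure with β² = 0.3 averaged over binarization thresholds, following the usual SOD conventions. Below is a minimal NumPy sketch of these two metrics under that assumption (file names are placeholders, and the two maps are assumed to be the same size); E_r and S_λ are more involved, and the toolbox mentioned below computes all four with paper-comparable settings.

```python
# Reference sketch of two of the table's metrics: MAE (column M) and the
# mean F-measure (column F_beta, beta^2 = 0.3, averaged over 255 thresholds).
# File names are placeholders; use an established toolbox for numbers that
# are directly comparable with the paper.
import numpy as np
from PIL import Image

def mae(pred, gt):
    """Mean absolute error between a saliency map and a binary ground truth."""
    return float(np.abs(pred - gt).mean())

def mean_f_measure(pred, gt, beta2=0.3):
    """F_beta averaged over 255 uniform binarization thresholds."""
    scores = []
    for t in np.linspace(0, 1, 255, endpoint=False):
        binary = pred > t
        tp = np.logical_and(binary, gt > 0.5).sum()
        precision = tp / max(binary.sum(), 1)
        recall = tp / max((gt > 0.5).sum(), 1)
        f = (1 + beta2) * precision * recall / max(beta2 * precision + recall, 1e-8)
        scores.append(f)
    return float(np.mean(scores))

pred = np.asarray(Image.open("pred_sal.png").convert("L"), np.float32) / 255.0
gt = np.asarray(Image.open("gt_mask.png").convert("L"), np.float32) / 255.0
print(f"MAE: {mae(pred, gt):.3f}  mean F_beta: {mean_f_measure(pred, gt):.3f}")
```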
All of the saliency maps mentioned in the paper are available on Google Drive or BaiduYun (fetch code: juc2).
You can use the toolbox provided by jiwei0921 for evaluation.
We also provide the saliency maps for the STERE-1000 and SIP datasets on BaiduYun (fetch code: qxfw) for easy comparison:
Dataset | E_r | S_λ (mean) | F_β (mean) | M (MAE) |
---|---|---|---|---|
STERE-1000 | 0.928 | 0.897 | 0.895 | 0.038 |
SIP | 0.908 | 0.861 | 0.868 | 0.057 |
@inproceedings{Sun2021DeepRS,
  title={Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion},
  author={Peng Sun and Wenhu Zhang and Huanyu Wang and Songyuan Li and Xi Li},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}
The code is released under MIT License (see LICENSE file for details).