Lingbo Liu, Zhilin Qiu, Guanbin Li, Shufan Liu, Wanli Ouyang, Liang Lin. Crowd Counting with Deep Structured Scale Integration Network, ICCV, 2019
Overview of our approach
This is the repo for "Crowd Counting with Deep Structured Scale Integration Network" (ICCV 2019), which presents a state-of-the-art framework for the crowd counting task and two effective modules for coping with the large scale variation in crowds.
CUDA 9.0 or higher
Python 2.7
opencv, PIL, scikit-learn
pytorch 0.4.2 or higher
- ShanghaiTech partA and partB
- UCF_QNRF
- UCF_CC_50
- WorldExpo'10
- We implement fixed and adaptive Gaussian kernel density map generation in Python; density maps are generated on the fly during training.
- During testing, no density map is generated; the ground-truth count is the number of annotated points within the ROI.
- Edit "/src/datasets.py" to set the path to your dataset, organized in the same folder layout as the released ShanghaiTech dataset, and to configure the density map settings, including the sigma of the Gaussian kernel, the train_val split, and mean_std.
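As a rough illustration of the fixed-kernel case and the ROI-based ground-truth count described above, here is a minimal NumPy sketch. It is not the code in /src/datasets.py; the function names, the `(x, y)` point layout, the default `sigma`, and the boolean ROI mask are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=4.0):
    """Return a 2-D Gaussian kernel normalized to sum to 1."""
    radius = int(truncate * sigma + 0.5)
    ax = np.arange(-radius, radius + 1, dtype=np.float64)
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def fixed_kernel_density_map(points, height, width, sigma=4.0):
    """Stamp one unit-mass Gaussian per annotated point (hypothetical helper).

    points: iterable of (x, y) pixel coordinates (assumed layout).
    """
    kernel = gaussian_kernel(sigma)
    radius = kernel.shape[0] // 2
    # Work on a padded canvas so kernels near the border can be stamped whole.
    padded = np.zeros((height + 2 * radius, width + 2 * radius), dtype=np.float64)
    for x, y in points:
        col, row = int(round(x)), int(round(y))
        if not (0 <= row < height and 0 <= col < width):
            continue  # skip annotations that fall outside the image
        padded[row:row + 2 * radius + 1, col:col + 2 * radius + 1] += kernel
    # Crop back to the image size; mass stamped into the padding is discarded.
    return padded[radius:radius + height, radius:radius + width]


def count_points_in_roi(points, roi_mask):
    """Ground-truth count: annotated points that lie inside a boolean ROI mask."""
    h, w = roi_mask.shape
    return sum(
        1 for x, y in points
        if 0 <= int(y) < h and 0 <= int(x) < w and roi_mask[int(y), int(x)]
    )
```

For points far from the border, the density map sums to the number of points, which is why the integral of a predicted map can be read as a count.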
python nowtrain.py --dataset 'the dataset to train'
--model 'network to train; CRFVGG\CRFVGG_prune'
--loss 'default: MSE, MSE/NORMMSSSIM'
Please refer to /src/train_options.py for more options. A default script for training on ShanghaiTech Part A is available at /scripts/train.sh.
python nowtest.py --dataset 'the dataset to test'
--model 'network to test; CRFVGG\CRFVGG_prune'
--model_path 'the path to the saved model to test'
Please refer to /nowtest.py for more options. A default script for testing on ShanghaiTech Part A is available at /scripts/test.sh. We will release the models reported in our paper; links will be listed in the performance section.
We train and test on the UCF-QNRF dataset at its original resolution. During training, to fit in memory, we pre-crop the images into non-overlapping or slightly overlapping high-resolution patches. For each image, we randomly choose one patch, giving priority to dense patches, and then apply the other data augmentations on the fly. During testing, images are cropped into strictly non-overlapping patches, and the predicted counts of the patches are summed to give the final estimate.
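The patch-based procedure can be sketched roughly as follows. This is an illustrative example rather than the repository's implementation: `predict_density` is a hypothetical callable standing in for the network, the 512-pixel patch size is arbitrary, and the `count + 1` sampling weight is only one possible way to favor dense patches, since the exact weighting is not specified here.

```python
import numpy as np


def split_non_overlapping(image, patch_h, patch_w):
    """Yield patches that tile the image exactly once (edge patches may be smaller)."""
    height, width = image.shape[:2]
    for top in range(0, height, patch_h):
        for left in range(0, width, patch_w):
            yield image[top:top + patch_h, left:left + patch_w]


def estimate_count(image, predict_density, patch_h=512, patch_w=512):
    """Test time: sum the per-patch predicted counts into one image-level estimate.

    predict_density is a hypothetical callable mapping a patch to a density map.
    """
    return float(sum(predict_density(patch).sum()
                     for patch in split_non_overlapping(image, patch_h, patch_w)))


def sample_patch_index(patch_counts, rng):
    """Training time: pick one patch, favoring dense ones.

    Weights proportional to (count + 1) are an assumption for this sketch.
    """
    weights = np.asarray(patch_counts, dtype=np.float64) + 1.0
    return int(rng.choice(len(weights), p=weights / weights.sum()))
```

Because the test-time patches do not overlap, every pixel contributes to exactly one patch, so summing the patch counts does not double-count anyone.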
Dataset | MAE | MSE |
---|---|---|
ShanghaiTech Part A | 60.63 | 96.04 |
ShanghaiTech Part A (pruned-vgg) | 61.16 | 102.91 |
ShanghaiTech Part B | 6.85 | 10.34 |
UCF-QNRF | 99.1 | 159.2 |
UCF-CC-50 | 216.9 | 302.4 |
WorldExpo'10 | 6.67 (average) | |
TRANCOS | 2.72 | |
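For readers reproducing these numbers, the sketch below shows how the two metrics are typically computed from per-image counts. It assumes the convention common in crowd-counting papers, where the value reported as "MSE" is the square root of the mean squared error; this README does not state the convention explicitly, and the function name is hypothetical.

```python
import numpy as np


def mae_and_mse(predicted_counts, ground_truth_counts):
    """Return (MAE, MSE) over a test set of per-image counts.

    Assumption: "MSE" follows the usual crowd-counting convention and is the
    root of the mean squared error, not the plain mean squared error.
    """
    pred = np.asarray(predicted_counts, dtype=np.float64)
    gt = np.asarray(ground_truth_counts, dtype=np.float64)
    err = pred - gt
    mae = np.abs(err).mean()
    mse = np.sqrt((err ** 2).mean())
    return float(mae), float(mse)
```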
If you use this code for your research, please cite our paper.
@inproceedings{liu2019crowd,
title={Crowd Counting with Deep Structured Scale Integration Network},
author={Liu, Lingbo and Qiu, Zhilin and Li, Guanbin and Liu, Shufan and Ouyang, Wanli and Lin, Liang},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2019}
}