Thanks to @singularity-s0 and @Dash Kev.
Our model is based on CounTR. The main changes we implemented are as follows:
- Contour-based Counting: We observed that the contours in the density map are sometimes clearly visible to the naked eye, yet the density within each object sums to less than one, so integrating the map undercounts. We therefore use OpenCV contour-based counting to assist the density-based count whenever the contours are sufficiently clear (see the sketch after the results below).
- Removed Exemplar: We found that the exemplar interfered with the results in some cases, so we removed it and turned the task into a zero-shot problem.
- Validation Set:
- Zero-shot MAE/MSE: 15.90/58.46
- Test Set:
- Zero-shot MAE/MSE: 13.86/91.51
- Contour-based Counting MAE/MSE: 13.81/91.49 (Slightly better than the original CounTR)
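For illustration, here is a minimal OpenCV sketch of the contour-based fallback; the threshold and normalization are illustrative assumptions, not the exact code from the counting-convnet branch:

```python
import cv2
import numpy as np

def contour_count(density: np.ndarray, thresh: float = 0.1) -> int:
    """Count objects via connected contours in a predicted density map."""
    # Scale the density map to 8-bit so OpenCV can threshold it.
    norm = cv2.normalize(density, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    # Binarize: each clearly visible blob becomes one connected region.
    _, binary = cv2.threshold(norm, int(thresh * 255), 255, cv2.THRESH_BINARY)
    # Count one object per external contour, even if its density sums to < 1.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return len(contours)
```

In the hybrid scheme, a count like this assists the density-map sum when the blobs are clean; the actual switching logic lives in the counting-convnet branch.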
Our repository includes the following branches, each addressing different aspects and improvements of the CounTR model:
- FSC: Baseline provided by the teaching assistant
- CounTR and CounTR-BackUpVersion: Original CounTR model with modified environment configuration
- Vit-encoder: Baseline using ViT as the encoder
- counting-convnet: Zero-shot + hybrid counting (Key improvement branch)
- countr-clip: Uses CLIP as the text encoder for multimodal counting
- countr-clip-full: Replaces both the image and text encoders with CLIP
- countr-finetune-zs: Fine-tunes the zero-shot model
- countr-textonly-regression: Based on countr-textonly, uses a convolutional network to regress the count
- countr-zeroshot: Zero-shot model (Key improvement branch)
- exemplar-resnet: Replaces CounTR's exemplar encoder with pretrained ResNet18
- resnet: Uses residual connections between encoder and decoder
Details can be found in the paper.

Contents:
- Preparation
- CounTR train
- CounTR inference
- Fine-tuned weights
- Visualisation
- Citation
- Acknowledgements
Our project uses the following datasets; please visit the links below to download them:
Note that we load CARPK by importing the hub package. Please click here for more information.
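For reference, a minimal sketch of pulling CARPK through hub; the Activeloop dataset paths below are an assumption based on hub's public registry:

```python
import hub

# Assumed public dataset paths on the Activeloop registry.
train_ds = hub.load("hub://activeloop/carpk-train")
test_ds = hub.load("hub://activeloop/carpk-test")
print(len(train_ds))
```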
The following package versions are suitable for an NVIDIA GeForce RTX 3090.
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install timm==0.3.2
pip install numpy
pip install matplotlib tqdm
pip install tensorboard
pip install scipy
pip install imgaug
pip install opencv-python
pip3 install hub
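After installation, a quick sanity check; the expected versions match the commands above:

```python
import torch
import timm

print(torch.__version__)           # expected: 1.10.0+cu111
print(timm.__version__)            # expected: 0.3.2
print(torch.cuda.is_available())   # should print True on a working RTX 3090 setup
```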
- This repo is based on timm==0.3.2, for which a fix is needed to work with PyTorch 1.8.1+.
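The commonly used patch (circulated via the MAE codebase's instructions; the file is timm's, not ours) edits timm/models/layers/helpers.py to avoid the removed torch._six import:

```python
# In timm/models/layers/helpers.py, replace
# `from torch._six import container_abcs` with:
import torch

TORCH_MAJOR = int(torch.__version__.split('.')[0])
TORCH_MINOR = int(torch.__version__.split('.')[1])
if TORCH_MAJOR == 1 and TORCH_MINOR < 8:
    from torch._six import container_abcs
else:
    import collections.abc as container_abcs
```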
Please set your working directory and dataset directory in the following training files.
Task | model file | train file |
---|---|---|
Pretrain on FSC147 | models_mae_noct.py | FSC_pretrain.py |
Finetune on FSC147 | models_mae_cross.py | FSC_finetune_cross.py |
Finetune on CARPK | models_mae_cross.py | FSC_finetune_CARPK.py |
Pretrain on FSC147
CUDA_VISIBLE_DEVICES=0 python FSC_pretrain.py \
--epochs 500 \
--warmup_epochs 10 \
--blr 1.5e-4 --weight_decay 0.05
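A note on --blr: if the scripts follow the MAE convention they derive from (an assumption worth verifying in the code), the absolute learning rate is the base rate scaled by the effective batch size:

```python
# MAE-style learning-rate scaling (assumption: CounTR's training scripts
# inherit this convention from the MAE codebase).
blr = 1.5e-4        # --blr from the command above
batch_size = 8      # hypothetical per-GPU batch size
accum_iter = 1      # hypothetical gradient-accumulation steps
num_gpus = 1

eff_batch_size = batch_size * accum_iter * num_gpus
lr = blr * eff_batch_size / 256
print(f"absolute lr: {lr:.2e}")   # 4.69e-06 with these settings
```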
Finetune on FSC147
CUDA_VISIBLE_DEVICES=0 nohup python -u FSC_finetune_cross.py \
--epochs 1000 \
--blr 2e-4 --weight_decay 0.05 >>./train.log 2>&1 &
Finetune on CARPK
CUDA_VISIBLE_DEVICES=0 nohup python -u FSC_finetune_CARPK.py \
--epochs 1000 \
--blr 2e-4 --weight_decay 0.05 >>./train.log 2>&1 &
Please set your working directory and dataset directory in the following test files.
Task | model file | test file |
---|---|---|
Test on FSC147 | models_mae_cross.py | FSC_test_cross.py |
Test on CARPK | models_mae_cross.py | FSC_test_CARPK.py |
Test on FSC147
CUDA_VISIBLE_DEVICES=0 nohup python -u FSC_test_cross.py >>./test.log 2>&1 &
Test on CARPK
CUDA_VISIBLE_DEVICES=0 nohup python -u FSC_test_CARPK.py >>./test.log 2>&1 &
Also, demo.py is a small demo for running inference on a single image.
CUDA_VISIBLE_DEVICES=0 python demo.py
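Conceptually, the count is the integral of the predicted density map; the snippet below illustrates just that final step (the 1/60 factor is CounTR's density scaling, and the random tensor stands in for the model output):

```python
import torch

# Stand-in for the zero-shot model's output on one image.
density = torch.rand(1, 384, 384)

# CounTR scales ground-truth density maps by 60 during training,
# so the predicted count divides the integral back by 60.
count = density.sum().item() / 60
print(f"estimated count: {count:.1f}")
```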
benchmark | MAE | RMSE | link |
---|---|---|---|
FSC147 | 11.95 (Test set) | 91.23 (Test set) | weights |
CARPK | 5.75 | 7.45 | weights |
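For reference, MAE and RMSE are computed over per-image counts in the standard way:

```python
import numpy as np

def counting_metrics(pred_counts, gt_counts):
    """Mean absolute error and root mean squared error over per-image counts."""
    err = np.asarray(pred_counts, float) - np.asarray(gt_counts, float)
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    return mae, rmse

mae, rmse = counting_metrics([10, 52], [12, 49])
print(f"MAE={mae:.2f}, RMSE={rmse:.2f}")   # MAE=2.50, RMSE=2.55
```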
@article{liu2022countr,
author = {Chang, Liu and Yujie, Zhong and Andrew, Zisserman and Weidi, Xie},
title = {CounTR: Transformer-based Generalised Visual Counting},
journal = {arXiv:2208.13721},
year = {2022}
}
We borrowed code from the following repositories:
Thanks to @GioFic95 for adding support for external exemplars, additional prediction images, more parameterized inference, and more.
If you have any questions about our code implementation, please contact us at liuchang666@sjtu.edu.cn.