Dynamic Low-Resolution Distillation

This code repository contains the implementation of the paper Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting (ECCV 2022).

Preparing Dataset

Original images can be downloaded from: Total-Text, ICDAR2013, ICDAR2015, ICDAR2017_MLT.

The formatted training datalists can be found in demo/text_spotting/datalist.

Train From Scratch

If you want to reproduce the model's performance from scratch, please follow these steps:

1. Download the pre-trained model, which was trained on SynthText & COCO-Text (pth (Access Code: yu09)). See demo/text_spotting/mask_rcnn_spot/readme.md for more details.

2. Train the multi-scale teacher model on ICDAR2013, ICDAR2015, ICDAR2017-MLT and Total-Text, initialized from the pre-trained model from step 1 (L307 in mask_rcnn_pretrain_teacher.py). The teacher model also serves as the Vanilla Multi-Scale competitor. See demo/text_spotting/dld/configs/mask_rcnn_pretrain_teacher.py for more details.

Just modify the required paths in the config file (img_prefixes, ann_files, work_dir, load_from, etc.) and then run the following script:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/dld/
bash dist_train_teacher.sh
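Multi-scale training of this kind typically samples a new input scale for each iteration and resizes the image while keeping its aspect ratio. The sketch below is only an illustration of that idea; the scale list and the resize rule are assumptions, not the repository's exact configuration (see the config file for the real settings).

```python
import random

# Hypothetical candidate short-side scales for the multi-scale teacher;
# the actual values live in mask_rcnn_pretrain_teacher.py.
TRAIN_SCALES = [640, 768, 896, 1024, 1280]

def sample_train_size(width, height, scales=TRAIN_SCALES, seed=None):
    """Pick a random short-side scale and resize, keeping the aspect ratio."""
    rng = random.Random(seed)
    short_side = rng.choice(scales)
    ratio = short_side / min(width, height)
    return round(width * ratio), round(height * ratio)
```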

3. Initialize the teacher and student models with the trained model obtained in step 2 (L360-361 in mask_rcnn_distill.py), then distill the student model end-to-end on the mixed real dataset (ICDAR2013, ICDAR2015, ICDAR2017-MLT and Total-Text). The results on the separate testing datasets are reported based on the same model. See demo/text_spotting/dld/configs/mask_rcnn_distill.py for more details.

Just modify the required paths in the config file (img_prefixes, ann_files, work_dir, load_from, etc.) and then run the following script:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/dld/
bash dist_train_distill.sh
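Two ingredients drive the distillation step: a cost term weighted by γ that pushes the student toward cheaper (lower-resolution) inputs, and a distillation loss that pulls the low-resolution student toward the full-resolution teacher. The sketch below is a loose conceptual illustration of that trade-off, not the paper's actual objective (which operates on detection and recognition outputs); all names and the scoring rule here are assumptions.

```python
import math

def select_resolution(quality_by_res, cost_by_res, gamma):
    """Illustrative selector: maximize quality minus gamma-weighted cost.
    Larger gamma favors cheaper, lower-resolution inputs."""
    return max(quality_by_res,
               key=lambda r: quality_by_res[r] - gamma * cost_by_res[r])

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Toy distillation loss: cross-entropy between temperature-softened
    teacher and student distributions."""
    def softmax(logits):
        exps = [math.exp(x / temperature) for x in logits]
        total = sum(exps)
        return [e / total for e in exps]
    p_t, p_s = softmax(teacher_logits), softmax(student_logits)
    return -sum(t * math.log(s) for t, s in zip(p_t, p_s))
```

Under this toy scoring, a small γ keeps the high-resolution input while a larger γ accepts a small quality drop for a large cost saving, which mirrors the γ=0.1 vs γ=0.3 rows in the results tables below.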

Notice: online validation is enabled by default. If you want to disable it to save training time, add the --no-validate flag in the startup script.

Offline Inference and Evaluation

We provide a demo of forward inference and evaluation. You can modify the parameters (iou_constraint, lexicon_type, etc.) in the testing script and then start testing. For example:

cd $DAVAR_LAB_OCR_ROOT$/demo/text_spotting/mask_rcnn_spot/tools/
bash test_ic13.sh

The offline evaluation tool can be found in davarocr/demo/text_spotting/evaluation/.
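End-to-end evaluation counts a prediction as correct when its box overlaps a ground-truth box beyond the IoU threshold (the iou_constraint parameter) and the recognized text matches. The toy sketch below illustrates that matching rule for axis-aligned boxes only; the real evaluation tool handles polygons and lexicons.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def count_matches(preds, gts, iou_constraint=0.5):
    """Greedy matching: a prediction is a hit if IoU passes the constraint
    and the transcription matches (case-insensitive)."""
    used, hits = set(), 0
    for p_box, p_text in preds:
        for i, (g_box, g_text) in enumerate(gts):
            if i in used:
                continue
            if iou(p_box, g_box) >= iou_constraint and p_text.lower() == g_text.lower():
                used.add(i)
                hits += 1
                break
    return hits
```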

Trained Model Download

All of the models are re-implemented and trained based on the open-source framework mmdetection.

Results on various datasets and download links for the trained models:

| Dataset | Training Method | Input Size | End-to-End (General / Weak / Strong) | Word Spotting (General / Weak / Strong) | FLOPS | Links |
| --- | --- | --- | --- | --- | --- | --- |
| ICDAR2013 | Vanilla Multi-Scale | S-768 | 82.9 / 86.6 / 86.9 | 86.3 / 91.0 / 91.4 | 142.9G | cfg, pth (Access Code: BD63) |
| ICDAR2013 | DLD (γ=0.1) | Dynamic | 82.7 / 85.7 / 86.5 | 86.1 / 89.9 / 90.9 | 71.5G | cfg, pth (Access Code: 32Y9) |
| ICDAR2013 | DLD (γ=0.3) | Dynamic | 81.6 / 84.4 / 85.6 | 84.9 / 88.6 / 90.0 | 41.6G | cfg, pth (Access Code: Vi12) |
| ICDAR2015 | Vanilla Multi-Scale | S-1280 | 69.5 / 74.4 / 78.0 | 71.7 / 77.2 / 81.4 | 517.2G | cfg, pth (Access Code: BD63) |
| ICDAR2015 | DLD (γ=0.1) | Dynamic | 70.9 / 75.7 / 79.0 | 73.3 / 78.6 / 82.4 | 298.8G | cfg, pth (Access Code: 32Y9) |
| ICDAR2015 | DLD (γ=0.3) | Dynamic | 69.3 / 73.5 / 78.1 | 71.2 / 76.4 / 81.1 | 148.3G | cfg, pth (Access Code: Vi12) |

| Dataset | Training Method | Input Size | End-to-End (None / Full) | Word Spotting (None / Full) | FLOPS | Links |
| --- | --- | --- | --- | --- | --- | --- |
| Total-Text | Vanilla Multi-Scale | S-896 | 62.3 / 71.4 | 65.2 / 75.9 | 206.7G | cfg, pth (Access Code: BD63) |
| Total-Text | DLD (γ=0.1) | Dynamic | 63.9 / 73.7 | 66.4 / 77.8 | 103.0G | cfg, pth (Access Code: 32Y9) |
| Total-Text | DLD (γ=0.3) | Dynamic | 61.9 / 71.9 | 64.0 / 75.9 | 62.1G | cfg, pth (Access Code: Vi12) |
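The FLOPS column quantifies the cost savings reported in the tables; for example, on ICDAR2013 the γ=0.1 model roughly halves the compute of the multi-scale baseline, and γ=0.3 cuts it by about 71%. A quick check of that arithmetic:

```python
def savings(baseline_gflops, dld_gflops):
    """Fraction of FLOPs saved relative to the baseline."""
    return 1 - dld_gflops / baseline_gflops

# ICDAR2013: Vanilla Multi-Scale 142.9G vs DLD gamma=0.1 (71.5G) and gamma=0.3 (41.6G)
print(f"{savings(142.9, 71.5):.0%}")  # prints "50%"
print(f"{savings(142.9, 41.6):.0%}")  # prints "71%"
```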

Citation:

@inproceedings{chen2022dynamic,
  title={Dynamic Low-Resolution Distillation for Cost-Efficient End-to-End Text Spotting},
  author={Chen, Ying and Qiao, Liang and Cheng, Zhanzhan and Pu, Shiliang and Niu, Yi and Li, Xi},
  booktitle={ECCV},
  year={2022}
}

License

This project is released under the Apache 2.0 license.

Contact

If you have any suggestions or problems, please feel free to contact the author at qiaoliang6@hikvision.com.