This repository contains implementations of CRNN ("An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition", TPAMI 2017) and Res-Bilstm-Attn ("What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis", ICCV 2019).
Dataset | Samples | Description | Release |
---|---|---|---|
MJSynth | 8919257 | Scene text recognition synthetic data set | Link |
SynText | 7266164 | Synthetic scene text dataset; text instances are cropped from larger images | Link |
Testset | Instance Number | Note |
---|---|---|
IIIT5K | 3000 | regular |
SVT | 647 | regular |
IC03_860 | 860 | regular |
IC13_857 | 857 | regular |
IC15_1811 | 1811 | irregular |
SVTP | 645 | irregular |
CUTE80 | 288 | irregular |
For a quick start, use the LMDB-formatted datasets above, which contain the full benchmarks for scene text recognition tasks, organized as follows.
Data Type: LMDB
File storage format:
|-- train
| |-- MJ
| |-- ST
|-- validation
| |-- mixture
|-- evaluation
| |-- mixture
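Before training, it can be useful to confirm that the downloaded data matches the layout above. The following is a minimal sketch using only the Python standard library; the helper name `missing_dirs` and the dataset root passed to it are illustrative assumptions, not part of this repository.

```python
from pathlib import Path

# Expected sub-directories, taken from the storage layout listed above.
EXPECTED_DIRS = [
    "train/MJ",
    "train/ST",
    "validation/mixture",
    "evaluation/mixture",
]


def missing_dirs(data_root):
    """Return the expected sub-directories that are absent under data_root.

    `data_root` is a hypothetical path to wherever the LMDB data was unpacked.
    """
    root = Path(data_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs("/path/to/data")` returns an empty list when all four directories are present, and otherwise lists the ones that still need to be downloaded or moved.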
Run the following commands from the command line:
cd .
bash ./train_script/train_att.sh
cd .
bash ./train_script/train_crnn.sh
Online validation is implemented. To disable it and save training time, add the `--no-validate` flag to the startup script.
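For readers unfamiliar with this kind of switch, the sketch below shows how a boolean flag such as `--no-validate` is commonly wired with Python's `argparse`. It is an illustration only: the parser, its description, and the `build_parser` helper are assumptions and may differ from the actual training entry point in this repository.

```python
import argparse


def build_parser():
    """Build a hypothetical command-line parser with a --no-validate switch."""
    parser = argparse.ArgumentParser(description="Illustrative training entry point")
    parser.add_argument(
        "--no-validate",
        action="store_true",  # flag is False unless given on the command line
        help="skip online validation during training",
    )
    return parser


# argparse converts the dashes to underscores: --no-validate -> args.no_validate
args = build_parser().parse_args(["--no-validate"])
run_validation = not args.no_validate
print(run_validation)
```

With `store_true`, omitting the flag leaves validation enabled, so existing startup scripts keep their behaviour unless the flag is added explicitly.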
cd .
bash ./train.sh
Regular text: IIIT5K, SVT, IC03, IC13. Irregular text: IC15, SVTP, CUTE80.
Methods | IIIT5K | SVT | IC03 | IC13 | IC15 | SVTP | CUTE80 | Download (Config / Model) |
---|---|---|---|---|---|---|---|---|
CRNN(Report) | 86.2 | 86.0 | 94.4 | 92.6 | 73.6 | 76.0 | 72.2 | - |
CRNN | 93.3 | 87.5 | 92.6 | 92.4 | 78.1 | 78.9 | 80.6 | pth [Link] (Access Code: 05IZ) |
Attention(Report) | 86.6 | 86.2 | 94.1 | 92.8 | 75.6 | 76.4 | 72.6 | - |
Attention | 94.5 | 89.0 | 94.5 | 94.1 | 81.7 | 82.5 | 81.9 | pth [Link] (Access Code: r6C7) |
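To compare models with a single number, the per-benchmark scores can be combined into an average weighted by the instance counts from the test set table. The sketch below does this for the CRNN row, assuming the reported values are accuracies in percent; the function name `weighted_average` is illustrative and not part of this repository.

```python
# Instance counts from the test set table (IC03_860, IC13_857, IC15_1811 variants).
INSTANCES = {
    "IIIT5K": 3000,
    "SVT": 647,
    "IC03": 860,
    "IC13": 857,
    "IC15": 1811,
    "SVTP": 645,
    "CUTE80": 288,
}

# Scores of the CRNN row from the results table.
CRNN = {
    "IIIT5K": 93.3,
    "SVT": 87.5,
    "IC03": 92.6,
    "IC13": 92.4,
    "IC15": 78.1,
    "SVTP": 78.9,
    "CUTE80": 80.6,
}


def weighted_average(scores, counts=INSTANCES):
    """Average the per-benchmark scores, weighting each by its instance count."""
    total = sum(counts[name] for name in scores)
    return sum(scores[name] * counts[name] for name in scores) / total


print(round(weighted_average(CRNN), 2))
```

Weighting by instance count means large benchmarks such as IIIT5K and IC15 dominate the result, which differs from a plain mean over the seven columns.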
Here is the picture for result visualization.
@article{CRNN,
author={Baoguang Shi and Xiang Bai and Cong Yao},
title={An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition},
journal={TPAMI},
volume={39},
number={11},
pages={2298--2304},
year={2017},
}
@inproceedings{Wrong,
author={Jeonghun Baek and Geewook Kim and Junyeop Lee and Sungrae Park and Dongyoon Han and Sangdoo Yun and Seong Joon Oh and Hwalsuk Lee},
title={What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis},
booktitle={ICCV 2019},
pages={4714--4722},
publisher={{IEEE}},
year={2019},
}
This project is released under the Apache 2.0 license.
If you have any suggestions or run into problems, please contact the author at qiaoliang6@hikvision.com.