We propose a novel Adversarial Sequence-to-Sequence Domain Adaptation Network, dubbed ASSDA, for robust text image recognition, which adaptively transfers both coarse global-level and fine-grained character-level knowledge.
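The adaptation relies on the usual adversarial ingredient: a domain discriminator learns to separate source (synthetic) features from target (real or handwritten) features, while the recognizer's encoder is trained to fool it. The sketch below is only a minimal illustration of that idea using a gradient-reversal layer; the class names, feature dimension, and loss weighting (`GradReverse`, `DomainClassifier`, `feat_dim=512`, `lambd`) are assumptions for the example, not the exact modules used in this repository.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips the gradient sign in the backward pass."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DomainClassifier(nn.Module):
    """Small MLP that predicts source (0) vs. target (1) from a feature vector."""
    def __init__(self, feat_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, 2),
        )

    def forward(self, feat, lambd=1.0):
        # Gradient reversal makes the encoder maximize the domain loss
        # while the classifier minimizes it.
        return self.net(GradReverse.apply(feat, lambd))

# Illustrative usage: pooled encoder features of a source and a target batch
# feed the discriminator; the adversarial loss is added to the recognition loss.
criterion = nn.CrossEntropyLoss()
disc = DomainClassifier(feat_dim=512)
src_feat = torch.randn(8, 512)   # stand-in for pooled source-batch features
tar_feat = torch.randn(8, 512)   # stand-in for pooled target-batch features
logits = disc(torch.cat([src_feat, tar_feat], dim=0), lambd=0.1)
labels = torch.cat([torch.zeros(8), torch.ones(8)]).long()
adv_loss = criterion(logits, labels)
```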
- This code is tested in an environment with cuda==10.1 and python==3.6.8.
- Install Requirements:
pip3 install torch==1.2.0 pillow==6.2.1 torchvision==0.4.0 lmdb nltk natsort
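A quick way to confirm the installed packages match the versions above (this check is only a convenience, not part of the repository):

```python
import torch
import torchvision
import PIL
import lmdb, nltk, natsort

# Print installed versions and whether PyTorch can see a CUDA device.
print("torch:", torch.__version__)               # expected 1.2.0
print("torchvision:", torchvision.__version__)   # expected 0.4.0
print("pillow:", PIL.__version__)                # expected 6.2.1
print("CUDA available:", torch.cuda.is_available())
```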
- The prepared synthetic and real scene text datasets can be downloaded from here; they were created by NAVER Corp.
- The prepared handwritten text dataset can be downloaded from here.
- Handwritten text: IAM
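The prepared datasets are LMDB databases. A short way to peek inside one (e.g. ./data/IAM/test) is sketched below; the key layout ('num-samples', 'image-%09d', 'label-%09d') follows the deep-text-recognition-benchmark convention, so adjust the keys if your LMDB was built differently.

```python
import io
import lmdb
from PIL import Image

# Open one of the prepared LMDB datasets read-only and inspect the first sample.
env = lmdb.open('./data/IAM/test', readonly=True, lock=False,
                readahead=False, meminit=False)
with env.begin(write=False) as txn:
    n_samples = int(txn.get('num-samples'.encode()))
    print('samples:', n_samples)
    img_bytes = txn.get(b'image-%09d' % 1)                 # indices start at 1
    label = txn.get(b'label-%09d' % 1).decode('utf-8')
    img = Image.open(io.BytesIO(img_bytes)).convert('RGB')
    print('first label:', label, '| image size:', img.size)
```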
- For a toy example, you can download the pretrained model from here.
- Place the model files to be tested into data/
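Before training or testing, it can help to confirm the downloaded checkpoint is readable. The snippet below assumes the .pth file stores a plain state_dict (a key-to-tensor mapping), which is how deep-text-recognition-benchmark checkpoints are typically saved.

```python
import torch

# Load the checkpoint on CPU and list a few parameter names and shapes.
state = torch.load('./data/TPS-ResNet-BiLSTM-Attn.pth', map_location='cpu')
print('number of tensors:', len(state))
for name, tensor in list(state.items())[:5]:
    print(name, tuple(tensor.shape))
```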
- Train the model:
CUDA_VISIBLE_DEVICES=1 python train_da_global_local_selected.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
  --src_train_data ./data/data_lmdb_release/training/ \
  --tar_train_data ./data/IAM/test --tar_select_data IAM --tar_batch_ratio 1 --valid_data ../data/IAM/test/ \
  --continue_model ./data/TPS-ResNet-BiLSTM-Attn.pth \
  --batch_size 128 --lr 1 \
  --experiment_name _adv_global_local_synth2iam_pc_0.1 --pc 0.1
- Test the model
- Test the baseline model:
CUDA_VISIBLE_DEVICES=0 python test.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
  --eval_data ./data/IAM/test \
  --saved_model ./data/TPS-ResNet-BiLSTM-Attn.pth
- Test the adaptation model:
CUDA_VISIBLE_DEVICES=0 python test.py --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn \
  --eval_data ./data/IAM/test \
  --saved_model saved_models/TPS-ResNet-BiLSTM-Attn-Seed1111_adv_global_local_selected/best_accuracy.pth
- If you use this code for a paper, please cite:
@inproceedings{zhang2019sequence,
  title={Sequence-to-sequence domain adaptation network for robust text image recognition},
  author={Zhang, Yaping and Nie, Shuai and Liu, Wenju and Xu, Xing and Zhang, Dongxiang and Shen, Heng Tao},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={2740--2749},
  year={2019}
}

@article{zhang2021robust,
  title={Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation},
  author={Zhang, Yaping and Nie, Shuai and Liang, Shan and Liu, Wenju},
  journal={IEEE Transactions on Image Processing},
  volume={30},
  pages={3922--3933},
  year={2021},
  publisher={IEEE}
}
This implementation is based on the repository deep-text-recognition-benchmark.