EnsNet: Ensconce Text in the Wild

A synthetic benchmark database for scene text removal has been released by the Deep Learning and Vision Computing Lab of South China University of Technology. The database can be downloaded through the following links:

Description

The training set of the synthetic database consists of 8000 images and the test set contains 800 images; all training and test samples are resized to 512 × 512. The code for generating the synthetic dataset and additional synthetic text images is described in "Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, Synthetic Data for Text Localisation in Natural Images, CVPR 2016" and can be found at https://github.com/ankush-me/SynthText. All the real scene text images are also resized to 512 × 512.
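After downloading, it can be useful to confirm that each split holds the number of images stated above. The following is a minimal sketch, not part of the released code: the directory you pass in and the accepted file extensions are assumptions, since the archive layout is not described here. If a folder holds both input images and ground-truth images, point the check at one subfolder only.

```python
import os

# Counts stated in this README; directory names passed in are up to the user.
EXPECTED_COUNTS = {"train": 8000, "test": 800}
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")  # assumed file types


def count_images(directory, extensions=IMAGE_EXTENSIONS):
    """Count files under `directory` (recursively) with an image-like extension."""
    total = 0
    for _root, _dirs, files in os.walk(directory):
        total += sum(1 for name in files if name.lower().endswith(extensions))
    return total


def check_split(directory, split):
    """Return (found, expected, ok) for one split of the synthetic dataset."""
    found = count_images(directory)
    expected = EXPECTED_COUNTS[split]
    return found, expected, found == expected
```

For example, `check_split("path/to/train_images", "train")` returns a tuple whose last element is `True` only when exactly 8000 image files are found.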

For more details, please refer to our AAAI 2019 paper. arXiv: http://arxiv.org/abs/1812.00723

Requirements

  1. MXNet==1.3.1
  2. Python 2.
  3. NVIDIA GPU + CUDA 8.0.
  4. Matplotlib.
  5. Numpy.
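
A quick way to see which of the listed Python packages are missing from the current interpreter is sketched below. This is an illustrative helper, not part of the repository; it only attempts an import of each package and does not verify the GPU, CUDA, or the exact MXNet version.

```python
import importlib
import sys

# Import names of the Python packages listed above.
REQUIRED_MODULES = ("mxnet", "matplotlib", "numpy")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be imported in this interpreter."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def report():
    """Summarize the interpreter's major version and any missing packages."""
    return {
        "python_major": sys.version_info[0],  # this README asks for Python 2
        "missing": missing_modules(),
    }
```

An empty `missing` list means all three packages import successfully; the installed MXNet version can then be read from `mxnet.__version__`.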

Installation

  1. Clone this repository.
    git clone https://github.com/HCIILAB/Scene-Text-Removal
    

Running

1. Image Preparation

 Refer to the provided example for how to arrange the data.

2. Training

To train the model, you may need to change the dataset path or the network parameters. Then run the following command:

python train.py \
--trainset_path=[the path of dataset] \
--checkpoint=[path to save the model] \
--gpu=[use gpu] \
--lr=[Learning Rate] \
--n_epoch=[Number of iterations]
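
If you launch training from a script, the command above can be assembled programmatically. The sketch below uses only the flags shown in this README; the helper names and every example value are placeholders chosen for illustration, not the authors' settings, and the formats each flag accepts are defined by `train.py` itself.

```python
import subprocess


def build_train_command(trainset_path, checkpoint, gpu, lr, n_epoch):
    """Assemble the train.py invocation from the flags listed in this README."""
    return [
        "python", "train.py",
        "--trainset_path={}".format(trainset_path),
        "--checkpoint={}".format(checkpoint),
        "--gpu={}".format(gpu),
        "--lr={}".format(lr),
        "--n_epoch={}".format(n_epoch),
    ]


def run(command):
    """Run the command from the repository root; raises if it exits non-zero."""
    return subprocess.check_call(command)


if __name__ == "__main__":
    command = build_train_command(
        trainset_path="data/train",  # hypothetical path
        checkpoint="checkpoints",    # hypothetical path
        gpu=0,                       # placeholder value
        lr=0.001,                    # placeholder, not a recommended value
        n_epoch=10,                  # placeholder value
    )
    print(" ".join(command))  # inspect the command before calling run(command)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.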

3. Testing

To generate results for the input images, use test.py. Run the following command:

python test.py \
--test_image=[the path of test images] \
--model=[which model to test] \
--vis=[vis images] \
--result=[path to save the output images]

To evaluate the model's performance over a dataset, you can find the evaluation metrics at this website: PythonCode.zip

4. Pretrained models

Please download the ImageNet-pretrained vgg16 model (password: 8tof) and put it under

root/.mxnet/models/
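
The copy step can be scripted as in the sketch below. It assumes the target is MXNet's default model cache, `~/.mxnet/models` (that is, `/root/.mxnet/models` when running as root); the function names are hypothetical, and the file name is whatever the download provides.

```python
import os
import shutil


def default_models_dir():
    """Assumed target: MXNet's default model cache directory, ~/.mxnet/models."""
    return os.path.join(os.path.expanduser("~"), ".mxnet", "models")


def install_pretrained(params_file, models_dir=None):
    """Copy a downloaded parameter file into the models directory.

    Returns the path of the copied file.
    """
    models_dir = models_dir or default_models_dir()
    if not os.path.isdir(models_dir):
        os.makedirs(models_dir)
    destination = os.path.join(models_dir, os.path.basename(params_file))
    shutil.copy(params_file, destination)
    return destination
```

Calling `install_pretrained("path/to/downloaded_file")` creates the directory if needed and leaves the original download in place.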

Paper

Please consider citing our paper when you use our database:

@article{zhang2019EnsNet,
  title     = {EnsNet: Ensconce Text in the Wild},
  author    = {Shuaitao Zhang and Yuliang Liu and Lianwen Jin and Yaoxiong Huang and Songxuan Lai},
  journal   = {AAAI},
  year      = {2019}
}

Feedback

Suggestions and opinions on this dataset (both positive and negative) are very welcome. Please contact the authors by email at eestzhang@mail.scut.edu.cn.

Copyright

The synthetic database may be used only for non-commercial research purposes.

For commercial use, please contact Dr. Lianwen Jin: lianwen.jin@gmail.com.

Copyright 2018, Deep Learning and Vision Computing Lab, South China University of Technology. http://www.dlvc-lab.net