Scene Text Image Transformer

A tool for scene text data augmentation. It is provided to help models avoid overfitting and become more robust.

The current version focuses on transforming the shape of cropped scene text images. A later version covering both detection and recognition tasks will be released in the future.

Requirements

GCC 4.8.*
Python 2.7.*
Boost 1.67
OpenCV 2.4.*

We recommend using Anaconda to manage the versions of your dependencies. For example:

    conda install boost=1.67.0

Installation

Build library:

    mkdir build
    cd build
    cmake -D CUDA_USE_STATIC_CUDA_RUNTIME=OFF ..
    make

Copy Augment.so to the target folder, then follow demo.py to use the tool:

    cp Augment.so ..
    cd ..
    python demo.py
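
Before running the demo, it can help to confirm that the built library is importable. The following is a minimal sketch, assuming the compiled file Augment.so can be imported under the module name Augment from the current directory (inferred from the file name only). The helper module_available is hypothetical and not part of the tool.

```python
import importlib


def module_available(name):
    """Return True if a module with this name can be imported, else False."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Assumption: the build step produced Augment.so in the current directory.
if module_available("Augment"):
    print("Augment is importable")
else:
    print("Augment not found; check that Augment.so was copied next to demo.py")
```

If the check fails, the usual causes are a missing copy step or a Python version that differs from the one used during the build.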

Demo

Distortion

Stretch

Perspective

Speed

Transforming an image of size 64 × 200 (H × W) takes less than 3 ms on a 2.0 GHz CPU. The process can be accelerated further by running the augmentation on the fly in multi-process batch samplers, for example by setting "num_workers" in a PyTorch DataLoader.
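
The on-the-fly approach could be sketched as a map-style dataset that augments each sample when it is requested. This is an illustration under assumptions, not the tool's own interface. The class name OnTheFlyAugmentDataset and the augment_fn argument are hypothetical, and augment_fn stands in for whatever wrapper you write around the transformations shown in demo.py.

```python
class OnTheFlyAugmentDataset(object):
    """Map-style dataset that applies an augmentation each time a sample is read."""

    def __init__(self, samples, augment_fn):
        # samples: sequence of (image, label) pairs.
        # augment_fn: hypothetical callable mapping an image to an augmented image.
        self.samples = samples
        self.augment_fn = augment_fn

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image, label = self.samples[index]
        # The augmentation runs here, inside whichever process reads the sample,
        # so several loader workers can transform images in parallel.
        return self.augment_fn(image), label
```

With PyTorch installed, such a dataset could be passed to torch.utils.data.DataLoader with num_workers set to a value greater than zero (for example num_workers=4, an illustrative choice), so that worker processes perform the transformation while the model trains.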

Improvement for Recognition

We compare the recognition accuracy of CRNN models trained only on the small training set of each benchmark, with and without data augmentation.

Dataset	IIIT5K	IC13	IC15
Without Data Augmentation	40.8%	6.8%	8.7%
With Data Augmentation	53.4%	9.6%	24.9%
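
To make the differences easier to read, the absolute gains can be computed directly from the reported numbers. The short sketch below only restates the table values; the helper absolute_gains is a hypothetical name introduced for illustration.

```python
# Accuracy values (%) copied from the table above.
without_aug = {"IIIT5K": 40.8, "IC13": 6.8, "IC15": 8.7}
with_aug = {"IIIT5K": 53.4, "IC13": 9.6, "IC15": 24.9}


def absolute_gains(before, after):
    """Return the accuracy gain in percentage points for each dataset."""
    return {name: round(after[name] - before[name], 1) for name in before}


gains = absolute_gains(without_aug, with_aug)
for name in ("IIIT5K", "IC13", "IC15"):
    print("{}: +{} points".format(name, gains[name]))
# prints:
# IIIT5K: +12.6 points
# IC13: +2.8 points
# IC15: +16.2 points
```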

Citation

@inproceedings{schaefer2006image,
  title={Image deformation using moving least squares},
  author={Schaefer, Scott and McPhail, Travis and Warren, Joe},
  booktitle={ACM transactions on graphics (TOG)},
  volume={25},
  number={3},
  pages={533--540},
  year={2006},
  organization={ACM}
}

Acknowledgment

This tool combines @cxcxcxcx's imgwarp-opencv and @Yati Sagade's opencv-ndarray-conversion. Thanks to both authors for their contributions.

Attention

The tool is free for academic research purposes only.
