Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition (ECCV 2022 Oral)

The official code of CornerTransformer (ECCV 2022, Oral).

This work focuses on artistic text recognition, a new and challenging task. To tackle its difficulties, we introduce the corner point map as a robust representation of the artistic text image and present a corner-query cross-attention mechanism that helps the model attend more accurately. We also design a character contrastive loss to learn invariant character features, which leads to tight clustering of features. To benchmark the performance of different models, we provide the WordArt dataset.
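
As a rough illustration of the idea behind a character contrastive loss, here is a minimal sketch in plain Python of a supervised-contrastive-style objective: features that share a character label are pulled together, and features with different labels are pushed apart. It is not the implementation used in this repo. The function names, the cosine similarity, and the temperature value are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def char_contrastive_loss(features, labels, temperature=0.1):
    """Supervised-contrastive-style loss over per-character features.

    Illustrative only: `temperature` and the similarity measure are
    assumptions, not values taken from the paper or this repo.
    """
    n = len(features)
    total, counted = 0.0, 0
    for i in range(n):
        # Positives: other features that carry the same character label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        logits = {
            j: cosine(features[i], features[j]) / temperature
            for j in range(n)
            if j != i
        }
        log_denom = math.log(sum(math.exp(v) for v in logits.values()))
        # Average negative log-probability of picking a positive.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        counted += 1
    return total / counted if counted else 0.0


if __name__ == "__main__":
    # Two visible clusters in a toy 2-D feature space.
    feats = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [0.01, 1.0]]
    print(char_contrastive_loss(feats, ["a", "a", "b", "b"]))
    print(char_contrastive_loss(feats, ["a", "b", "a", "b"]))
```

With these toy inputs, the loss is lower when the labels agree with the two clusters than when the labels are mixed across them, which is the behaviour "tight clustering" refers to.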

Runtime Environment

This repo depends on PyTorch, MMCV, MMDetection, and MMOCR. The quick installation steps are below; please refer to the MMOCR 0.6 Install Guide for more detailed instructions.

conda create -n wordart python=3.7 -y
conda activate wordart
conda install pytorch==1.10 torchvision cudatoolkit=11.3 -c pytorch
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.10.0/index.html
pip install mmdet
git clone https://github.com/xdxie/WordArt.git
cd WordArt
pip install -r requirements.txt
pip install -v -e .
export PYTHONPATH=$(pwd):$PYTHONPATH
pip install -r requirements/albu.txt
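
After the steps above, a quick sanity check can confirm that the main dependencies are importable in the active environment. The sketch below uses only the standard library. The import names in `REQUIRED` are assumptions about how each package is exposed (for example, `mmcv-full` is imported as `mmcv`), and `missing_packages` is a hypothetical helper, not part of this repo.

```python
import importlib.util

# Assumed top-level import names for the dependencies listed above.
REQUIRED = ("torch", "torchvision", "mmcv", "mmdet", "mmocr")


def missing_packages(names=REQUIRED):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

An empty result only shows that the modules can be located; it does not verify that their versions are compatible with each other.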

WordArt Dataset

The WordArt dataset consists of 6,316 artistic text images: 4,805 for training and 1,511 for testing. The dataset is available on Google Drive.

Preparing Datasets

Please follow the steps in the MMOCR 0.6 Dataset Zoo to prepare the text recognition datasets, and put all of them in the data/mixture folder. In this repository, we train the model on two synthetic datasets, MJSynth and SynthText, and evaluate it on IIIT5k, IC13, SVT, IC15, SVTP, CUTE, and our proposed WordArt.

Note: Please make sure to reprocess the two training datasets following the steps above.
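
Before training, it can help to confirm that every dataset folder is actually in place under data/mixture. The following is a small sketch of such a check. `missing_datasets` is a hypothetical helper, and the folder names in the example are placeholders; the real names should be taken from the MMOCR 0.6 Dataset Zoo and the config files.

```python
from pathlib import Path


def missing_datasets(root, names):
    """Return the dataset folder names that do not exist under `root`."""
    root = Path(root)
    return [name for name in names if not (root / name).is_dir()]


if __name__ == "__main__":
    # Placeholder folder names, for illustration only.
    expected = ["SynthText", "WordArt"]
    print("missing under data/mixture:", missing_datasets("data/mixture", expected))
```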

Training

For distributed training on multiple GPUs, please use

./tools/dist_train.sh ${CONFIG_FILE} ${WORK_DIR} ${GPU_NUM} [PY_ARGS]

For training on a single GPU, please use

python tools/train.py ${CONFIG_FILE} [ARGS]

For example, we use this script to train the model:

./tools/dist_train.sh configs/textrecog/corner_transformer/corner_transformer_academic.py outputs/corner_transformer/ 4

Evaluation

For distributed evaluation on multiple GPUs, please use

./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [PY_ARGS]

For evaluation on a single GPU, please use

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [ARGS]

For example, we use this script to evaluate the model performance:

CUDA_VISIBLE_DEVICES=0 python tools/test.py outputs/corner_transformer/corner_transformer_academic.py outputs/corner_transformer/latest.pth --eval acc
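
For readers unfamiliar with how recognition accuracy is typically scored, the sketch below computes a word-level accuracy: a prediction counts as correct only if the whole string matches the ground truth. It assumes the common scene-text convention of ignoring case and non-alphanumeric symbols. It is an illustration, not the metric code that `--eval acc` runs, and `normalize` and `word_accuracy` are hypothetical names.

```python
import re

# Assumption: compare only lowercase letters and digits.
_NON_ALNUM = re.compile(r"[^a-z0-9]")


def normalize(text):
    """Lowercase the text and drop everything except letters and digits."""
    return _NON_ALNUM.sub("", text.lower())


def word_accuracy(predictions, ground_truths):
    """Percentage of predictions that match their ground truth exactly."""
    if len(predictions) != len(ground_truths):
        raise ValueError("predictions and ground_truths differ in length")
    if not predictions:
        return 0.0
    correct = sum(
        normalize(p) == normalize(g) for p, g in zip(predictions, ground_truths)
    )
    return 100.0 * correct / len(predictions)


if __name__ == "__main__":
    print(word_accuracy(["Hello", "w0rld"], ["hello", "world"]))
```

Because a single wrong character makes the whole word wrong, this kind of metric is strict on long or heavily stylized words.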

Results

Method	IC13	SVT	IIIT	IC15	SVTP	CUTE	WordArt	download
CornerTransformer	96.4	94.6	95.9	86.3	91.5	92.0	70.8	model

Visualization

Each example is shown with the results from ABINet-LV, our baseline, and the proposed CornerTransformer. Hard examples are successfully recognized by CornerTransformer.

When decorative patterns in the background have exactly the same appearance as the text and a similar shape, CornerTransformer may fail to produce correct results. Each image is shown with our result and the ground truth.

Citation

Please cite the following paper when using the WordArt dataset or this repo.

@inproceedings{xie2022toward,
  title={Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition},
  author={Xie, Xudong and Fu, Ling and Zhang, Zhifei and Wang, Zhaowen and Bai, Xiang},
  booktitle={ECCV},
  year={2022}
}

Acknowledgement

This repo is based on MMOCR 0.6. We appreciate this wonderful open-source toolbox.
