release eval code and dataset
tyxsspa committed Feb 21, 2024
1 parent d29ce21 commit edb145d
Showing 21 changed files with 1,785 additions and 7 deletions.
43 changes: 38 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
@@ -5,6 +5,7 @@
![sample](docs/sample.jpg "sample")

## 📌News
[2024.02.21] - The evaluation code and dataset(**AnyText-benchmark**) are released.
[2024.02.06] - Happy Lunar New Year Everyone! We've launched a fun app(表情包大师/MeMeMaster) on [ModelScope](https://modelscope.cn/studios/iic/MemeMaster/summary) and [HuggingFace](https://huggingface.co/spaces/martinxm/MemeMaster) to create cute meme stickers. Come and have fun with it!
[2024.01.17] - 🎉AnyText has been accepted by ICLR 2024(**Spotlight**)!
[2024.01.04] - FP16 inference is available, 3x faster! Now the demo can be deployed on GPU with >8GB memory. Enjoy!
@@ -21,7 +22,7 @@ For more AIGC related works of our group, please visit [here](https://github.com
- [ ] Provide a free font file(🤔)
- [ ] Release tools for merging weights from community models or LoRAs
- [ ] Support AnyText in stable-diffusion-webui(🤔)
- [ ] Release AnyText-benchmark dataset and evaluation code
- [x] Release AnyText-benchmark dataset and evaluation code
- [ ] Release AnyWord-3M dataset and training code


@@ -67,14 +68,46 @@ export CUDA_VISIBLE_DEVICES=0 && python demo.py --font_path your/path/to/font/fi
![demo](docs/demo.jpg "demo")
**Please note** that when executing inference for the first time, the model files will be downloaded to: `~/.cache/modelscope/hub`. If you need to modify the download directory, you can manually specify the environment variable: `MODELSCOPE_CACHE`.
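Alternatively, the cache directory can be redirected from Python itself, as long as it happens before ModelScope is imported; a minimal sketch (the target path is illustrative):

```python
import os

# MODELSCOPE_CACHE must be set before `modelscope` is imported,
# otherwise the default ~/.cache/modelscope/hub is used.
# The target directory below is illustrative.
os.environ["MODELSCOPE_CACHE"] = "/data/modelscope_cache"
```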

## 📈Evaluation
### 1. Data Preparation

Download the AnyText-benchmark dataset from [ModelScope](https://modelscope.cn/datasets/iic/AnyText-benchmark/summary) or [GoogleDrive](https://drive.google.com/drive/folders/1Eesj6HTqT1kCi6QLyL5j0mL_ELYRp3GV) and unzip the files. In the *benchmark* folder, *laion_word* and *wukong_word* are the evaluation sets for English and Chinese, respectively. Open each *test1k.json* and set `data_root` to your own path to the *imgs* folder. The *FID* directory contains the images used for calculating the FID (Fréchet Inception Distance) score.
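Editing each *test1k.json* by hand works fine; if you prefer to script it, here is a minimal sketch (it assumes only that the file carries a top-level `data_root` key, as described above; the paths in the usage comment are illustrative):

```python
import json
from pathlib import Path


def set_data_root(json_path: str, imgs_dir: str) -> None:
    """Rewrite the top-level `data_root` of a test1k.json in place."""
    path = Path(json_path)
    content = json.loads(path.read_text(encoding="utf-8"))
    content["data_root"] = str(Path(imgs_dir).resolve())
    path.write_text(json.dumps(content, ensure_ascii=False, indent=4),
                    encoding="utf-8")


# Illustrative paths:
# set_data_root("benchmark/laion_word/test1k.json", "benchmark/laion_word/imgs")
```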

### 2. Generate Images

Before evaluation, we need to generate corresponding images for each method based on the evaluation set. We have also provided [pre-generated images](https://drive.google.com/file/d/1pGN35myilYY04ChFtgAosYr0oqeBy4NU/view?usp=drive_link) for all methods. Follow the instructions below to generate images on your own. Note that you need to modify the paths and other parameters in the bash scripts accordingly.
- AnyText
```bash
bash ./eval/gen_imgs_anytext.sh
```
(If you encounter an error caused by Hugging Face being unreachable, uncomment line 98 of *./models_yaml/anytext_sd15.yaml* and replace the path of the *clip-vit-large-patch14* folder with a local one.)
- ControlNet, Textdiffuser, GlyphControl
We use glyph images rendered from the AnyText-benchmark dataset as conditional input for these methods:
```bash
bash eval/gen_glyph.sh
```
Next, please clone the official repositories of **ControlNet**, **Textdiffuser**, and **GlyphControl**, and follow their documentation to set up the environment, download the respective checkpoints, and ensure that inference can be executed normally. Then, copy the three files `<method>_singleGPU.py`, `<method>_multiGPUs.py`, and `gen_imgs_<method>.sh` from the *./eval* folder to the root directory of the corresponding codebases, and run:
```bash
bash gen_imgs_<method>.sh
```

### 3. Evaluate

We use Sentence Accuracy (Sen. ACC) and Normalized Edit Distance (NED) to evaluate the accuracy of generated text. Please run:
```bash
bash eval/eval_ocr.sh
```
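For reference, these two text metrics are commonly computed as below — a sketch only, with NED taken as 1 minus the length-normalized Levenshtein distance (so higher is better); `eval/eval_ocr.sh` remains the authoritative implementation:

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def sentence_accuracy(preds, gts) -> float:
    """Fraction of predictions that match the ground truth exactly (Sen. ACC)."""
    return sum(p == g for p, g in zip(preds, gts)) / len(gts)


def mean_ned(preds, gts) -> float:
    """Mean of 1 - dist / max_len per pair (one common NED convention)."""
    scores = [1 - levenshtein(p, g) / max(len(p), len(g), 1)
              for p, g in zip(preds, gts)]
    return sum(scores) / len(scores)
```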
We use the FID metric to assess the quality of generated images. Please run:
```bash
bash eval/eval_fid.sh
```
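As an illustrative alternative to the bundled script, FID can also be computed with the standalone `pytorch-fid` package driven from Python. This sketch assumes `pip install pytorch-fid` and that its CLI prints a `FID:  <value>` line; verify both locally before relying on it:

```python
import re
import subprocess
import sys


def parse_fid(output: str) -> float:
    """Pull the score out of pytorch-fid's stdout (assumed format: 'FID:  12.34')."""
    match = re.search(r"FID:\s*([0-9.]+)", output)
    if match is None:
        raise ValueError(f"no FID score found in: {output!r}")
    return float(match.group(1))


def compute_fid(real_dir: str, fake_dir: str) -> float:
    """Shell out to the pytorch-fid CLI on two image folders."""
    result = subprocess.run(
        [sys.executable, "-m", "pytorch_fid", real_dir, fake_dir],
        capture_output=True, text=True, check=True,
    )
    return parse_fid(result.stdout)


# Illustrative usage:
# score = compute_fid("benchmark/laion_word/FID", "./anytext_v1.1_laion_generated")
```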

Compared to existing methods, AnyText has a significant advantage in both English and Chinese text generation.
![eval](docs/eval.jpg "eval")
Please note that we have reorganized the code and have further aligned the configuration for each method under evaluation. As a result, there may be minor numerical differences compared to those reported in the original paper.

## 🌄Gallery
![gallery](docs/gallery.png "gallery")

## Citation
```
Expand Down
1 change: 0 additions & 1 deletion dataset_util.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,3 @@

import json
import pathlib

Expand Down
Binary file modified docs/eval.jpg
118 changes: 118 additions & 0 deletions eval/anytext_multiGPUs.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,118 @@
import os
import shutil
import copy
import argparse
import pathlib
import json


def load(file_path: str):
    file_path = pathlib.Path(file_path)
    func_dict = {'.json': load_json}
    assert file_path.suffix in func_dict
    return func_dict[file_path.suffix](file_path)


def load_json(file_path: str):
    with open(file_path, 'r', encoding='utf8') as f:
        content = json.load(f)
    return content


def save(data, file_path):
    file_path = pathlib.Path(file_path)
    func_dict = {'.json': save_json}
    assert file_path.suffix in func_dict
    return func_dict[file_path.suffix](data, file_path)


def save_json(data, file_path):
    with open(file_path, 'w', encoding='utf-8') as json_file:
        json.dump(data, json_file, ensure_ascii=False, indent=4)


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--model_path",
        type=str,
        default='models/anytext_v1.1.ckpt',
        help='path of model'
    )
    parser.add_argument(
        "--gpus",
        type=str,
        default='0,1,2,3,4,5,6,7',
        help='gpus for inference'
    )
    parser.add_argument(
        "--output_dir",
        type=str,
        default='./anytext_v1.1_laion_generated/',
        help="output path"
    )
    parser.add_argument(
        "--json_path",
        type=str,
        default='/data/vdb/yuxiang.tyx/AIGC/data/laion_word/test1k.json',
        help="json path for evaluation dataset"
    )
    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parse_args()
    ckpt_path = args.model_path
    gpus = args.gpus
    output_dir = args.output_dir
    json_path = args.json_path

    USING_DLC = False
    if USING_DLC:
        json_path = json_path.replace('/data/vdb', '/mnt/data', 1)
        output_dir = output_dir.replace('/data/vdb', '/mnt/data', 1)

    exec_path = './eval/anytext_singleGPU.py'
    continue_gen = True  # if True, keep output_dir and only generate the remaining images
    tmp_dir = './tmp_dir'
    if os.path.exists(tmp_dir):
        shutil.rmtree(tmp_dir)
    os.makedirs(tmp_dir)

    if not continue_gen:
        if os.path.exists(output_dir):
            shutil.rmtree(output_dir)
        os.makedirs(output_dir)
    else:
        if not os.path.exists(output_dir):
            os.makedirs(output_dir)

    os.system('sleep 1')

    gpu_ids = [int(i) for i in gpus.split(',')]
    nproc = len(gpu_ids)
    all_lines = load(json_path)
    split_file = []
    length = len(all_lines['data_list']) // nproc
    cmds = []
    for i in range(nproc):
        # split data_list into nproc contiguous chunks; the last chunk takes the remainder
        start, end = i*length, (i+1)*length
        if i == nproc - 1:
            end = len(all_lines['data_list'])
        temp_lines = copy.deepcopy(all_lines)
        temp_lines['data_list'] = temp_lines['data_list'][start:end]
        tmp_file = os.path.join(tmp_dir, f'tmp_list_{i}.json')
        save(temp_lines, tmp_file)
        os.system('sleep 1')
        cmds += [f'export CUDA_VISIBLE_DEVICES={gpu_ids[i]} && python {exec_path} --input_json {tmp_file} --output_dir {output_dir} --ckpt_path {ckpt_path} && echo proc-{i} done!']
    # launch one worker per GPU in parallel
    cmds = ' & '.join(cmds)
    os.system(cmds)
    print('Done.')
    os.system('sleep 2')
    shutil.rmtree(tmp_dir)

'''
command to kill the task after running:
$ps -ef | grep singleGPU | awk '{ print $2 }' | xargs kill -9 && ps -ef | grep multiproce | awk '{ print $2 }' | xargs kill -9
'''
