Talk2Face: A Unified Sequence-based Framework for Diverse Face Generation and Analysis Tasks

ACM MM 2022

Yudong Li, Xianxu Hou, Zhe Zhao, Linlin Shen, Xuefeng Yang and Kimmo Yan

Useage

Our implementation is based on TencentPretrain.

git clone https://github.com/Tencent/TencentPretrain.git
cd TencentPretrain

Download the pre-trained VQGAN and Talk2face weights. Install the VQGAN requirements.

pip install taming-transformers-rom1504

There are three modes for Talk2face to use shared model weights, namely {to_attributes}, {to_caption} and {to_image}.

To image

Given a text description, ask the model to generate the corresponding face images. Write a text description to a file, e.g. "This guy doesn't have any beard and has no fringe. He is in his middle age and has no smile." to text.txt.

python3 scripts/generate_talk2face.py --load_model_path talk2face.bin \
                    --config_path models/talk2face/base_config.json \
                    --vocab_path models/google_uncased_en_vocab.txt \
                    --vqgan_config_path ffhq_transformer/configs/2021-04-23T18-19-01-project.yaml \
                    --vqgan_model_path ffhq_transformer/checkpoints/last.ckpt \
                    --prompt to_image --text_prefix_path text.txt

Note that the results presented in the paper were filtered by CLIP. In this project, multiple results are generated according to the parameter --samples_num. Users can choose with the help of CLIP or manually as the final output.

To caption

Given a face image, ask the model to generate the text descriptions.

python3 scripts/generate_talk2face.py --load_model_path talk2face.bin \
                    --config_path models/talk2face/base_config.json \
                    --vocab_path models/google_uncased_en_vocab.txt \
                    --vqgan_config_path ffhq_transformer/configs/2021-04-23T18-19-01-project.yaml \
                    --vqgan_model_path ffhq_transformer/checkpoints/last.ckpt \
                    --prompt to_caption --image_prefix_path img.jpg

To attributes

Given a face image and an attribute, ask the model to predict the value of the attribute. Write an attribute to a file, e.g. "gender: " to attribute.txt. The model will predict the value as male or female.

python3 scripts/generate_talk2face.py --load_model_path talk2face.bin \
                    --config_path models/talk2face/base_config.json \
                    --vocab_path models/google_uncased_en_vocab.txt \
                    --vqgan_config_path ffhq_transformer/configs/2021-04-23T18-19-01-project.yaml \
                    --vqgan_model_path ffhq_transformer/checkpoints/last.ckpt \
                    --prompt to_attributes --image_prefix_path img.jpg \
                    --text_prefix_path attr.txt

BibTeX

@inproceedings{li2022talk2face,
  title={Talk2Face: A Unified Sequence-based Framework for Diverse Face Generation and Analysis Tasks},
  author={Li, Yudong and Hou, Xianxu and Zhao, Zhe and Shen, Linlin and Yang, Xuefeng and Yan, Kimmo},
  booktitle={Proceedings of the 30th ACM International Conference on Multimedia},
  pages={4594--4604},
  year={2022}
}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Talk2Face: A Unified Sequence-based Framework for Diverse Face Generation and Analysis Tasks

Useage

To image

To caption

To attributes

BibTeX

About

Releases

Packages

ydli-ai/Talk2Face

Folders and files

Latest commit

History

Repository files navigation

Talk2Face: A Unified Sequence-based Framework for Diverse Face Generation and Analysis Tasks

Useage

To image

To caption

To attributes

BibTeX

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages