[ECCV 2024] HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting.

ZhenglinZhou/HeadStudio

HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting

Zhenglin Zhou · Fan Ma · Hehe Fan · Zongxin Yang · Yi Yang*

ReLER, CCAI, Zhejiang University

*corresponding author

Text to Head Avatars Generation

Text-based generation of animatable head avatars with HeadStudio.

Installation

All of the following steps have been tested successfully with CUDA 11.8.

# clone the github repo
git clone https://github.com/zhenglinzhou/HeadStudio-open.git
cd HeadStudio-open

Create a conda environment:

# make a new conda env (optional)
conda create -n headstudio python=3.9
conda activate headstudio

Install the dependencies (this may take some time):

# install necessary packages
pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118

# install some packages using conda
bash packages.sh

# install packages using pip
pip install -r requirements.txt

# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
  • HeadStudio is built on FLAME. Before you continue, please register at https://flame.is.tue.mpg.de and agree to the license.
  • Download FLAME 2020, which contains FLAME_FEMALE.pkl, FLAME_GENERIC.pkl, and FLAME_MALE.pkl, from https://flame.is.tue.mpg.de.
  • Download the other checkpoints and the training/validation files from here.
  • Arrange the folders like this:
.
|-ckpts
    |-ControlNet-Mediapipe
        |-flame2facemsh.npy
        |-mediapipe_landmark_embedding.npz
    |-FLAME-2000
        |-FLAME_FEMALE.pkl
        |-FLAME_GENERIC.pkl
        |-FLAME_MALE.pkl
        |-flame_static_embeddings.pkl
        |-flame_dynamic_embeddings.pkl
|-talkshow
    # for training with animation
    |-collection
        |-cemistry_exp.npy
    # for evaluation
    |-ExpressiveWholeBodyDatasetReleaseV1.0
...
  • Specify the talkshow_train_path and talkshow_val_path in ./configs/headstudio.yaml.
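
Before training, it can help to confirm that the downloaded files are where the layout above expects them. The script below is a minimal sketch and not part of HeadStudio: the helper name find_missing is hypothetical, it assumes you run it from the repository root, and EXPECTED holds only a subset of the paths from the layout above, so extend it with the remaining files you need.

```python
from pathlib import Path

# A subset of the relative paths from the folder layout above.
# Extend this list with any other files your setup requires.
EXPECTED = [
    "ckpts/ControlNet-Mediapipe/flame2facemsh.npy",
    "ckpts/ControlNet-Mediapipe/mediapipe_landmark_embedding.npz",
    "ckpts/FLAME-2000/FLAME_FEMALE.pkl",
    "ckpts/FLAME-2000/FLAME_GENERIC.pkl",
    "ckpts/FLAME-2000/flame_static_embeddings.pkl",
    "ckpts/FLAME-2000/flame_dynamic_embeddings.pkl",
    "talkshow/collection/cemistry_exp.npy",
    "talkshow/ExpressiveWholeBodyDatasetReleaseV1.0",
]


def find_missing(root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumption: the current directory is the repository root.
    missing = find_missing(".")
    if missing:
        print("Missing files or folders:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All expected files found.")
```

The check only tests for existence; it does not validate file contents or the paths set in ./configs/headstudio.yaml.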

Usage

python3 launch.py \
--config configs/headstudio.yaml --train system.prompt_processor.prompt='a DSLR portrait of Joker in DC, masterpiece, Studio Quality, 8k, ultra-HD, next generation' \
system.guidance.use_nfsd=True system.max_grad=0.001 system.area_relax=True

More examples can be found in ./scripts/headstudio.sh
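
To generate several avatars in a row, the command above can be assembled programmatically. This is a hedged sketch rather than a HeadStudio utility: build_launch_command is a hypothetical helper, and it simply reproduces the key=value override style of the command shown above. It prints the commands by default; uncomment the subprocess call to actually launch training.

```python
import shlex
import subprocess


def build_launch_command(prompt, overrides=None, config="configs/headstudio.yaml"):
    """Build the argument list for launch.py with key=value config overrides."""
    cmd = [
        "python3", "launch.py",
        "--config", config,
        "--train",
        f"system.prompt_processor.prompt={prompt}",
    ]
    # Each override becomes one "dotted.key=value" argument, as in the command above.
    for key, value in (overrides or {}).items():
        cmd.append(f"{key}={value}")
    return cmd


if __name__ == "__main__":
    # Override values copied from the usage example above.
    overrides = {
        "system.guidance.use_nfsd": True,
        "system.max_grad": 0.001,
        "system.area_relax": True,
    }
    prompts = [
        "a DSLR portrait of Joker in DC, masterpiece, Studio Quality, 8k, "
        "ultra-HD, next generation",
    ]
    for prompt in prompts:
        cmd = build_launch_command(prompt, overrides)
        print(shlex.join(cmd))  # shell-quoted form, for inspection
        # subprocess.run(cmd, check=True)  # uncomment to start training
```

Passing the command as a list avoids shell quoting problems with prompts that contain spaces or commas.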

Prepare Animation Data

  1. Install TalkSHOW. It is best to use a separate Python environment for the animation steps below, since TalkSHOW needs Python 3.7. Remember to install torchaudio~=0.13.1 and torchvision~=0.14.1.

  2. Download SHOW_dataset_v1.0.zip following this.

Animation

Video-based Animation

Animate the avatar using a .pkl file captured from a video clip (from SHOW_dataset_v1.0.zip).

python3 animation.py

Audio-based Animation

  • Copy ./scripts/demo.py into the TalkSHOW folder.
  • Set save_root in demo.py.
  • Given an audio clip, generate FLAME sequences with TalkSHOW as shown below, replacing path-to-wav-file with the path to your audio file.
cd TalkSHOW
python3 demo.py \
--config_file ./config/body_pixel.json --infer --audio_file path-to-wav-file \
--id 0 --only_face
  • Animate the avatar using the FLAME sequences generated by TalkSHOW.
python3 animation_TalkSHOW.py --audio path-to-audio --avatar path-to-avatar
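
The two commands above can be chained from a single script. The following is a rough sketch under several assumptions, not an official pipeline: audio_animation_steps and run_steps are hypothetical helpers, the TalkSHOW checkout is assumed to sit in a folder named TalkSHOW next to the HeadStudio scripts, and animation_TalkSHOW.py is assumed to be run from the HeadStudio root. Use absolute paths for the audio and avatar arguments, because the first command runs inside the TalkSHOW folder.

```python
import subprocess


def audio_animation_steps(wav_file, audio, avatar, talkshow_dir="TalkSHOW"):
    """Return (working_dir, argv) pairs for the two commands shown above."""
    generate = (
        str(talkshow_dir),
        [
            "python3", "demo.py",
            "--config_file", "./config/body_pixel.json",
            "--infer",
            "--audio_file", str(wav_file),
            "--id", "0",
            "--only_face",
        ],
    )
    animate = (
        ".",  # assumption: run from the HeadStudio repository root
        [
            "python3", "animation_TalkSHOW.py",
            "--audio", str(audio),
            "--avatar", str(avatar),
        ],
    )
    return [generate, animate]


def run_steps(steps, dry_run=True):
    """Print each step; execute it as well when dry_run is False."""
    for cwd, argv in steps:
        print(f"[{cwd}] {' '.join(argv)}")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Placeholder paths; replace them with your own absolute paths.
    steps = audio_animation_steps(
        wav_file="/abs/path/to/clip.wav",
        audio="/abs/path/to/clip.wav",
        avatar="/abs/path/to/avatar",
    )
    run_steps(steps, dry_run=True)
```

With dry_run=True the script only prints the planned commands, which is a cheap way to check the paths before starting the slower TalkSHOW inference.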

Text-based Animation

  • Generate audio from the given text using PlayHT.
  • Then follow the audio-based animation steps with the generated audio.

Acknowledgements

  • HeadStudio is developed by ReLER at Zhejiang University. All rights reserved.
  • Thanks to Duochao and Xuancheng for fixing bugs and further developing this work.
  • Thanks to PlayHT, which we use for text-to-audio generation.
  • Thanks to TalkSHOW, which we use for audio-driven avatar animation.
  • Thanks to threestudio, GaussianAvatars, HumanGaussian, and TADA; this work is built on these amazing research works.

Notes

  • If you have questions or find bugs, feel free to open an issue or email the first author (zhenglinzhou@zju.edu.cn)!
  • If you encounter "RuntimeError: an illegal memory access was encountered" or "numel: integer multiplication overflow" errors during rasterization, try reinstalling diff-gaussian-rasterization with the -fno-gnu-unique flag. For more details, look here.

Cite

If you find HeadStudio useful for your research and applications, please cite us using this BibTeX:

@inproceedings{zhou2024headstudio,
  title = {HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting},
  author = {Zhenglin Zhou and Fan Ma and Hehe Fan and Zongxin Yang and Yi Yang},
  booktitle = {ECCV},
  year={2024},
}
