AMG: Avatar Motion Guided Video Generation

Zhangsihao Yang1 · Mengyi Shan2 · Mohammad Farazi1 · Wenhui Zhu1 · Yanxi Chen1 · Xuanzhao Dong1 · Yalin Wang1
1Arizona State University     2University of Washington
Paper PDF | Project Page

Human video generation is a challenging task due to the complexity of human body movements and the need for photorealism. While 2D methods excel in realism, they lack 3D control, and 3D avatar-based approaches struggle with seamless background integration. We introduce AMG, a method that merges 2D photorealism with 3D control by conditioning video diffusion models on 3D avatar renderings. AMG enables multi-person video generation with precise control over camera positions, human motions, and background style, outperforming existing methods in realism and adaptability.

0. Getting Started

0.1 Setting Up the vgen Virtual Environment

Two Methods

Method 1: With sudo Privileges

Follow the instructions from vgen.

You may also need to install diffusers and a few other packages; see the note after the commands below.

conda create -n vgen python=3.8
conda activate vgen

pip install torch==1.12.0+cu113 torchvision==0.13.0+cu113 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu113

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

sudo apt-get update && sudo apt-get install -y ffmpeg libsm6 libxext6
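If imports still fail after these steps, installing diffusers and ffmpeg-python explicitly usually resolves it. The pinned version below mirrors the one used in Method 2 of this README and is an assumption for this setup:

pip install diffusers==0.23.0 ffmpeg-python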

Method 2: Using slurm and mamba

For Slurm-based clusters (using mamba or conda), adapted from the VGen instructions.

module load mamba/latest

module load cuda-11.3.1-gcc-12.1.0

mamba create -p /scratch/<user_id>/envs/vgen python=3.8 -y
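Activate the new environment before installing anything into it (a minimal sketch; the exact activation command may differ on your cluster, e.g. some setups require `source activate` instead):

mamba activate /scratch/<user_id>/envs/vgen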

mkdir -p /scratch/<user_id>/envs/vgen/opt
cd /scratch/<user_id>/envs/vgen/opt

git clone https://github.com/ali-vilab/VGen.git
cd VGen

pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

pip install torch==2.2.0 torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113

python -m pip install diffusers==0.23.0
python -m pip install ffmpeg-python

0.2 Download Initialization Models

Download model.zip to _runtime and unzip it.
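A minimal sketch of this step, assuming model.zip was downloaded into the repository root (the unzip destination is an assumption; adjust if the archive already contains the expected folder layout):

mkdir -p _runtime
unzip model.zip -d _runtime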

1. Inference

1.1 Preparation

  1. Download weights

The weights for inference can be downloaded from here (5.28 GB).

  2. Install the amg package

Run the following command to install the amg package:

pip install -e .
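To verify the editable install, a quick import check (the top-level module name amg is assumed from the package name above):

python -c "import amg; print(amg.__file__)"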

1.2 Try it out!

Once you have finished the steps above, you can try any of the following examples:

  1. change background
  2. move camera
  3. change motion
1.2.1 Change Background

Run the command below to generate the change-background results:

python applications/change_background.py --cfg configs/applications/change_background/demo.yaml

The results are stored under the newly created folder _demo_results/change_background. You should see exactly the same results as the following:
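To quickly inspect the generated files (assuming the command was run from the repository root):

ls _demo_results/change_background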

(GIF demo: Input | Reference | Generated)
1.2.2 Move Camera

Run the command below to generate the move-camera results:

python applications/move_camera.py --cfg configs/applications/move_camera/demo.yaml

The results are stored under the newly created folder _demo_results/move_camera. You should see exactly the same results as the following:

(GIF demo: Input | Generated)
1.2.3 Change Motion

Run the command below to generate the change-motion results:

python applications/change_motion.py --cfg configs/applications/change_motion/demo.yaml

The results are stored under the newly created folder _demo_results/change_motion. You should see exactly the same results as the following:

(GIF demo: Generated | Input)

2. Training

2.1 Download Data

  1. Fill out this Google form to request the processed dataset used in the paper.

  2. Put the downloaded data under _data.
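A minimal sketch of this step, assuming the dataset arrives as a single archive; the file name amg_data.zip is hypothetical and should be replaced with the file you actually receive:

mkdir -p _data
unzip amg_data.zip -d _data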

2.2 Start Training

python train_net_ddp.py --cfg configs/train.yaml
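The script name suggests DistributedDataParallel training. If it does not spawn worker processes itself, launching it with torchrun is one option; the launcher and the GPU count below are assumptions, not part of the original instructions:

torchrun --nproc_per_node=4 train_net_ddp.py --cfg configs/train.yaml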

3. Folder Structure

  • configs
  • demo_data