ReGenNet: Towards Human Action-Reaction Synthesis

This repository contains the official implementation of the following paper:

ReGenNet: Towards Human Action-Reaction Synthesis
Liang Xu1,2, Yizhou Zhou3, Yichao Yan1, Xin Jin2, Wenhan Zhu, Fengyun Rao3, Xiaokang Yang1, Wenjun Zeng2
1 Shanghai Jiao Tong University 2 Eastern Institute of Technology, Ningbo 3 WeChat, Tencent Inc.

News

  • [2024.07.14] We release the training and evaluation code, and the trained models.
  • [2024.03.18] We release the paper and project page of ReGenNet.

Framework

Installation

  1. First, clone the repository with the following commands:

    git clone https://github.com/liangxuy/ReGenNet.git
    cd ReGenNet
    
  2. Set up the environment

    1. Set up the conda environment with the following commands:
    • Install ffmpeg (if not already installed)
      sudo apt update
      sudo apt install ffmpeg
      
    • Set up the conda environment
      conda env create -f environment.yml
      conda activate regennet
      python -m spacy download en_core_web_sm
      pip install git+https://github.com/openai/CLIP.git
      
    • Install mpi4py (required for multi-GPU training)
      sudo apt-get install libopenmpi-dev openmpi-bin
      pip install mpi4py
      

    We also provide a Dockerfile (docker/Dockerfile) if you prefer to build your own Docker environment.

  3. Download other required files

    • You can download the pretrained models from Google Drive and move them to the save folder to reproduce the results.

    • You need to download the action recognition models from Google Drive and move them to the recognition_training folder for evaluation.

    • Download the SMPL neutral models from the SMPL website and the SMPL-X models from the SMPL-X website, and then move them to body_models/smpl and body_models/smplx, respectively. We also provide a copy here for convenience.
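
    If you want to verify that the body models are in place, the short Python sketch below can help. It assumes the smplx Python package is available in the environment (it is commonly used together with these body models); adjust the path if you store the models elsewhere.

      # Minimal sanity check for the body model layout described above.
      # Assumes the `smplx` Python package is installed.
      import smplx

      # smplx.create looks for SMPLX_NEUTRAL.* under <model_path>/smplx/
      # and SMPL_NEUTRAL.* under <model_path>/smpl/.
      smplx_model = smplx.create("body_models", model_type="smplx", gender="neutral")
      smpl_model = smplx.create("body_models", model_type="smpl", gender="neutral")
      print(smplx_model)
      print(smpl_model)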

Data Preparation

NTU RGB+D 120

Since the license of the NTU RGB+D 120 dataset does not allow us to distribute its data and annotations, we cannot release the processed NTU RGB+D 120 dataset publicly. If you are interested in the processed data, please email me.

Chi3D

You can download the original dataset here and the actor-reactor order annotations here.

You can also download the processed dataset from Google Drive and put it under the dataset/chi3d folder.

InterHuman

You can download the original dataset here and the actor-reactor order annotations here, and put them under the dataset/interhuman folder.
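
If you want a quick look at what a processed .h5 file contains, the Python sketch below simply lists its groups and dataset shapes. The file path is illustrative (it follows the file names used in the training commands below); nothing is assumed about the internal key names, since they depend on the released preprocessing.

    # Inspect a processed dataset file (minimal sketch; key names are not
    # assumed here -- this only prints whatever the file actually contains).
    import h5py

    with h5py.File("dataset/chi3d/chi3d_smplx_train.h5", "r") as f:
        f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))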

Training

We provide scripts to train the model for the online and unconstrained setting of human action-reaction synthesis on the NTU120-AS dataset. The --arch, --unconstrained, and --dataset flags can be customized for different settings.

  • Training with 1 GPU:

    # NTU RGB+D 120 Dataset
    python -m train.train_mdm --setting cmdm --save_dir save/cmdm/ntu_smplx --dataset ntu --cond_mask_prob 0 --num_person 2 --layers 8 --num_frames 60 --arch online --overwrite --pose_rep rot6d --body_model smplx --data_path PATH/TO/xsub.train.h5 --train_platform_type TensorboardPlatform --vel_threshold 0.03 --unconstrained
    
    # Chi3D dataset
    python -m train.train_mdm --setting cmdm --save_dir save/cmdm/chi3d_smplx --dataset chi3d --cond_mask_prob 0 --num_person 2 --layers 8 --num_frames 150 --arch online --overwrite --pose_rep rot6d --body_model smplx --data_path PATH/TO/chi3d_smplx_train.h5 --train_platform_type TensorboardPlatform --vel_threshold 0.01 --unconstrained
    
  • Training with multiple GPUs (4 GPUs in the example):

    mpiexec -n 4 --allow-run-as-root python -m train.train_mdm --setting cmdm --save_dir save/cmdm/ntu_smplx --dataset ntu --cond_mask_prob 0 --num_person 2 --layers 8 --num_frames 60 --arch online --overwrite --pose_rep rot6d --body_model smplx --data_path PATH/TO/xsub.train.h5 --train_platform_type TensorboardPlatform --vel_threshold 0.03 --unconstrained
    

Evaluation

For the action recognition model used in evaluation, you can either:

  1. Directly download the trained action recognition model here;

  2. Train your own action recognition model:

    The code for training the action recognition model is based on the ACTOR repository.

    Commands for training your own action recognition model:
    cd actor-x;
    # Before training, you need to set up the `dataset` and the folder of the `SMPL-X models`
    ### NTU RGB+D 120 ###
    python -m src.train.train_stgcn --dataset ntu120_2p_smplx --pose_rep rot6d --num_epochs 100 --snapshot 10 --batch_size 64 --lr 0.0001 --num_frames 60 --sampling conseq --sampling_step 1 --glob --translation --folder recognition_training/ntu_smplx --datapath dataset/ntu120/smplx/conditioned/xsub.train.h5 --num_person 2 --body_model smplx
    
    ### Chi3D ###
    python -m src.train.train_stgcn --dataset chi3d --pose_rep rot6d --num_epochs 100 --snapshot 10 --batch_size 64 --lr 0.0001 --num_frames 150 --sampling conseq --sampling_step 1 --glob --translation --folder recognition_training/chi3d_smplx --datapath dataset/chi3d/smplx/conditioned/chi3d_smplx_train.h5 --num_person 2 --body_model smplx

The following script evaluates the trained model at PATH/TO/model_XXXX.pt, where --rec_model_path points to the action recognition model. The results will be written to PATH/TO/evaluation_results_XXXX_full.yaml. We use ddim5 (5 DDIM sampling steps) to accelerate the evaluation process.

python -m eval.eval_cmdm --model PATH/TO/model_XXXX.pt --eval_mode full --rec_model_path PATH/TO/checkpoint_0100.pth.tar --use_ddim --timestep_respacing ddim5

If you want a table with means and confidence intervals, you can use this script:

python -m eval.easy_table PATH/TO/evaluation_results_XXXX_full.yaml
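
If you prefer to inspect the results programmatically instead of through eval.easy_table, a small Python sketch like the one below can load the YAML file. The metric names stored inside depend on eval.eval_cmdm and are not assumed here; this just prints whatever the file contains (UnsafeLoader is used in case the file stores numpy objects).

    # Load and print the evaluation results YAML (minimal sketch).
    import yaml

    with open("PATH/TO/evaluation_results_XXXX_full.yaml", "r") as f:
        results = yaml.load(f, Loader=yaml.UnsafeLoader)
    print(results)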

Motion Synthesis and Visualization

  1. Generate the results; they will be saved to results.npy. (A small script for inspecting this file is shown at the end of this section.)

    python -m sample.cgenerate --model_path PATH/TO/model_XXXX.pt --action_file assets/action_names_XXX.txt --num_repetitions 10 --dataset ntu --body_model smplx --num_person 2 --pose_rep rot6d --data_path PATH/TO/xsub.test.h5 --output_dir XXX
    
  2. Render the results

    Install the additional rendering dependencies:

    pip install trimesh
    pip install pyrender
    pip install imageio-ffmpeg
    
    python -m render.crendermotion --data_path PATH/TO/results.npy --num_person 2 --setting cmdm --body_model smplx
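
As a quick sanity check on the generated file from step 1, the following Python sketch prints the keys and array shapes stored in results.npy. It assumes the file holds a pickled dictionary, as is common in this family of motion diffusion codebases; the exact keys are not assumed.

    # Inspect the generated results.npy (minimal sketch).
    import numpy as np

    data = np.load("results.npy", allow_pickle=True).item()
    for key, value in data.items():
        shape = getattr(value, "shape", None)
        print(key, shape if shape is not None else type(value))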
    

TODO

  • Release the training and evaluation code, and the trained models.
  • Release the annotation results.

Acknowledgments

We want to thank the following projects that our code is based on:

ACTOR, motion diffusion model, guided diffusion, text-to-motion, HumanML3D

License

This code is distributed under the MIT LICENSE.

Note that our code depends on other libraries, including CLIP, SMPL, SMPL-X, and PyTorch3D, and uses datasets that each have their own licenses that must also be followed.

Citation

If you find ReGenNet useful for your research, please cite us:

@inproceedings{xu2024regennet,
  title={ReGenNet: Towards Human Action-Reaction Synthesis},
  author={Xu, Liang and Zhou, Yizhou and Yan, Yichao and Jin, Xin and Zhu, Wenhan and Rao, Fengyun and Yang, Xiaokang and Zeng, Wenjun},
  booktitle={CVPR},
  pages={1759--1769},
  year={2024}
}
