BeLFusion

Latent Diffusion for Behavior-Driven Human Motion Prediction (ICCV'23)

This repository contains the official PyTorch implementation of the paper:

BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction
German Barquero, Sergio Escalera, and Cristina Palmero
ICCV 2023
[website] [paper] [demo]

Note: our data loaders consider an extra dimension for the number of people in the scene. Since the project aims at single-human motion prediction, this dimension is always 1.

Installation

1. Environment

OPTION 1 - Python/conda environment

conda create -n belfusion python=3.9.5
conda activate belfusion
pip install -r requirements.txt

OPTION 2 - Docker

We also provide a DockerFile to build a Docker image with all the required dependencies.

IMPORTANT: This option will not let you launch the visualization script, as it requires a GUI. You will be able though to train and evaluate the models.

To build and launch the Docker image, run the following commands from the root of the repository:

docker build . -t belfusion
docker run -it --gpus all --rm --name belfusion \
-v ${PWD}:/project \
belfusion

You should now be in the container, ready to run the code.

2. Datasets

> Human3.6M

Extract the Poses-D3Positions* folders for S1, S5, S6, S7, S8, S9, S11 into ./datasets/Human36M. Then, run:

python -m data_loader.parsers.h36m

> AMASS

Download the SMPL+H G files for 22 datasets: ACCAD, BMLhandball, BMLmovi, BMLrub, CMU, DanceDB, DFaust, EKUT, EyesJapanDataset, GRAB, HDM05, HUMAN4D, HumanEva, KIT, MoSh, PosePrior (MPI_Limits), SFU, SOMA, SSM, TCDHands, TotalCapture, and Transitions. Then, move the tar.bz2 files to ./datasets/AMASS (DO NOT extract them).

Now, download the 'DMPLs for AMASS' from here, and the 'Extended SMPL+H model' from here. Move both extracted folders (dmpls, smplh) to ./auxiliar/body_models. Then, run:

python -m data_loader.parsers.amass --gpu

Note 1: remove the --gpu flag if you do not have a GPU.

Note 2: this step could take a while (~2 hours in CPU, ~20-30 minutes in GPU).

3. Checkpoints (link)

Replace the folder 'checkpoints' in the root of the repository with the downloaded one. If you want to train the models from scratch, you can skip this step and go to the training section.

Evaluation

Run the following scripts to evaluate BeLFusion and the other state-of-the-art methods.

Human3.6M:

# BeLFusion 
python eval_belfusion.py -c checkpoints/ours/h36m/BeLFusion/final_model/ -i 217 --ema --mode stats --batch_size 512

# Baselines --> {ThePoseKnows, DLow, GSPS, DiverseSampling}
python eval_baseline.py -c checkpoints/baselines/h36m/<BASELINE_NAME>/exp -m stats --batch_size 512

AMASS:

# BeLFusion
python eval_belfusion.py -c checkpoints/ours/amass/BeLFusion/final_model/ -i 1262 --multimodal_threshold 0.4 --ema --mode stats --batch_size 512

# Baselines --> {ThePoseKnows, DLow, GSPS, DiverseSampling}
python eval_baseline.py -c checkpoints/baselines/amass/<BASELINE_NAME>/exp -m stats --batch_size 512 --multimodal_threshold 0.4

Add --stats_mode all to also compute the MMADE, MMFDE (increased computation time).
Add -cpu to run the evaluation in CPU (recommended for low-memory GPUs).
(only for BeLFusion) Use --dstride S to compute the evaluation metrics every S denoising steps (increased computation time). If S=10, the metrics will be computed for step 1 (BeLFusion_D), and 10 (BeLFusion).

Visualization

Run the following scripts to visualize the results of BeLFusion and the other state-of-the-art methods (<DATASET> in {h36m, amass}).

# BeLFusion with Human3.6M (press '0' to visualize BeLFusion_D)
python eval_belfusion.py -c checkpoints/ours/h36m/BeLFusion/final_model/ -i 217 --ema --mode vis --batch_size 64 --dstride 10

# BeLFusion with AMASS (press '0' to visualize BeLFusion_D)
python eval_belfusion.py -c checkpoints/ours/amass/BeLFusion/final_model/ -i 1262 --ema --mode vis --batch_size 64 --dstride 10

# Baselines --> {ThePoseKnows, DLow, GSPS, DiverseSampling}
python eval_baseline.py -c checkpoints/baselines/<DATASET>/<BASELINE_NAME>/exp -m vis --batch_size 64

Press n to navigate between the samples.
Set --samples N to generate N samples. Set the columns in the visualization grid with --ncols N.
During visualization, press h to show only the future motion (without observation).
(only for BeLFusion) When --dstride S for S != -1, you can visualize the output of BeLFusion every S denoising steps (press keys 0, 1, 2, ..., to navigate from 1, 1+S, 1+2S, ...).

Note: Replace --mode vis with --mode gen to generate the gif animations instead of visualizing them. In this mode, set the argument --store_idx I to store the gifs for denoising step I. For example, set I to 1 for BeLFusion_D's outputs.

Training

For training BeLFusion from scratch, you need to first train the Behavioral Latent Space (BLS) and the observation autoencoder (<DATASET> in {h36m, amass}). Both models can be trained in parallel:

# Observation autoencoder --> 500 epochs
python train_auto.py -c checkpoints/ours/<DATASET>/BeLFusion/final_model/autoencoder_obs/config.json

# BLS --> 2x500 epochs
python train_bls.py -c checkpoints/ours/<DATASET>/BeLFusion/final_model/behavioral_latent_space/config.json

Once they finish, you can train the Latent Diffusion Model (LDM):

# BeLFusion --> 217/1262 epochs for H36M/AMASS
python train_belfusion.py -c checkpoints/ours/<DATASET>/BeLFusion/final_model/config.json

Citation

If you find our work useful in your research, please consider citing our paper:

@inproceedings{barquero2023belfusion,
  title={BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction},
  author={Barquero, German and Escalera, Sergio and Palmero, Cristina},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2023}
}

License

The software in this repository is freely available for free non-commercial use (see license for further details).

Note 1: project structure borrowed from @victoresque's template.

Note 2: code under ./models/sota is based on the original implementations of the corresponding papers (Dlow, DiverseSampling, and GSPS).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

BeLFusion

Latent Diffusion for Behavior-Driven Human Motion Prediction (ICCV'23)

Installation

1. Environment

2. Datasets

> Human3.6M

> AMASS

3. Checkpoints (link)

Evaluation

Visualization

Training

Citation

License

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
assets		assets
auxiliar		auxiliar
base		base
checkpoints/ours		checkpoints/ours
data_loader		data_loader
logger		logger
metrics		metrics
models		models
trainer		trainer
utils		utils
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
eval_baseline.py		eval_baseline.py
eval_belfusion.py		eval_belfusion.py
parse_config.py		parse_config.py
requirements.txt		requirements.txt
train_auto.py		train_auto.py
train_belfusion.py		train_belfusion.py
train_bls.py		train_bls.py

License

BarqueroGerman/BeLFusion

Folders and files

Latest commit

History

Repository files navigation

BeLFusion

Latent Diffusion for Behavior-Driven Human Motion Prediction (ICCV'23)

Installation

1. Environment

2. Datasets

> Human3.6M

> AMASS

3. Checkpoints (link)

Evaluation

Visualization

Training

Citation

License

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages