Created by Jinglin Xu, Yijie Guo, Yuxin Peng
This repository contains the PyTorch implementation for FinePOSE. (CVPR 2024, Highlight)
Make sure you have the following dependencies installed (python):
- pytorch >= 0.4.0
- matplotlib=3.1.0
- einops
- timm
- tensorboard
- CLIP
pip install git+https://github.com/openai/CLIP.git
You also need MATLAB installed if you want to evaluate our model on the MPI-INF-3DHP dataset.
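To confirm the Python dependencies above are importable before launching a run, a small check like the following may help. It is only a sketch: the import names (`torch` for pytorch, `clip` for the CLIP package) are the usual ones for these packages, and `find_missing` is a hypothetical helper, not part of this repository. It does not check version numbers.

```python
import importlib.util

# Import names assumed for the dependencies listed above.
REQUIRED_MODULES = ["torch", "matplotlib", "einops", "timm", "tensorboard", "clip"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```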
Our model is evaluated on the Human3.6M and MPI-INF-3DHP datasets.
We set up the Human3.6M dataset in the same way as VideoPose3D. You can download the processed data from here:
- data_2d_h36m_gt.npz is the ground truth of 2D keypoints.
- data_2d_h36m_cpn_ft_h36m_dbb.npz is the 2D keypoints obtained by CPN.
- data_3d_h36m.npz is the ground truth of 3D human joints.
Put them in the ./data directory.
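A quick check that the three Human3.6M files are in place can save a failed run. The sketch below uses only the file names and the ./data location given above; `missing_data_files` is a hypothetical helper, not something provided by this repository.

```python
from pathlib import Path

# File names as listed in this README.
H36M_FILES = [
    "data_2d_h36m_gt.npz",
    "data_2d_h36m_cpn_ft_h36m_dbb.npz",
    "data_3d_h36m.npz",
]


def missing_data_files(data_dir="./data", names=H36M_FILES):
    """Return the expected file names that are not present in data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    print("Missing files:", ", ".join(missing) if missing else "none")
```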
We set up the MPI-INF-3DHP dataset following P-STMO. However, our training/testing data is different from theirs. They train and evaluate on 3D poses scaled to the height of the universal skeleton used by Human3.6M (officially called "univ_annot3"), while we use the ground truth 3D poses (officially called "annot3"). The former does not guarantee that the reprojection (used by the proposed JPMA) of the rescaled 3D poses is consistent with the 2D inputs, while the latter does. You can download our processed data from here. Put them in the ./data directory.
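The reprojection issue can be illustrated with a toy example. The sketch below assumes a simple pinhole camera and models the rescaling as a uniform scale about the root joint; both are simplifications for illustration only, the exact "univ_annot3" normalization may differ, and the focal length, principal point, and joint coordinates are made up.

```python
def project(points, focal=1000.0, center=(500.0, 500.0)):
    """Pinhole projection of camera-space 3D points (x, y, z) to pixel coordinates."""
    return [(focal * x / z + center[0], focal * y / z + center[1]) for x, y, z in points]


def scale_about_root(points, scale):
    """Uniformly scale a skeleton about its first joint (treated as the root)."""
    rx, ry, rz = points[0]
    return [
        (rx + scale * (x - rx), ry + scale * (y - ry), rz + scale * (z - rz))
        for x, y, z in points
    ]


# Two hypothetical joints in millimetres: a root and one other joint.
pose = [(0.0, 0.0, 4000.0), (200.0, -500.0, 4200.0)]

original_2d = project(pose)
rescaled_2d = project(scale_about_root(pose, 1.1))

# The root projects to the same pixel, but the other joint moves,
# so the rescaled pose no longer matches the original 2D keypoints.
print(original_2d[1], rescaled_2d[1])
```

Under these assumptions, a pose rescaled in 3D reprojects to different 2D locations than the ones observed in the image, which is why the unscaled ground truth poses are the consistent choice when reprojection is used.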
To evaluate our FinePOSE with JPMA on Human3.6M using the 2D keypoints obtained by CPN as inputs, please run:
python main.py -k cpn_ft_h36m_dbb -c checkpoint/model_h36m -gpu 0,1 --nolog --evaluate best_epoch_20_10.bin -num_proposals 20 -sampling_timesteps 10 -b 4
To evaluate our FinePOSE with JPMA on MPI-INF-3DHP using the ground truth 2D poses as inputs, please run:
python main_3dhp.py -c checkpoint/model_3dhp -gpu 0,1 --nolog --evaluate best_epoch_20_10.bin -num_proposals 20 -sampling_timesteps 10 -b 4
After that, the predicted 3D poses under the P-Best, P-Agg, J-Best, and J-Agg settings are saved as four .mat files in ./checkpoint. To get the MPJPE, AUC, and PCK metrics, evaluate the predictions by running the MATLAB script ./3dhp_test/test_util/mpii_test_predictions_ori_py.m (you can change 'aggregation_mode' in line 29 to get results under different settings). The evaluation results are then saved in ./3dhp_test/test_util/mpii_3dhp_evaluation_sequencewise_ori_{setting name}_t{iteration index}.csv. You can manually average the three metrics in these files over the six sequences to get the final results.
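The manual averaging step could also be scripted. The sketch below is not part of this repository and makes an assumption about the CSV layout that should be checked against the real files: one header row, one row per sequence, and one column per metric. `average_numeric_columns` is a hypothetical helper that averages every column whose values all parse as numbers and skips the rest (for example a sequence-name column).

```python
import csv


def average_numeric_columns(csv_path):
    """Average each fully numeric column of a CSV file with a header row.

    Assumes rows are sequences and columns are metrics; columns containing
    any non-numeric value are skipped.
    """
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return {}
    averages = {}
    for column in rows[0]:
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # non-numeric or ragged column
        averages[column] = sum(values) / len(values)
    return averages
```

Calling `average_numeric_columns` on one of the generated .csv files returns a dictionary mapping each numeric column name to its mean over the rows. If the actual files use a different layout, the parsing needs to be adapted accordingly.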
Our models were trained on 2 NVIDIA RTX 4090 GPUs.
To train our model on Human3.6M using the 2D keypoints obtained by CPN as inputs, please run:
python main.py -k cpn_ft_h36m_dbb -c checkpoint/model_h36m -gpu 0,1 --nolog
To train our model on MPI-INF-3DHP using the ground truth 2D poses as inputs, please run:
python main_3dhp.py -c checkpoint/model_3dhp -gpu 0,1 --nolog
@InProceedings{Xu_2024_CVPR_finepose,
author = {Xu, Jinglin and Guo, Yijie and Peng, Yuxin},
title = {FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2024},
pages = {561-570}
}
Our code refers to the following repositories.
We thank the authors for releasing their code.