forked from open-mmlab/mmsegmentation
[Feature] Voxelpose (open-mmlab#1050)
* [Enhancement] inference speed and flops tools. (open-mmlab#986)
* add the function to test the dummy forward speed of models.
* add tools to test the flops and inference speed of multiple models.
* [Feature] Add ViPNAS models for wholebody keypoint detection (open-mmlab#1009)
* add configs
* add dark configs
* add checkpoint and readme
* update webcam demo
* fix model path in webcam demo
* fix unittest
* update model metafiles (open-mmlab#1001)
* [Feature] Add ViPNAS mbv3 (open-mmlab#1025)
* add vipnas mbv3
* test other variants
* submission for mmpose
* add unittest
* add readme
* update .yml
* fix lint
* rebase
* fix pytest

Co-authored-by: jin-s13 <jinsheng13@foxmail.com>

* add cfg file for flops and speed test, change the bulid_posenet to init_pose_model and fix some typo in cfg (open-mmlab#1028)
* Skip CI when some specific files were changed (open-mmlab#1041)
* add voxelpose
* unit test
* unit test
* unit test
* add docs/ckpts
* del unnecessary comments
* correct typos in comments and docs
* Add or modify docs
* change variable names
* reduce memory cost in test
* get person_id
* rebase
* resolve comments
* rebase master
* rename cfg files
* fix typos in comments

Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
Co-authored-by: Yining Li <liyining0712@gmail.com>
Co-authored-by: Lumin <30328525+luminxu@users.noreply.github.com>
Co-authored-by: jin-s13 <jinsheng13@foxmail.com>
Co-authored-by: Qikai Li <87690686+liqikai9@users.noreply.github.com>
Co-authored-by: QwQ2000 <396707050@qq.com>
1 parent 5fcb34a, commit 1621255
Showing 37 changed files with 26,684 additions and 16 deletions.
@@ -0,0 +1,160 @@
dataset_info = dict(
    dataset_name='panoptic_pose_3d',
    paper_info=dict(
        author='Joo, Hanbyul and Simon, Tomas and Li, Xulong '
        'and Liu, Hao and Tan, Lei and Gui, Lin and Banerjee, Sean '
        'and Godisart, Timothy and Nabbe, Bart and Matthews, Iain '
        'and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser',
        title='Panoptic Studio: A Massively Multiview System '
        'for Social Interaction Capture',
        container='IEEE Transactions on Pattern Analysis'
        ' and Machine Intelligence',
        year='2017',
        homepage='http://domedb.perception.cs.cmu.edu',
    ),
    keypoint_info={
        0:
        dict(name='neck', id=0, color=[51, 153, 255], type='upper', swap=''),
        1:
        dict(name='nose', id=1, color=[51, 153, 255], type='upper', swap=''),
        2:
        dict(name='mid_hip', id=2, color=[0, 255, 0], type='lower', swap=''),
        3:
        dict(
            name='left_shoulder',
            id=3,
            color=[0, 255, 0],
            type='upper',
            swap='right_shoulder'),
        4:
        dict(
            name='left_elbow',
            id=4,
            color=[0, 255, 0],
            type='upper',
            swap='right_elbow'),
        5:
        dict(
            name='left_wrist',
            id=5,
            color=[0, 255, 0],
            type='upper',
            swap='right_wrist'),
        6:
        dict(
            name='left_hip',
            id=6,
            color=[0, 255, 0],
            type='lower',
            swap='right_hip'),
        7:
        dict(
            name='left_knee',
            id=7,
            color=[0, 255, 0],
            type='lower',
            swap='right_knee'),
        8:
        dict(
            name='left_ankle',
            id=8,
            color=[0, 255, 0],
            type='lower',
            swap='right_ankle'),
        9:
        dict(
            name='right_shoulder',
            id=9,
            color=[255, 128, 0],
            type='upper',
            swap='left_shoulder'),
        10:
        dict(
            name='right_elbow',
            id=10,
            color=[255, 128, 0],
            type='upper',
            swap='left_elbow'),
        11:
        dict(
            name='right_wrist',
            id=11,
            color=[255, 128, 0],
            type='upper',
            swap='left_wrist'),
        12:
        dict(
            name='right_hip',
            id=12,
            color=[255, 128, 0],
            type='lower',
            swap='left_hip'),
        13:
        dict(
            name='right_knee',
            id=13,
            color=[255, 128, 0],
            type='lower',
            swap='left_knee'),
        14:
        dict(
            name='right_ankle',
            id=14,
            color=[255, 128, 0],
            type='lower',
            swap='left_ankle'),
        15:
        dict(
            name='left_eye',
            id=15,
            color=[51, 153, 255],
            type='upper',
            swap='right_eye'),
        16:
        dict(
            name='left_ear',
            id=16,
            color=[51, 153, 255],
            type='upper',
            swap='right_ear'),
        17:
        dict(
            name='right_eye',
            id=17,
            color=[51, 153, 255],
            type='upper',
            swap='left_eye'),
        18:
        dict(
            name='right_ear',
            id=18,
            color=[51, 153, 255],
            type='upper',
            swap='left_ear')
    },
    skeleton_info={
        0: dict(link=('nose', 'neck'), id=0, color=[51, 153, 255]),
        1: dict(link=('neck', 'left_shoulder'), id=1, color=[0, 255, 0]),
        2: dict(link=('neck', 'right_shoulder'), id=2, color=[255, 128, 0]),
        3: dict(link=('left_shoulder', 'left_elbow'), id=3, color=[0, 255, 0]),
        4: dict(
            link=('right_shoulder', 'right_elbow'), id=4, color=[255, 128, 0]),
        5: dict(link=('left_elbow', 'left_wrist'), id=5, color=[0, 255, 0]),
        6:
        dict(link=('right_elbow', 'right_wrist'), id=6, color=[255, 128, 0]),
        7: dict(link=('left_ankle', 'left_knee'), id=7, color=[0, 255, 0]),
        8: dict(link=('left_knee', 'left_hip'), id=8, color=[0, 255, 0]),
        9: dict(link=('right_ankle', 'right_knee'), id=9, color=[255, 128, 0]),
        10: dict(link=('right_knee', 'right_hip'), id=10, color=[255, 128, 0]),
        11: dict(link=('mid_hip', 'left_hip'), id=11, color=[0, 255, 0]),
        12: dict(link=('mid_hip', 'right_hip'), id=12, color=[255, 128, 0]),
        13: dict(link=('mid_hip', 'neck'), id=13, color=[51, 153, 255]),
    },
    joint_weights=[
        1.0, 1.0, 1.0, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2,
        1.5, 1.0, 1.0, 1.0, 1.0
    ],
    sigmas=[
        0.026, 0.026, 0.107, 0.079, 0.072, 0.062, 0.107, 0.087, 0.089, 0.079,
        0.072, 0.062, 0.107, 0.087, 0.089, 0.025, 0.035, 0.025, 0.035
    ])
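
The `swap` fields in `keypoint_info` pair each left keypoint with its right counterpart. As a minimal sketch of how such pairs could be derived and sanity-checked, the snippet below works on a hand-copied excerpt of the entries above. The helper names `flip_pairs` and `check_symmetric` are hypothetical and are not part of this commit or of any library API.

```python
# Excerpt of the keypoint_info entries above (only name/id/swap are kept).
keypoint_info = {
    0: dict(name='neck', id=0, swap=''),
    3: dict(name='left_shoulder', id=3, swap='right_shoulder'),
    9: dict(name='right_shoulder', id=9, swap='left_shoulder'),
    15: dict(name='left_eye', id=15, swap='right_eye'),
    17: dict(name='right_eye', id=17, swap='left_eye'),
}


def flip_pairs(info):
    """Return sorted (id_a, id_b) pairs built from the `swap` fields.

    Hypothetical helper for illustration only.
    """
    name_to_id = {kpt['name']: kpt['id'] for kpt in info.values()}
    pairs = set()
    for kpt in info.values():
        if kpt['swap']:  # an empty string means the keypoint has no mirror
            other = name_to_id[kpt['swap']]
            pairs.add(tuple(sorted((kpt['id'], other))))
    return sorted(pairs)


def check_symmetric(info):
    """Check that every swap target points back at the original keypoint."""
    by_name = {kpt['name']: kpt for kpt in info.values()}
    return all(by_name[kpt['swap']]['swap'] == kpt['name']
               for kpt in info.values() if kpt['swap'])


pairs = flip_pairs(keypoint_info)
print(pairs)
print(check_symmetric(keypoint_info))
```

Running the same check on the full 19-keypoint dictionary is a cheap way to catch a mistyped `swap` name before it reaches a flip augmentation.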
@@ -0,0 +1,8 @@
# Multi-view 3D Human Body Pose Estimation

Multi-view 3D human body pose estimation aims to predict the X, Y, Z coordinates of human body joints from multi-view RGB images.
For this task, we currently support [VoxelPose](configs/body/3d_kpt_mview_rgb_img/voxelpose).

## Data preparation

Please follow [Data Preparation](/docs/tasks/3d_body_keypoint.md) to prepare data.
@@ -0,0 +1,23 @@
# VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
  title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
  author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
  booktitle={ECCV},
  year={2020}
}
```

</details>

VoxelPose breaks the task of 3D human pose estimation into two stages: (1) human center detection by a Cuboid Proposal Network, and
(2) human pose regression by a Pose Regression Network.

The networks in both stages are based on 3D convolution. The input feature volumes are generated by projecting each voxel onto
the multi-view images and sampling the 2D heatmaps at the projected locations.
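
The feature-volume construction described above can be illustrated with a small NumPy sketch. It is not the code added by this commit: the function names are hypothetical, the cameras are assumed to be simple pinhole models given as `(K, R, t)` with the last row of `K` equal to `[0, 0, 1]`, sampling is nearest-neighbour rather than interpolated, and averaging over the views that see a voxel is an assumption of this sketch.

```python
import numpy as np


def project_points(points, K, R, t):
    """Project (N, 3) world points to (N, 2) pixels with a pinhole camera."""
    cam = points @ R.T + t  # world -> camera coordinates
    depth = cam[:, 2]
    # Guard the division for points at or behind the camera plane.
    safe_depth = np.where(depth > 1e-6, depth, 1.0)
    pix = (cam @ K.T)[:, :2] / safe_depth[:, None]
    return pix, depth


def build_feature_volume(heatmaps, cameras, grid):
    """Average per-view heatmap samples at each voxel centre.

    heatmaps: list of (J, H, W) arrays, one per view.
    cameras:  list of (K, R, t) tuples matching the heatmaps.
    grid:     (N, 3) voxel centres in world coordinates.
    Returns a (J, N) array; reshape it to the voxel grid as needed.
    """
    num_joints = heatmaps[0].shape[0]
    volume = np.zeros((num_joints, grid.shape[0]))
    counts = np.zeros(grid.shape[0])
    for hm, (K, R, t) in zip(heatmaps, cameras):
        _, h, w = hm.shape
        pix, depth = project_points(grid, K, R, t)
        u = np.rint(pix[:, 0]).astype(int)  # nearest-neighbour column
        v = np.rint(pix[:, 1]).astype(int)  # nearest-neighbour row
        valid = (depth > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
        volume[:, valid] += hm[:, v[valid], u[valid]]
        counts += valid
    return volume / np.maximum(counts, 1)


# Toy usage: one camera looking down +Z, one joint, two voxel centres.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.array([0.0, 0.0, 5.0])
heatmap = np.zeros((1, 48, 64))
heatmap[0, 24, 32] = 1.0  # peak at the image centre
grid = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
volume = build_feature_volume([heatmap], [(K, R, t)], grid)
print(volume)
```

In the toy usage the first voxel projects onto the heatmap peak, while the second falls outside the image and therefore receives no response.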
37 changes: 37 additions & 0 deletions
...w_rgb_img/voxelpose/panoptic/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.md
@@ -0,0 +1,37 @@
<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
  title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
  author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
  booktitle={ECCV},
  year={2020}
}
```

</details>

<!-- [DATASET] -->

<details>
<summary align="right"><a href="https://openaccess.thecvf.com/content_iccv_2015/html/Joo_Panoptic_Studio_A_ICCV_2015_paper.html">CMU Panoptic (ICCV'2015)</a></summary>

```bibtex
@inproceedings{joo_iccv_2015,
  author = {Joo, Hanbyul and Liu, Hao and Tan, Lei and Gui, Lin and Nabbe, Bart and Matthews, Iain and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser},
  title = {Panoptic Studio: A Massively Multiview System for Social Motion Capture},
  booktitle = {ICCV},
  year = {2015}
}
```

</details>

Results on CMU Panoptic dataset.

| Arch | mAP | mAR | MPJPE | Recall@500mm | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| [prn64_cpn80_res50](/configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py) | 97.31 | 97.99 | 17.57 | 99.85 | [ckpt](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5-545c150e_20211103.pth) | [log](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5_20211103.log.json) |
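
For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute MPJPE and a recall at a distance threshold. It is an illustration under stated assumptions, not the evaluation code of this commit: the function names are hypothetical, poses are assumed to be `(J, 3)` arrays in millimetres, and a ground-truth pose is assumed to count as recalled when its closest prediction has an MPJPE below the threshold.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error between two (J, 3) poses."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def recall_at(preds, gts, threshold=500.0):
    """Fraction of ground-truth poses matched within `threshold`.

    Assumption of this sketch: a ground-truth pose is recalled when the
    closest predicted pose has an MPJPE below the threshold.
    """
    if not gts:
        return 0.0
    hits = 0
    for gt in gts:
        best = min((mpjpe(p, gt) for p in preds), default=float('inf'))
        hits += best < threshold
    return hits / len(gts)


# Toy usage with 19 joints, matching the keypoint count of the dataset.
gt_pose = np.zeros((19, 3))
near_pose = gt_pose + np.array([3.0, 4.0, 0.0])   # 5 mm away at every joint
far_pose = gt_pose + np.array([600.0, 0.0, 0.0])  # 600 mm away at every joint
print(mpjpe(near_pose, gt_pose))
print(recall_at([near_pose], [gt_pose]))
print(recall_at([far_pose], [gt_pose]))
```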