
Commit 1621255

wusize, zengwang430521, ly015, luminxu, jin-s13 committed

[Feature] Voxelpose (open-mmlab#1050)

* [Enhancement] inference speed and flops tools. (open-mmlab#986)
* add the function to test the dummy forward speed of models.
* add tools to test the flops and inference speed of multiple models.
* [Feature] Add ViPNAS models for wholebody keypoint detection (open-mmlab#1009)
* add configs
* add dark configs
* add checkpoint and readme
* update webcam demo
* fix model path in webcam demo
* fix unittest
* update model metafiles (open-mmlab#1001)
* [Feature] Add ViPNAS mbv3 (open-mmlab#1025)
* add vipnas mbv3
* test other variants
* submission for mmpose
* add unittest
* add readme
* update .yml
* fix lint
* rebase
* fix pytest

Co-authored-by: jin-s13 <jinsheng13@foxmail.com>

* add cfg file for flops and speed test, change the build_posenet to init_pose_model and fix some typos in cfg (open-mmlab#1028)
* Skip CI when some specific files were changed (open-mmlab#1041)
* add voxelpose
* unit test
* unit test
* unit test
* add docs/ckpts
* del unnecessary comments
* correct typos in comments and docs
* Add or modify docs
* change variable names
* reduce memory cost in test
* get person_id
* rebase
* resolve comments
* rebase master
* rename cfg files
* fix typos in comments

Co-authored-by: zengwang430521 <zengwang430521@gmail.com>
Co-authored-by: Yining Li <liyining0712@gmail.com>
Co-authored-by: Lumin <30328525+luminxu@users.noreply.github.com>
Co-authored-by: jin-s13 <jinsheng13@foxmail.com>
Co-authored-by: Qikai Li <87690686+liqikai9@users.noreply.github.com>
Co-authored-by: QwQ2000 <396707050@qq.com>
1 parent 5fcb34a commit 1621255

37 files changed (+26684, -16 lines)

.dev_scripts/github/update_model_index.py

+1 line

@@ -146,6 +146,7 @@ def parse_config_path(path):
     '2d_kpt_sview_rgb_img': '2D Keypoint',
     '2d_kpt_sview_rgb_vid': '2D Keypoint',
     '3d_kpt_sview_rgb_img': '3D Keypoint',
+    '3d_kpt_mview_rgb_img': '3D Keypoint',
     '3d_kpt_sview_rgb_vid': '3D Keypoint',
     '3d_mesh_sview_rgb_img': '3D Mesh',
     None: None
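The one-line patch above extends the lookup that turns the task token in a config path into a human-readable category for the model index, so that multi-view 3D keypoint configs are indexed under "3D Keypoint" like their single-view counterparts. A minimal sketch of that lookup (the `task2category` name and the path-splitting logic are illustrative assumptions, not the actual body of `parse_config_path`):

```python
# Illustrative sketch: the task token sits in the config path and
# selects a display category. The entry for '3d_kpt_mview_rgb_img'
# is the one added by this commit.
task2category = {
    '2d_kpt_sview_rgb_img': '2D Keypoint',
    '2d_kpt_sview_rgb_vid': '2D Keypoint',
    '3d_kpt_sview_rgb_img': '3D Keypoint',
    '3d_kpt_mview_rgb_img': '3D Keypoint',  # added by this commit
    '3d_kpt_sview_rgb_vid': '3D Keypoint',
    '3d_mesh_sview_rgb_img': '3D Mesh',
}

path = ('configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/'
        'voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py')
task = path.split('/')[2]  # 'configs/<body_part>/<task>/...'
print(task2category[task])  # 3D Keypoint
```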
+160 lines

@@ -0,0 +1,160 @@
dataset_info = dict(
    dataset_name='panoptic_pose_3d',
    paper_info=dict(
        author='Joo, Hanbyul and Simon, Tomas and Li, Xulong'
        ' and Liu, Hao and Tan, Lei and Gui, Lin and Banerjee, Sean'
        ' and Godisart, Timothy and Nabbe, Bart and Matthews, Iain'
        ' and Kanade, Takeo and Nobuhara, Shohei and Sheikh, Yaser',
        title='Panoptic Studio: A Massively Multiview System '
        'for Social Interaction Capture',
        container='IEEE Transactions on Pattern Analysis'
        ' and Machine Intelligence',
        year='2017',
        homepage='http://domedb.perception.cs.cmu.edu',
    ),
    keypoint_info={
        0:
        dict(name='neck', id=0, color=[51, 153, 255], type='upper', swap=''),
        1:
        dict(name='nose', id=1, color=[51, 153, 255], type='upper', swap=''),
        2:
        dict(name='mid_hip', id=2, color=[0, 255, 0], type='lower', swap=''),
        3:
        dict(
            name='left_shoulder',
            id=3,
            color=[0, 255, 0],
            type='upper',
            swap='right_shoulder'),
        4:
        dict(
            name='left_elbow',
            id=4,
            color=[0, 255, 0],
            type='upper',
            swap='right_elbow'),
        5:
        dict(
            name='left_wrist',
            id=5,
            color=[0, 255, 0],
            type='upper',
            swap='right_wrist'),
        6:
        dict(
            name='left_hip',
            id=6,
            color=[0, 255, 0],
            type='lower',
            swap='right_hip'),
        7:
        dict(
            name='left_knee',
            id=7,
            color=[0, 255, 0],
            type='lower',
            swap='right_knee'),
        8:
        dict(
            name='left_ankle',
            id=8,
            color=[0, 255, 0],
            type='lower',
            swap='right_ankle'),
        9:
        dict(
            name='right_shoulder',
            id=9,
            color=[255, 128, 0],
            type='upper',
            swap='left_shoulder'),
        10:
        dict(
            name='right_elbow',
            id=10,
            color=[255, 128, 0],
            type='upper',
            swap='left_elbow'),
        11:
        dict(
            name='right_wrist',
            id=11,
            color=[255, 128, 0],
            type='upper',
            swap='left_wrist'),
        12:
        dict(
            name='right_hip',
            id=12,
            color=[255, 128, 0],
            type='lower',
            swap='left_hip'),
        13:
        dict(
            name='right_knee',
            id=13,
            color=[255, 128, 0],
            type='lower',
            swap='left_knee'),
        14:
        dict(
            name='right_ankle',
            id=14,
            color=[255, 128, 0],
            type='lower',
            swap='left_ankle'),
        15:
        dict(
            name='left_eye',
            id=15,
            color=[51, 153, 255],
            type='upper',
            swap='right_eye'),
        16:
        dict(
            name='left_ear',
            id=16,
            color=[51, 153, 255],
            type='upper',
            swap='right_ear'),
        17:
        dict(
            name='right_eye',
            id=17,
            color=[51, 153, 255],
            type='upper',
            swap='left_eye'),
        18:
        dict(
            name='right_ear',
            id=18,
            color=[51, 153, 255],
            type='upper',
            swap='left_ear')
    },
    skeleton_info={
        0: dict(link=('nose', 'neck'), id=0, color=[51, 153, 255]),
        1: dict(link=('neck', 'left_shoulder'), id=1, color=[0, 255, 0]),
        2: dict(link=('neck', 'right_shoulder'), id=2, color=[255, 128, 0]),
        3: dict(link=('left_shoulder', 'left_elbow'), id=3, color=[0, 255, 0]),
        4: dict(
            link=('right_shoulder', 'right_elbow'), id=4, color=[255, 128, 0]),
        5: dict(link=('left_elbow', 'left_wrist'), id=5, color=[0, 255, 0]),
        6:
        dict(link=('right_elbow', 'right_wrist'), id=6, color=[255, 128, 0]),
        7: dict(link=('left_ankle', 'left_knee'), id=7, color=[0, 255, 0]),
        8: dict(link=('left_knee', 'left_hip'), id=8, color=[0, 255, 0]),
        9: dict(link=('right_ankle', 'right_knee'), id=9, color=[255, 128, 0]),
        10: dict(link=('right_knee', 'right_hip'), id=10, color=[255, 128, 0]),
        11: dict(link=('mid_hip', 'left_hip'), id=11, color=[0, 255, 0]),
        12: dict(link=('mid_hip', 'right_hip'), id=12, color=[255, 128, 0]),
        13: dict(link=('mid_hip', 'neck'), id=13, color=[51, 153, 255]),
    },
    joint_weights=[
        1.0, 1.0, 1.0, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2, 1.5, 1.0, 1.2,
        1.5, 1.0, 1.0, 1.0, 1.0
    ],
    sigmas=[
        0.026, 0.026, 0.107, 0.079, 0.072, 0.062, 0.107, 0.087, 0.089, 0.079,
        0.072, 0.062, 0.107, 0.087, 0.089, 0.025, 0.035, 0.025, 0.035
    ])
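The `swap` fields in the config above encode left/right symmetry, which flip augmentation and flip-test rely on. A minimal sketch (not mmpose's actual loader) of how symmetric pairs can be derived from such a dict, using a trimmed stand-in for the full config:

```python
# Trimmed stand-in for the full dataset_info dict above; an empty
# `swap` string marks a joint with no mirror counterpart.
dataset_info = dict(
    keypoint_info={
        0: dict(name='neck', id=0, swap=''),
        3: dict(name='left_shoulder', id=3, swap='right_shoulder'),
        9: dict(name='right_shoulder', id=9, swap='left_shoulder'),
    })

name2id = {kpt['name']: kpt['id']
           for kpt in dataset_info['keypoint_info'].values()}

# Collect each symmetric pair once, lower id first.
flip_pairs = sorted(
    {tuple(sorted((kpt['id'], name2id[kpt['swap']])))
     for kpt in dataset_info['keypoint_info'].values() if kpt['swap']})
print(flip_pairs)  # [(3, 9)]
```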
@@ -0,0 +1,8 @@
# Multi-view 3D Human Body Pose Estimation

Multi-view 3D human body pose estimation aims to predict the X, Y and Z coordinates of human body joints from multi-view RGB images.
For this task, we currently support [VoxelPose](configs/body/3d_kpt_mview_rgb_img/voxelpose).

## Data preparation

Please follow [DATA Preparation](/docs/tasks/3d_body_keypoint.md) to prepare data.
@@ -0,0 +1,23 @@
# VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment

<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
  title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
  author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
  booktitle={ECCV},
  year={2020}
}
```

</details>

VoxelPose breaks the task of 3D human pose estimation down into two stages: (1) human center detection by a Cuboid Proposal Network, and (2) human pose regression by a Pose Regression Network.

The networks in both stages are based on 3D convolution. Their input feature volumes are generated by projecting each voxel onto the multi-view images and sampling the 2D heatmaps at the projected locations.
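The projection-and-sampling step described above can be sketched as follows. This is an illustrative NumPy version under simplifying assumptions: pinhole cameras without distortion, nearest-neighbor instead of bilinear sampling, and hypothetical camera dict keys (`R`, `T`, `fx`, `fy`, `cx`, `cy`) rather than VoxelPose's actual API:

```python
import numpy as np


def build_feature_volume(heatmaps, cameras, voxel_centers):
    """Average, over cameras, the heatmap values sampled at each voxel's
    2D projection.

    heatmaps: list of (K, H, W) arrays, one per camera.
    cameras: list of dicts with rotation R (3, 3), translation T (3,),
        and intrinsics fx, fy, cx, cy (hypothetical key names).
    voxel_centers: (N, 3) world coordinates of voxel centers.
    Returns a (K, N) array (reshape to the 3D grid as needed).
    """
    K, H, W = heatmaps[0].shape
    volume = np.zeros((K, voxel_centers.shape[0]))
    for hm, cam in zip(heatmaps, cameras):
        # World -> camera coordinates, then pinhole projection to pixels.
        pts = (voxel_centers - cam['T']) @ cam['R'].T
        uv = pts[:, :2] / pts[:, 2:3]
        u = np.round(uv[:, 0] * cam['fx'] + cam['cx']).astype(int)
        v = np.round(uv[:, 1] * cam['fy'] + cam['cy']).astype(int)
        # Accumulate heatmap values for voxels that project inside the image
        # (nearest-neighbor for brevity; bilinear sampling in practice).
        inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
        volume[:, inside] += hm[:, v[inside], u[inside]]
    return volume / len(cameras)
```

Voxels supported by high heatmap responses in many views end up with large values, which is what lets the Cuboid Proposal Network localize human centers directly in the volume.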
@@ -0,0 +1,37 @@
<!-- [ALGORITHM] -->

<details>
<summary align="right"><a href="https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123460188.pdf">VoxelPose (ECCV'2020)</a></summary>

```bibtex
@inproceedings{tumultipose,
  title={VoxelPose: Towards Multi-Camera 3D Human Pose Estimation in Wild Environment},
  author={Tu, Hanyue and Wang, Chunyu and Zeng, Wenjun},
  booktitle={ECCV},
  year={2020}
}
```

</details>

<!-- [DATASET] -->

<details>
<summary align="right"><a href="https://openaccess.thecvf.com/content_iccv_2015/html/Joo_Panoptic_Studio_A_ICCV_2015_paper.html">CMU Panoptic (ICCV'2015)</a></summary>

```bibtex
@inproceedings{joo_iccv_2015,
  author = {Hanbyul Joo and Hao Liu and Lei Tan and Lin Gui and Bart Nabbe and Iain Matthews and Takeo Kanade and Shohei Nobuhara and Yaser Sheikh},
  title = {Panoptic Studio: A Massively Multiview System for Social Motion Capture},
  booktitle = {ICCV},
  year = {2015}
}
```

</details>

Results on the CMU Panoptic dataset.

| Arch | mAP | mAR | MPJPE (mm) | Recall@500mm | ckpt | log |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| [prn64_cpn80_res50](/configs/body/3d_kpt_mview_rgb_img/voxelpose/panoptic/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5.py) | 97.31 | 97.99 | 17.57 | 99.85 | [ckpt](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5-545c150e_20211103.pth) | [log](https://download.openmmlab.com/mmpose/body3d/voxelpose/voxelpose_prn64x64x64_cpn80x80x20_panoptic_cam5_20211103.log.json) |
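Of the metrics in the table, MPJPE is the simplest to state: the mean Euclidean distance, in millimeters, between predicted and ground-truth joint locations (Recall@500mm similarly counts predictions within a 500 mm threshold). A minimal sketch of the definition, not the evaluation code that produced the numbers above:

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: mean Euclidean distance between
    predicted and ground-truth joints, shapes (..., J, 3), in the unit
    of the inputs (mm here)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


gt = np.zeros((15, 3))                 # 15 joints at the origin
pred = gt + np.array([3.0, 0.0, 4.0])  # every joint off by a 3-4-5 offset
print(mpjpe(pred, gt))  # 5.0
```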
