Application of DD-Net on OpenPose/AlphaPose/... results #26

JFMeyer2k · 2021-01-16T23:46:21Z

Hey,

thanks for the amazing paper.
Since only a limited number of state-of-the-art models deal with the fact that only 2D skeletons may be available in many real-world applications, I wonder whether it would be possible to train your model on the results of OpenPose/AlphaPose applied to RGB videos.

I read through the JHMDB documentation and went through the dataset you provided (thanks again).
It seems that there are 433 individuals, each represented by an (X, 15, 2) array, X being some number like 35, 38, 40, ...
Could you elaborate a little on what that array contains?
It would help me to rearrange the outcome of OpenPose/AlphaPose accordingly to apply DD-Net.

My guess is that X is the number of frames and the 15 x 2 array gives the x,y-coordinates of the skeleton-keypoints.
Also I learned that before applying DD-Net, I need to normalize the input as:
pos_world(1,:,:) = (pos_img(1,:,:)/W-0.5)*W/H./scale;
pos_world(2,:,:) = (pos_img(2,:,:)/H-0.5)./scale;
where W and H are the width and height of the frame, and scale is given as the spine length. Is that correct?

What exactly is meant by pos_world(1,:,:), pos_img(1,:,:), pos_world(2,:,:), and pos_img(2,:,:)?
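
For concreteness, here is a minimal NumPy sketch of how I currently read that normalization. It rests on assumptions that are not confirmed in this thread: that the first MATLAB index selects the coordinate (1 = x, 2 = y), that my arrays are laid out as (frames, joints, 2) with x in the last-axis slot 0 and y in slot 1, and that scale is either one number or one value per frame. The function name normalize_pose and its parameters are my own, hypothetical names, not part of DD-Net or JHMDB.

```python
import numpy as np


def normalize_pose(pos_img, width, height, scale):
    """Map pixel coordinates to normalized coordinates (hypothetical helper).

    pos_img: array of shape (frames, joints, 2), x in [..., 0], y in [..., 1]
             (assumed layout).
    width, height: frame size in pixels (W and H in the formula).
    scale: scalar, or array of shape (frames,) with one value per frame
           (assumed; how scale is obtained is not settled here).
    """
    pos_img = np.asarray(pos_img, dtype=np.float64)
    scale = np.asarray(scale, dtype=np.float64)
    if scale.ndim == 1:
        # Turn (frames,) into (frames, 1) so it broadcasts over the joints.
        scale = scale[:, None]

    pos_world = np.empty_like(pos_img)
    # pos_world(1,:,:) = (pos_img(1,:,:)/W-0.5)*W/H./scale
    pos_world[..., 0] = (pos_img[..., 0] / width - 0.5) * width / height / scale
    # pos_world(2,:,:) = (pos_img(2,:,:)/H-0.5)./scale
    pos_world[..., 1] = (pos_img[..., 1] / height - 0.5) / scale
    return pos_world


# Example: a dummy clip of 35 frames with 15 joints in a 320x240 video.
clip = np.zeros((35, 15, 2))
normalized = normalize_pose(clip, width=320, height=240, scale=np.ones(35))
```

If that reading is right, a joint at the image center maps to (0, 0), and the x range is stretched by W/H so that both axes share the same unit before dividing by scale.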

Thanks a lot,
JFM

JFMeyer2k · 2021-01-18T00:11:55Z

For reference, I plotted a couple of those skeletons and I think my assumption was correct.
The number behind "raw_" is the element number of the pose-element within the file Train.pkl.

raw_80.mp4