How to extract double depth map in .npy format from 2D images #8
Thank you so much for your answer. But the inputs are not clear to me; how could I know about them?
Thanks for your attention. Why did you change the background of the images to black? My mesh results with PIFuHD are not good and I'm wondering how I could improve them. Thanks.
We select the camera parameters based on the fact that the PIFuHD meshes always reside in a unit box, so you can use the same parameters for non-square images. The following code snippet may help you obtain the ground-truth depth from the estimated PIFuHD meshes:

import numpy as np
import pyrender
import trimesh
import os
os.environ['PYOPENGL_PLATFORM'] = 'egl' # for headless server
def render_depth(mesh_path, camera_pose, im_height, im_width):
    camera = pyrender.camera.OrthographicCamera(xmag=1.0, ymag=1.0, znear=1.0, zfar=3.0)
    mesh = pyrender.Mesh.from_trimesh(trimesh.load(mesh_path))
    light = pyrender.PointLight(color=[1.0, 0.0, 0.0], intensity=2.0)
    scene = pyrender.Scene()
    scene.add(mesh, pose=np.eye(4))
    scene.add(camera, pose=camera_pose)
    scene.add(light, pose=camera_pose)
    r = pyrender.OffscreenRenderer(viewport_width=im_width, viewport_height=im_height, point_size=1.0)
    color, depth, depth_glwin = r.render(scene)  # the third return value requires the modified renderer mentioned in the NOTE below
    r.delete()
    return color, depth, depth_glwin
if __name__ == '__main__':
    # front view: orthographic camera at z = 2 looking down the -z axis
    cam_pose_front = np.eye(4)
    cam_pose_front[2, 3] = 2.

    # back view: mirror the front pose so the camera faces the mesh from behind
    cam_pose_back = np.eye(4)
    cam_pose_back[2, 3] = 2.
    cam_pose_back[0, 0] *= -1.
    cam_pose_back[2, 2] *= -1.
    cam_pose_back[2, 3] *= -1.

    mesh_path = '/path/to/pifuhd/mesh'
    assert mesh_path.endswith('.obj')

    # render front depth map
    color, depth, depth_glwin_front = render_depth(mesh_path, cam_pose_front, im_height=512, im_width=320)
    np.save('front_depth.npy', depth_glwin_front)

    # render back depth map
    color, depth, depth_glwin_back = render_depth(mesh_path, cam_pose_back, im_height=512, im_width=320)
    np.save('back_depth.npy', depth_glwin_back)

NOTE: The current master branch of pyrender fails at recovering the raw depth for orthographic cameras (see here and here). We provide a modified rendering script for use with M3D-VTON here. Please first apply it before running the snippet above.
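For reference, since the camera above is orthographic, the raw window-space depth is linear in eye-space distance, so it can be converted to metric depth using the znear/zfar values from the camera above. This is a minimal sketch, assuming depth_glwin holds raw window-space depth values in [0, 1] (with 1.0 at pixels that hit nothing); the helper name is hypothetical:

import numpy as np

def glwin_to_metric_depth(depth_glwin, znear=1.0, zfar=3.0):
    # an orthographic projection maps eye-space distance linearly to window depth,
    # so d_eye = znear + d_win * (zfar - znear)
    depth = znear + depth_glwin * (zfar - znear)
    depth[depth_glwin >= 1.0] = 0.0  # background pixels sit at the far plane; zero them out
    return depth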
We found that PIFuHD performs more stably when the person is centered in the image against a black background. What do your images and the estimated PIFuHD meshes look like?
Thank you so much for your quick response and your support. Did you use the demo with its original height and width, or did you change them based on your images? When I change the size, the results are not good. And may I ask how you convert the images to a black background?
Sorry for the late reply. Yes, I pad the MPV 512*320 images to 512*512 and then use the original PIFuHD demo with its default image-size setting (i.e., 512*512). PIFuHD might fail because of the small size of your input image. Have you tried changing the input image size? It is easy to black out the image background either by simply using remove.bg or by human parsing (such as this or this).
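In case it helps, here is a minimal padding sketch for the setup described above. The file names are placeholders, and it assumes a 512*320 input that is padded symmetrically with black borders to 512*512 before running the PIFuHD demo:

import cv2

img = cv2.imread('person_512x320.jpg')  # hypothetical input path
pad = (512 - img.shape[1]) // 2         # 96 pixels on each side for a 320-wide image
img_square = cv2.copyMakeBorder(img, 0, 0, pad, pad, cv2.BORDER_CONSTANT, value=(0, 0, 0))
cv2.imwrite('person_512x512.jpg', img_square)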
No worries, thanks a bunch for your attention. Yes, I have tried changing it, but it does not affect the results; maybe that is because I did not pad to the same size. The quality of the pictures is very important. Thanks for the suggestion. The first app has to be applied image by image, which takes time for multiple images. The second set of GitHub links is for parsing, and I do not know how to find the related module to black out the background. I used grabcut with Python cv2 to black it out, but some parts are still white.
Is there some reason you chose the 256*192 MPV images instead of 512*320? PIFuHD performs well on 512*512 images but may not fit 256*256 (which is not that "HD"). Padding might be a problem, but the low image resolution can also harm the 3D reconstruction quality. Moreover, I would not recommend using grabcut to segment the person images. For most person images, the human parsing methods are good enough for obtaining the background mask and blacking it out:

import cv2
import numpy as np

person_img = cv2.imread(person_img_path)
human_parsing_result = parsing_model(person_img)  # from the aforementioned github links for parsing
background = np.where(human_parsing_result == 0)  # obtain the background pixel indices (label 0)
person_img[background] = 0  # change background to black
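A boolean mask works equivalently and can be easier to debug, provided the parsing result has the same height and width as the person image (a small sketch reusing the names from the snippet above):

mask = (human_parsing_result == 0)  # True where the parser labels background
person_img[mask] = 0                # broadcasts over the 3 color channels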
Thanks a lot for your insight. Yes, that's true, and it's a good suggestion. I used CHIP for parsing, but the line person_img[background] = 0 raises an error. Is there an easier way to black out the background? cv2.imshow('res', res) gives a black background, but for some pictures it does not work.
Can you share your Colab notebook with the changes you made? I am also working on a similar project and facing a similar issue.
Hi @fyviezhao, do you know how to restore a 3D point cloud from the rendered depth map in
Thanks for your interesting paper, "M3D-VTON: A Monocular-to-3D Virtual Try-On Network". I am also working in a similar area and I need to know how I could extract the ground-truth depth (.npy files) for 2D images. Could you please guide me in this regard?
My next question is: how do you obtain the camera specification from 2D images?