How to extract double depth map in .npy format from 2D images #8

Open
sanazsab opened this issue Feb 1, 2022 · 13 comments

@sanazsab

sanazsab commented Feb 1, 2022

Thanks for your interesting paper, "M3D-VTON: A Monocular-to-3D Virtual Try-On Network". I'm working in a similar area and need to know how I could extract the ground-truth depth (.npy files) for 2D images. Could you please guide me in this regard?
My next question is how you obtain the camera specification from 2D images.

@fyviezhao
Owner

Hi, we run PIFuHD on 2D images and then orthographically project the predicted mesh to obtain the GT depths. The camera parameters can be found here.

@sanazsab
Author

sanazsab commented Feb 2, 2022

Thank you so much for your answer.
How do you obtain the camera parameters? My pictures have different widths and heights; may I still use these parameters?
I also refer to Pyrender:
https://pyrender.readthedocs.io/en/latest/generated/pyrender.camera.OrthographicCamera.html?highlight=pyrender.camera.OrthographicCamera#pyrender.camera.OrthographicCamera

But the inputs are unclear to me; how can I determine them?
Could you please guide me on how to extract the inputs for Pyrender?

@sanazsab
Author

sanazsab commented Feb 2, 2022

Hi, we run PIFuHD on 2D images and then orthographically project the predicted mesh to obtain the GT depths. The camera parameters can be found here.

Also, the output of Pyrender is not .npy. How can I deal with that?

@sanazsab
Author

sanazsab commented Feb 8, 2022

Hi, we run PIFuHD on 2D images and then orthographically project the predicted mesh to obtain the GT depths. The camera parameters can be found here.

Thanks for your attention. Why did you change the background of the images to black? My mesh results with PIFuHD are not good and I'm wondering how I could improve them. Thanks

@fyviezhao
Owner

Thank you so much for your answer. How do you obtain the camera parameters? My pictures have different widths and heights; may I still use these parameters? I also refer to Pyrender: https://pyrender.readthedocs.io/en/latest/generated/pyrender.camera.OrthographicCamera.html?highlight=pyrender.camera.OrthographicCamera#pyrender.camera.OrthographicCamera

But the inputs are unclear to me; how can I determine them? Could you please guide me on how to extract the inputs for Pyrender?

We select the camera parameters based on the fact that the PIFuHD meshes always reside in a unit box. Therefore you can use the same parameters for non-square images. The following code snippet may help you obtain the ground truth depth from the estimated PIFuHD meshes:

import os
os.environ['PYOPENGL_PLATFORM'] = 'egl'  # must be set before importing pyrender on a headless server

import numpy as np
import pyrender
import trimesh

def render_depth(mesh_path, camera_pose, im_height, im_width):
    camera = pyrender.camera.OrthographicCamera(xmag=1.0, ymag=1.0, znear=1.0, zfar=3.0)
    mesh = pyrender.Mesh.from_trimesh(trimesh.load(mesh_path))
    light = pyrender.PointLight(color=[1.0, 0.0, 0.0], intensity=2.0)

    scene = pyrender.Scene()
    scene.add(mesh, pose=np.eye(4))
    scene.add(camera, pose=camera_pose)
    scene.add(light, pose=camera_pose)

    r = pyrender.OffscreenRenderer(viewport_width=im_width, viewport_height=im_height, point_size=1.0)
    # the third return value (raw depth) requires the modified renderer.py mentioned in the note below
    color, depth, depth_glwin = r.render(scene)
    r.delete()

    return color, depth, depth_glwin

if __name__ == '__main__':
    # front camera: looks down the -z axis from z = 2
    cam_pose_front = np.eye(4)
    cam_pose_front[2, 3] = 2.

    # back camera: rotated 180 degrees about the y axis and placed at z = -2
    cam_pose_back = np.eye(4)
    cam_pose_back[2, 3] = 2.
    cam_pose_back[0, 0] *= -1.
    cam_pose_back[2, 2] *= -1.
    cam_pose_back[2, 3] *= -1.

    mesh_path = '/path/to/pifuhd/mesh'
    assert mesh_path.endswith('.obj')
    # render front depth map
    color, depth, depth_glwin_front = render_depth(mesh_path, cam_pose_front, im_height=512, im_width=320)
    np.save('front_depth.npy', depth_glwin_front)
    # render back depth map
    color, depth, depth_glwin_back = render_depth(mesh_path, cam_pose_back, im_height=512, im_width=320)
    np.save('back_depth.npy', depth_glwin_back)

NOTE: The current master branch of pyrender fails to recover raw depth for orthographic cameras (see here and here). We provide a modified rendering script for use with M3D-VTON here. Please first pip install pyrender and then replace the pyrender/renderer.py script with our modified renderer.py script (see the differences between lines 1151-1195). You can then save the returned depth_glwin as a .npy file.
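
For reference, a minimal sketch (not from the original instructions) for locating the installed renderer.py that needs to be replaced, assuming a standard pip installation of pyrender:

import os
import pyrender

# print the path of the installed renderer.py that should be swapped for the modified script
print(os.path.join(os.path.dirname(pyrender.__file__), 'renderer.py'))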

@fyviezhao
Owner

Hi, we run PIFuHD on 2D images and then orthographically project the predicted mesh to obtain the GT depths. The camera parameters can be found here.

Thanks for your attention. Why did you change the background of the images to black? My mesh results with PIFuHD are not good and I'm wondering how I could improve them. Thanks

We found that PIFuHD performs more stably when the person is centered in the image against a black background. What do your images and the estimated PIFuHD meshes look like?

@sanazsab
Author

sanazsab commented Feb 9, 2022

Hi, we run PIFuHD on 2D images and then orthographically project the predicted mesh to obtain the GT depths. The camera parameters can be found here.

Thanks for your attention. Why did you change the background of the images to black? My mesh results with PIFuHD are not good and I'm wondering how I could improve them. Thanks

We found that PIFuHD performs more stably when the person is centered in the image against a black background. What do your images and the estimated PIFuHD meshes look like?

Thank you so much for your quick response and your support.

Did you use the demo with its original height and width, or did you change it based on your images? When I change the size, the results are not great. May I also ask how you convert the images to a black background?
My images are also from MPV but at a lower resolution, 192*256. I will attach one here.

[image attachment]

@fyviezhao
Owner

Sorry for the late reply. Yes, I pad the MPV 512*320 images to 512*512 and then use the original PIFuHD demo with its default image size setting (i.e., 512*512). PIFuHD may be failing because of the small size of your input images. Have you tried changing the --resolution option in this line to 256 after padding your 256*192 images to 256*256?
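
For reference, a minimal padding sketch, assuming OpenCV and symmetric black padding to a square canvas (the exact preprocessing used for M3D-VTON may differ, and the file names are placeholders):

import cv2

img = cv2.imread('person_512x320.png')      # hypothetical 512*320 input image
h, w = img.shape[:2]
left = (h - w) // 2                          # split the horizontal padding evenly
right = h - w - left
square = cv2.copyMakeBorder(img, 0, 0, left, right,
                            cv2.BORDER_CONSTANT, value=(0, 0, 0))  # pad with black
cv2.imwrite('person_512x512.png', square)    # now 512*512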

It is easy to black out the image background, either by simply using remove.bg or by human parsing (such as this or this).

@sanazsab
Author

No worries, thanks a bunch for your attention.

Yes, I have tried changing it, but it does not affect the results. Maybe that is because I did not pad to the same size; the quality of the pictures is so important.

Thanks for the suggestion. The first app has to be applied to each image and takes time for multiple images. The second set of GitHub links is for parsing, and I do not know how to find the related module for blacking out the background. I used GrabCut with Python cv2 to black it out, but some parts are still white.

@fyviezhao
Owner

Is there some reason you chose to use the 256*192 MPV images instead of 512*320? PIFuHD performs well on 512*512 images but may not fit 256*256 (which is not that "HD"?). Padding might be a problem, but the low image resolution can also harm the 3D reconstruction quality.

Moreover, I would not recommend using GrabCut to segment the person images. For most person images, the human parsing methods are good enough for obtaining the background mask and blacking it out:

import cv2
import numpy as np

person_img = cv2.imread(person_img_path)
human_parsing_result = parsing_model(person_img)  # the aforementioned github links for parsing
background = np.where(human_parsing_result == 0)  # obtain the background mask (label 0 = background)
person_img[background] = 0  # change background to black

@sanazsab
Author

parsing_model

Thanks a lot for your insight.

Yes, it's true.

That's a good suggestion. I used CHIP for parsing, but this part, person_img[background] = 0, gives an error.

Is there an easier way to black out the background?
I used:
import cv2
import numpy as np

image = cv2.imread(path)
r = 150.0 / image.shape[1]
dim = (150, int(image.shape[0] * r))
resized = cv2.resize(image, dim, interpolation=cv2.INTER_AREA)
lower_white = np.array([220, 220, 220], dtype=np.uint8)
upper_white = np.array([255, 255, 255], dtype=np.uint8)
mask = cv2.inRange(resized, lower_white, upper_white)  # could also use threshold
res = cv2.bitwise_not(resized, resized, mask)

cv2.imshow('res', res)  # gives black background
cv2.imwrite('0A.png', res)

But it does not work for some pictures.

@aryacodez

aryacodez commented Mar 7, 2022

Can you share your Colab notebook with the changes you made? I am also working on a similar project and facing a similar issue.

@LogWell

LogWell commented Oct 19, 2022

Hi @fyviezhao, do you know how to restore a 3D point cloud from a depth map rendered in pyrender, when the camera is any of the three types in camera?
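
For what it's worth, a minimal sketch of unprojecting an orthographic depth map back to a point cloud, assuming the OrthographicCamera setup used earlier in this thread (xmag=ymag=1, raw metric depth, camera looking down the -z axis); the function name is hypothetical, and the perspective/intrinsics cameras would need the usual pinhole unprojection instead:

import numpy as np

def ortho_depth_to_points(depth, xmag=1.0, ymag=1.0, cam_pose=np.eye(4)):
    # depth: (H, W) array of raw metric depth values, 0 where nothing was hit
    H, W = depth.shape
    v, u = np.nonzero(depth)                        # pixel coordinates with valid depth
    x = ((u + 0.5) / W * 2.0 - 1.0) * xmag          # NDC x -> camera-frame x
    y = (1.0 - (v + 0.5) / H * 2.0) * ymag          # NDC y -> camera-frame y (row 0 is the top)
    z = -depth[v, u]                                # camera looks down the -z axis
    pts_cam = np.stack([x, y, z, np.ones_like(x)])  # homogeneous points in the camera frame
    return (cam_pose @ pts_cam)[:3].T               # transform into world coordinates, shape (N, 3)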
