How to extract double depth map in .npy format from 2D images #8
Thank you so much for your answer. But the inputs are not clear to me; how could I know about them?
Thanks for your attention. Why did you change the background of the images to black? My mesh results with PIFuHD are not good and I'm wondering how I could improve them. Thanks.
We select the camera parameters based on the fact that the PIFuHD meshes always reside in a unit box, so you can use the same parameters for non-square images. The following code snippet may help you obtain the ground-truth depth from the estimated PIFuHD meshes:

import numpy as np
import pyrender
import trimesh
import os
os.environ['PYOPENGL_PLATFORM'] = 'egl' # for headless server
def render_depth(mesh_path, camera_pose, im_height, im_width):
    camera = pyrender.camera.OrthographicCamera(xmag=1.0, ymag=1.0, znear=1.0, zfar=3.0)
    mesh = pyrender.Mesh.from_trimesh(trimesh.load(mesh_path))
    light = pyrender.PointLight(color=[1.0, 0.0, 0.0], intensity=2.0)
    scene = pyrender.Scene()
    scene.add(mesh, pose=np.eye(4))
    scene.add(camera, pose=camera_pose)
    scene.add(light, pose=camera_pose)
    r = pyrender.OffscreenRenderer(viewport_width=im_width, viewport_height=im_height, point_size=1.0)
    color, depth, depth_glwin = r.render(scene)  # the third return value requires the modified renderer mentioned in the NOTE below
    r.delete()
    return color, depth, depth_glwin
if __name__ == '__main__':
    # front view: orthographic camera at z = 2 looking down the -z axis
    cam_pose_front = np.eye(4)
    cam_pose_front[2, 3] = 2.

    # back view: mirror the front pose so the camera faces the mesh from behind
    cam_pose_back = np.eye(4)
    cam_pose_back[2, 3] = 2.
    cam_pose_back[0, 0] *= -1.
    cam_pose_back[2, 2] *= -1.
    cam_pose_back[2, 3] *= -1.

    mesh_path = '/path/to/pifuhd/mesh'
    assert mesh_path.endswith('.obj')

    # render front depth map
    color, depth, depth_glwin_front = render_depth(mesh_path, cam_pose_front, im_height=512, im_width=320)
    np.save('front_depth.npy', depth_glwin_front)

    # render back depth map
    color, depth, depth_glwin_back = render_depth(mesh_path, cam_pose_back, im_height=512, im_width=320)
    np.save('back_depth.npy', depth_glwin_back)

NOTE: The current master branch of pyrender fails at recovering the raw depth for orthographic cameras (see here and here). We provide a modified rendering script for use with M3D-VTON here. Please first apply it before running the snippet above.
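For reference, since the camera above is orthographic, the raw window-space depth is linear in eye-space distance, so it can be converted to metric depth using the znear/zfar values from the camera above. This is a minimal sketch, assuming depth_glwin holds raw window-space depth values in [0, 1] (with 1.0 at pixels that hit nothing); the helper name is hypothetical:

import numpy as np

def glwin_to_metric_depth(depth_glwin, znear=1.0, zfar=3.0):
    # an orthographic projection maps eye-space distance linearly to window depth,
    # so d_eye = znear + d_win * (zfar - znear)
    depth = znear + depth_glwin * (zfar - znear)
    depth[depth_glwin >= 1.0] = 0.0  # background pixels sit at the far plane; zero them out
    return depth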
We found that PIFuHD performs more stably when the person is centered in the image against a black background. What do your images and the estimated PIFuHD meshes look like?
Thank you so much for your quick response and your support. Did you use the demo with its original height and width, or did you change them based on your images? When I change the size, the results are not good. And may I ask how you convert the images to a black background?
Sorry for the late reply. Yes, I pad the MPV 512*320 images to 512*512 and then use the original PIFuHD demo with its default image-size setting (i.e., 512*512). PIFuHD might fail because of the small size of your input image. Have you tried changing the input image size? It is easy to black out the image background either by simply using remove.bg or by human parsing (such as this or this).
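In case it helps, here is a minimal padding sketch for the setup described above. The file names are placeholders, and it assumes a 512*320 input that is padded symmetrically with black borders to 512*512 before running the PIFuHD demo:

import cv2

img = cv2.imread('person_512x320.jpg')  # hypothetical input path
pad = (512 - img.shape[1]) // 2         # 96 pixels on each side for a 320-wide image
img_square = cv2.copyMakeBorder(img, 0, 0, pad, pad, cv2.BORDER_CONSTANT, value=(0, 0, 0))
cv2.imwrite('person_512x512.jpg', img_square)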
No worries, thanks a bunch for your attention. Yes, I have tried changing it, but it does not affect the results; maybe that is because I did not pad to the same size. The quality of the pictures is very important. Thanks for the suggestion. The first app has to be applied image by image, which takes time for multiple images. The second set of GitHub links is for parsing, and I do not know how to find the related module to black out the background. I used grabcut with Python cv2 to black it out, but some parts are still white.
Is there some reason you chose the 256*192 MPV images instead of 512*320? PIFuHD performs well on 512*512 images but may not fit 256*256 (which is not that "HD"). Padding might be a problem, but the low image resolution can also harm the 3D reconstruction quality. Moreover, I would not recommend using grabcut to segment the person images. For most person images, the human parsing methods are good enough for obtaining the background mask and blacking it out:

import cv2
import numpy as np

person_img = cv2.imread(person_img_path)
human_parsing_result = parsing_model(person_img)  # from the aforementioned github links for parsing
background = np.where(human_parsing_result == 0)  # obtain the background pixel indices (label 0)
person_img[background] = 0  # change background to black
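A boolean mask works equivalently and can be easier to debug, provided the parsing result has the same height and width as the person image (a small sketch reusing the names from the snippet above):

mask = (human_parsing_result == 0)  # True where the parser labels background
person_img[mask] = 0                # broadcasts over the 3 color channels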
Thanks a lot for your insight. Yes, that's true, and it's a good suggestion. I used CHIP for parsing, but the line person_img[background] = 0 raises an error. Is there an easier way to black out the background? cv2.imshow('res', res) gives a black background, but for some pictures it does not work.
Can you share your Colab notebook with the changes you made? I am also working on a similar project and facing a similar issue.
Hi @fyviezhao, do you know how to restore a 3D point cloud from the rendered depth map in
Thanks for your interesting paper, "M3D-VTON: A Monocular-to-3D Virtual Try-On Network". I am also working in a similar area and I need to know how I could extract the ground-truth depth (.npy files) for 2D images. Could you please guide me in this regard?
My next question is: how do you obtain the camera specification from 2D images?