
Some questions about warping and test. #14

Open
whyygug opened this issue Apr 14, 2023 · 2 comments

whyygug commented Apr 14, 2023

Your paper is impressive and insightful, and thanks for your excellent work.

I have some questions from reading your paper.

  1. Why do you generate the synthetic I_{t-1} by inverse warping? It seems the synthetic I_{t-1} could be produced directly by filling the image grid with the RGB pixels that have the closest depth, just as you obtain the forward-warped depth map.

  2. Is forward warping non-differentiable?

  3. Is the evaluation of the dynamic objects' depth on KITTI done on the Eigen test set? If so, does each image in the Eigen test set have a ground-truth semantic mask label? Or do you test the dynamic objects' depth on KITTI using another split in which each image has a ground-truth semantic mask label?

Thanks.

fengziyue (Member) commented

Hi,

Thank you for your interest!

1, Could you specify which part of the paper/code you are referring to?

2, Forward warping is differentiable, but it is a bit tricky: multiple pixels may warp to the same grid cell. See the sketch below for one way to handle this.
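
Here is a minimal hypothetical sketch (not the code in this repo; `splat_weighted_average` and its arguments are made-up names for illustration) of one differentiable way to resolve such collisions: accumulate weighted contributions with `scatter_add` and normalize, so every contributing pixel receives a gradient.

```python
import torch

def splat_weighted_average(values, target_idx, num_cells, weights):
    # values: (N, C) per-pixel values; target_idx: (N,) flat target cell id;
    # weights: (N,) positive, e.g. inverse depth so nearer pixels dominate.
    num = torch.zeros(num_cells, values.shape[1])
    den = torch.zeros(num_cells, 1)
    # Sum weighted contributions of all pixels landing in each cell.
    num.scatter_add_(0, target_idx[:, None].expand_as(values), values * weights[:, None])
    den.scatter_add_(0, target_idx[:, None], weights[:, None])
    # Weighted average per cell; empty cells stay zero.
    return num / den.clamp(min=1e-6)
```

Weighting by inverse depth makes nearer pixels dominate, which approximates a soft z-buffer while staying differentiable.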

3, I tested on the KITTI Eigen test set; the semantic masks come from the off-the-shelf instance segmentation model "Efficient-PS".

Thank you!

Sincerely,
Ziyue Feng


whyygug commented Apr 15, 2023

Thanks for your quick response.

  1. I'm referring to this function:

```python
def forward_warp(img, depth, pose, intrinsics, upscale=None, rotation_mode='euler', padding_mode='zeros'):
```

where the forward-warped depth map is obtained by:

```python
depth_w, fw_val = [], []
for coo, z in zip(pcoords, Z):
    # Target (row, col) coordinates and depth for every source pixel.
    idx = coo.reshape(-1, 2).permute(1, 0).long()[[1, 0]]
    val = z.reshape(-1)
    # Route out-of-bounds pixels to a dummy row/column (hh, ww),
    # which is cropped away after densification.
    idx[0][idx[0] < 0] = hh
    idx[0][idx[0] > hh - 1] = hh
    idx[1][idx[1] < 0] = ww
    idx[1][idx[1] > ww - 1] = ww
    # Cast an index with maximum inverse depth: points are NOT
    # interpolated, which causes errors near boundaries.
    _idx, _val = coalesce(idx, 1 / val, m=hh + 1, n=ww + 1, op='max')
    dense = torch.sparse.FloatTensor(_idx, _val, torch.Size([hh + 1, ww + 1])).to_dense()[:-1, :-1]
    depth_w.append(1 / dense)                # inverse depth back to depth
    fw_val.append(1 - (dense == 0).float())  # validity: cells that received a pixel
depth_w = torch.stack(depth_w, dim=0)
```

but the forward-warped image is obtained by:

```python
img_w, iw_val = inverse_warp(img, depth_w, pose_inv, intrinsics)
```

Why do you obtain the forward-warped image by inverse warping? It seems the forward-warped image could also be produced directly by forward warping, i.e., filling the image grid with the RGB pixels that have the closest depth, just as you obtain the forward-warped depth map depth_w. The inverse warping inside forward warping seems redundant and adds computational cost. A rough sketch of what I mean is below.
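
Concretely, something like this hypothetical sketch (`splat_rgb` and its arguments are made up for illustration) of a hard z-buffer over RGB, assuming positive depths and in-bounds target indices:

```python
import torch

def splat_rgb(img, inv_depth, flat_idx, hh, ww):
    # img: (3, N) source colors; inv_depth: (N,) 1/z per source pixel (> 0);
    # flat_idx: (N,) flattened target cell of each pixel, assumed in [0, hh*ww).
    n_cells = hh * ww
    # Hard z-buffer: per cell, keep the largest inverse depth (nearest pixel).
    best = torch.zeros(n_cells).scatter_reduce(0, flat_idx, inv_depth, reduce="amax")
    # A pixel wins its cell iff its inverse depth equals the cell maximum
    # (amax returns one of the scattered values, so exact comparison is safe).
    winner = inv_depth >= best[flat_idx]
    out = torch.zeros(3, n_cells)
    out[:, flat_idx[winner]] = img[:, winner]  # ties resolve arbitrarily
    valid = (best > 0).float()                 # cells hit by at least one pixel
    return out.view(3, hh, ww), valid.view(hh, ww)
```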

  2. I know you need a segmentation mask from "Efficient-PS" to get the disentangled image at inference time. What I mean is: what mask do you use to find the GT depth of the dynamic objects when you evaluate only their depth? Do you use a GT semantic mask to filter the depth? If not, why not? Is it because not every image in the Eigen test set has a GT semantic mask label? By "filter," I mean something like the sketch below.
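
To be concrete, this is the kind of filtering I'm asking about (a hypothetical sketch; `abs_rel_on_objects` and its arguments are names I made up):

```python
import numpy as np

def abs_rel_on_objects(pred, gt, obj_mask):
    # pred, gt: (H, W) depth maps; obj_mask: (H, W) bool, True on dynamic objects
    # (from a GT semantic mask or a predicted one, e.g. Efficient-PS).
    m = obj_mask & (gt > 0)              # also require valid LiDAR ground truth
    p, g = pred[m], gt[m]
    p = p * np.median(g) / np.median(p)  # standard per-image median scaling
    return float(np.mean(np.abs(p - g) / g))
```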
