
Questions about megadepth training code #38

Open
sungho-choi opened this issue Feb 26, 2020 · 2 comments

Comments

@sungho-choi

I've been using the simple demo training code for the last few days and have run into several problems. Since there's no other place to ask, I'll write them here.

First, a question about the training data. The dataset consists of image/.h5 pairs: some .h5 files contain depth values, while others contain integer mask values such as 0, 1, 2, 3. I'd like to know how these data are used during training.

Second, a question about the training code. When reading the .h5 file and computing the loss:

d_gt_0 = torch.log(Variable(targets['gt_0'].cuda(), requires_grad=False))
d_gt_1 = torch.log(Variable(targets['gt_1'].cuda(), requires_grad=False))

the log of the ground-truth depth is taken here, so any depth value of 0 becomes -infinity. If this is fed into the loss function, the loss cannot be computed and NaN is returned. I'd like to know whether this is expected.
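A minimal sketch of the problem being described, assuming a scale-invariant style loss on log-depth (the tensor values here are made up for illustration):

```python
import torch

# Hypothetical ground-truth depth map containing a zero (e.g. an invalid pixel).
gt = torch.tensor([[0.0, 2.0], [4.0, 8.0]])
log_gt = torch.log(gt)           # log(0) -> -inf
pred = torch.zeros_like(gt)      # dummy log-depth prediction

diff = pred - log_gt
# Scale-invariant form: mean of squared diffs minus squared mean of diffs.
# Both terms become inf, and inf - inf = nan.
si_loss = (diff ** 2).mean() - diff.mean() ** 2

print(torch.isinf(log_gt).any().item())  # True
print(torch.isnan(si_loss).item())       # True
```

So a single zero-depth pixel is enough to poison the whole loss once the log is taken without masking.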

The last question concerns both the dataset and the training code. In the dataset, each image is paired with either an .h5 file containing depth values or an .h5 file containing mask values. The loss function takes prediction_d, the ground truth, and a mask as inputs, but I don't have both a ground-truth .h5 file and a mask .h5 file for the same image. I'm curious how training is supposed to work in this case.

@zhengqili
Owner

Hi, sorry for the very late reply.
Yes, the training code is messy; I migrated it from the original Torch code a few years ago, and I'm sorry for any inconvenience that introduced.
The integers denote different foreground and background objects.
As for depth, I assume depth is greater than 0, so anything close to zero is not considered during training.
The easiest way forward right now is to skip the semantic masks, train directly with the scale-invariant loss we used in the paper, and mask out any depth value that is <= 0 during the loss calculation.
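A minimal sketch of what that masking might look like, assuming the network outputs log-depth; the function name and tensor shapes are hypothetical, not the repo's actual implementation:

```python
import torch

def scale_invariant_loss(pred_log_d, gt_d):
    """Scale-invariant log loss, skipping pixels where gt depth <= 0.

    pred_log_d: predicted log-depth, shape (H, W)
    gt_d:       raw ground-truth depth, shape (H, W)
    """
    mask = (gt_d > 0).float()
    n = mask.sum()
    # Clamp before log so invalid pixels never produce -inf;
    # they are zeroed out by the mask anyway.
    log_gt = torch.log(torch.clamp(gt_d, min=1e-8))
    diff = (pred_log_d - log_gt) * mask
    return (diff ** 2).sum() / n - (diff.sum() ** 2) / (n ** 2)
```

With this masking, zero-depth pixels contribute nothing to either term, so the loss stays finite even when the ground truth contains invalid regions.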

@ewrfcas

ewrfcas commented Jul 16, 2021

Why is the predicted depth not passed through log (while the GT depth is) before the gradient loss?
