
Some problems of nusc-depth training code #5

Open
eliliu2233 opened this issue Jan 5, 2024 · 2 comments

@eliliu2233

Thanks for your great work! I have run the training code for depth estimation and found the following two problems:

  1. The loss sometimes becomes NaN when training with fp16.
     (screenshot: problem1)

  2. The loss does not decrease during training.
     (screenshot: problem2)

Could you please give me some advice about these problems? I ran the nusc-depth training code on 4 GPUs with the same settings as the released code (auxiliary_frame=True and use_fp16=True).

@LinShan-Bin
Owner

  1. Thanks for your feedback! Since we didn't observe this particular issue in our initial experiments, could you kindly provide us with additional details (the training log)? We are actively working to replicate this error and are committed to enhancing the stability of the fp16 training process.

  2. This is normal when using the photometric loss. The network is still learning, so you can wait for the final result.

@eliliu2233
Author


Thanks for your reply. I found that the NaN always appears while calculating the rendering weights, so I forced the tensors in the weight calculation to fp32, which solved the problem.
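For reference, the workaround can be sketched as follows. This is a minimal NumPy illustration, not the repository's actual code: it assumes a standard volume-rendering weight formulation (weights = alpha * accumulated transmittance) and shows where the cast to fp32 goes; the function name and the epsilon value are placeholders.

```python
import numpy as np

def rendering_weights_fp32(sigma, delta):
    """Compute volume-rendering weights in float32 even for fp16 inputs.

    w_i = alpha_i * T_i, where alpha_i = 1 - exp(-sigma_i * delta_i)
    and T_i = prod_{j < i} (1 - alpha_j) is the accumulated transmittance.
    The exp/cumprod chain is prone to underflow/overflow in fp16, which
    is where NaNs can appear; casting up to fp32 avoids that.
    """
    sigma = sigma.astype(np.float32)  # force fp32 before the unstable ops
    delta = delta.astype(np.float32)
    alpha = 1.0 - np.exp(-sigma * delta)               # per-sample opacity
    trans = np.cumprod(1.0 - alpha + 1e-10, axis=-1)   # transmittance after each sample
    # Shift right: T_0 = 1, T_i = prod over samples before i.
    trans = np.concatenate([np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    return alpha * trans
```

In a PyTorch training loop with automatic mixed precision, the equivalent fix would be to wrap this computation in `torch.autocast(enabled=False)` (or call `.float()` on the inputs) so only the weight calculation runs in fp32 while the rest of the network stays in fp16.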
