Loss_l : inf in training ? #35

Closed
transcendentsky opened this issue Sep 17, 2018 · 5 comments

Comments

@transcendentsky

Hello, first of all, thanks for releasing your code.
My training loss is inf; specifically, loss_l = inf. I am using your original code (with only a few bugs fixed), but I don't know why I get inf.
Parameters: lr:0.004, batchsize:32, base_model:vgg_reducedfc.pth
GPU: 1080ti

Any comments will be appreciated.
Thanks very much!

@GOATmessi8
Owner

@transcendentsky You may look at this pull for details.

@transcendentsky
Author

I have tried that, but the problem persists.

@GOATmessi8
Owner

@transcendentsky Once you get the inf loss, what is the loss on the next batch? Is it still inf? I have also seen some inf losses when training on COCO, but the final convergence seems OK.

@transcendentsky
Author

transcendentsky commented Sep 17, 2018

I traced through the code and found that the problem comes from the priors (prior_box.py): the expression s_k = self.min_sizes[k]/self.image_size evaluates to 0 because both operands are integers. I compared it with the code in https://github.com/amdegroot/ssd.pytorch, which has from __future__ import division at the top of the file; with that line added, the code works correctly.
This happens because I am using Python 2.7; in Python 3 there is probably no problem.
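
To illustrate the difference, here is a minimal sketch. The function names, min_sizes and image_size values are hypothetical and are not taken from prior_box.py. Python 2's integer `/` is emulated with `//` so that the snippet behaves the same on Python 3. A prior scale of 0 gives zero-sized prior boxes, which presumably is what pushes loss_l to inf once the box encoding divides by the prior width and height.

```python
# Hypothetical values for illustration only; not the repository's config.
min_sizes = [30, 60, 111]
image_size = 300


def scale_py2_style(min_size, image_size):
    # On Python 2, `/` between two ints truncates; `//` reproduces that here.
    return min_size // image_size


def scale_true_division(min_size, image_size):
    # Casting one operand to float gives true division on Python 2 and 3.
    # On Python 2, `from __future__ import division` as the first statement
    # of the module makes a plain `/` behave this way as well.
    return float(min_size) / image_size


truncated = [scale_py2_style(m, image_size) for m in min_sizes]
correct = [scale_true_division(m, image_size) for m in min_sizes]

print(truncated)  # every scale collapses to 0
print(correct)    # fractional scales between 0 and 1
```

Either the __future__ import or an explicit float cast should avoid the truncation; the import has the advantage of covering every division in the module at once.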

@GOATmessi8
Owner

@transcendentsky The current code has not been tested in Python 2.
