the loss is nan #8

It is a very nice work, but there are some problems in my experiments. Training easily runs into gradient explosion and the loss becomes nan, even when my learning rate is set to 0. Could you give me some advice?

Comments
Thanks for your interest. It's hard to diagnose the problem from the information provided. It may be due to numerical issues; please make sure all outputs are properly normalized. You can also share more details so we can offer better help.
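As a quick sanity check along those lines, here is a minimal sketch (generic PyTorch, not code from this repo; the toy `model` stands in for your detector) that registers forward hooks to catch the first module producing a non-finite output:

```python
import torch
import torch.nn as nn

# Toy stand-in for the detector; replace with your actual model.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())

def check_finite(name):
    def hook(module, inputs, output):
        outs = output if isinstance(output, (tuple, list)) else (output,)
        for t in outs:
            if torch.is_tensor(t) and not torch.isfinite(t).all():
                raise RuntimeError(f"first non-finite output appeared in: {name}")
    return hook

# Hook every submodule so the error names the exact layer at fault.
for name, module in model.named_modules():
    module.register_forward_hook(check_finite(name))

_ = model(torch.randn(1, 3, 32, 32))  # hooks fire during the forward pass
```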
Oh, I have found the reason: the loss becomes nan because of the initialization, so I now use your pre-trained model. But in your MaskRCNN-benchmark, the pre-trained models for Faster-RCNN and Mask-RCNN are different. Why are the models pre-trained on ImageNet different? My guess is that the ResNet parameters are trained on ImageNet, and then you convert the randomly initialized parameters (FPN and RCNNHead) to the weight-standardized form?
Good to know that you found the reason. I'm not sure I understand your question. The models pre-trained on ImageNet for Faster-RCNN and Mask-RCNN are the same; for example, both point to "catalog://WeightStandardization/R-50-GN-WS". The pre-trained models only contain the parameters of the backbones. Other parts such as heads are not included in the pre-trained models.
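For what it's worth, this is how backbone-only checkpoints behave in plain PyTorch (a hedged sketch with a toy `Detector`, not the repo's actual loader): with `strict=False`, the backbone weights are loaded and the head parameters are simply reported as missing, keeping their fresh random initialization.

```python
import torch
import torch.nn as nn

# Illustrative detector: a "backbone" plus a randomly initialized "head".
class Detector(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, 3)
        self.head = nn.Linear(16, 4)

model = Detector()

# Pretend checkpoint containing only backbone parameters, analogous to an
# ImageNet-pre-trained backbone-only file.
ckpt = {k: v for k, v in model.state_dict().items() if k.startswith("backbone.")}

# strict=False: backbone weights are loaded; head parameters are reported
# as missing and keep their random initialization.
missing, unexpected = model.load_state_dict(ckpt, strict=False)
print(missing)      # ['head.weight', 'head.bias']
print(unexpected)   # []
```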
From my experience, nan is caused by either a too-large learning rate or inappropriate batch norm layer statistics. Based on your screenshot, it's unlikely to be the first, as the loss is actually decreasing. I recommend writing some code to check whether the batch norm statistics look reasonable.
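Two generic PyTorch checks along these lines (a sketch under the assumption of a standard training loop; the toy model is illustrative): anomaly detection traces the first nan gradient back to the op that produced it, and the batch norm running statistics can be printed directly.

```python
import torch
import torch.nn as nn

torch.autograd.set_detect_anomaly(True)  # raise at the op producing the first nan gradient

# Toy model standing in for the detector.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())

# Inspect batch norm running statistics; extreme values here are a common
# source of nan losses even when the learning rate is tiny.
for name, m in model.named_modules():
    if isinstance(m, nn.BatchNorm2d):
        print(name,
              "max |running_mean|:", m.running_mean.abs().max().item(),
              "max running_var:", m.running_var.max().item())

loss = model(torch.randn(2, 3, 16, 16)).mean()
loss.backward()  # with anomaly detection on, a nan here is traced to its source
```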
My reply in this issue might help: #1