GPU vs CPU: NAN #3
Comments
A "NaN explosion" can be triggered by a single NaN (NaN is contagious). A recurrent network may amplify the instability, so reducing some hyperparameters (including the learning rate, momentum, and the scale of the initial weight distribution) by a factor of 10^-3 might stabilize the feedback.
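To illustrate why a single NaN is "contagious", here is a minimal numpy sketch (not Caffe code): one bad value entering a matrix multiply poisons a whole row, and the next multiply poisons everything downstream.

```python
import numpy as np

# One NaN, e.g. from an exploded gradient, contaminates every
# downstream value that arithmetically touches it.
h = np.ones((4, 4), dtype=np.float32)
h[0, 0] = np.nan                      # a single bad activation

W = np.ones((4, 4), dtype=np.float32)

out = h @ W                           # row 0 of the result is now all NaN
print(np.isnan(out).sum())            # -> 4

out2 = W @ out                        # one more layer: everything is NaN
print(np.isnan(out2).sum())           # -> 16
```

In a recurrent network the same weights are applied at every time step, so this spreading happens once per step, which is why instability shows up so quickly.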
But if I am loading a pretrained model and it sometimes works with GPU and occasionally doesn't... is your comment still valid? Meanwhile, it always works with CPU!
Have you fixed the random seed? AFAIK the results are not guaranteed to be the same across architectures (CPU/GPU).
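To rule out run-to-run initialization differences, Caffe's `SolverParameter` accepts a `random_seed` field in the solver prototxt, which seeds the weight fillers and dropout. A minimal excerpt (the file names and values here are hypothetical):

```
# solver.prototxt -- hypothetical excerpt
net: "lstm_train_val.prototxt"   # hypothetical net definition file
random_seed: 1701                # fixes RNG for reproducible runs
base_lr: 0.001
```

Note that even with a fixed seed, CPU and GPU results can still diverge slightly because floating-point reduction order differs between the two backends.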
It should pass all the tests (that takes a long time).
I have been using this implementation extensively in the video domain.
Hi,
Sometimes, when I have some network and its data, it works well when I run it on the CPU, but when I run it on the GPU it gives NaN in the softmax outputs and bad accuracy. This happens whether the network starts from previously computed weights or from random initialization.
Note that the same network may later work normally. When this problem occurs, it happens in all my networks based on the LSTM layer. At the same time, if I remove that layer, Caffe works well on the GPU.
Is it possible that the library has a bug in its GPU part? Any help with that?
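For reference, one common way softmax itself produces NaN is an unguarded exponential overflowing to infinity, giving inf/inf. Caffe's SoftmaxLayer subtracts the per-sample max to avoid this, so the sketch below (plain numpy, not Caffe code) only illustrates the failure mode, which more likely originates upstream in the LSTM activations:

```python
import numpy as np

def softmax_naive(z):
    # exp(large) overflows to inf; inf / inf then yields NaN
    e = np.exp(z)
    return e / e.sum()

def softmax_stable(z):
    # shifting by the max keeps exp() in range; the result is unchanged
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([1000.0, 1000.0, 10.0], dtype=np.float32)

with np.errstate(over="ignore", invalid="ignore"):
    print(np.isnan(softmax_naive(logits)).any())   # -> True
print(softmax_stable(logits))                      # -> [0.5, 0.5, 0.0]
```

So if the softmax inputs (the layer just before it) are already huge or NaN on the GPU path, the softmax output will be NaN regardless of how the softmax itself is implemented; logging the pre-softmax blob on both backends would localize the divergence.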