Learning stops early with reduced batch size #1557
Comments
This page may be of help to you: #430
@danielorf Honestly I had the same question. I was wondering how they did those plots.
The tool is in tools/extra/parse_log.sh

Sergio
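For anyone wondering how to turn that log into a plot, here is a minimal sketch. It assumes parse_log.sh writes a `caffe.log.train` table whose first and third columns are the iteration number and training loss; the exact output filename and column order may differ between Caffe versions, so check the header line it emits.

```python
# Minimal sketch: plot training loss from the table written by
# tools/extra/parse_log.sh. The filename 'caffe.log.train' and the column
# order (iteration, seconds, training loss, ...) are assumptions; check the
# header line of the file your Caffe version produces.
import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt('caffe.log.train', skiprows=1)  # skip the header row
iters = data[:, 0]   # assumed: column 0 = iteration
loss = data[:, 2]    # assumed: column 2 = training loss

plt.plot(iters, loss)
plt.xlabel('Iteration')
plt.ylabel('Training loss')
plt.show()
```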
Sorry, I clearly didn't read your question properly, my mistake.
Where might one find this "caffe.log" file? Edit: Found it. It's located in /tmp/ with a name like "caffe.[hostname].[username].log.INFO.[date]".
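In case it helps anyone else searching for it, here is a small sketch for locating that file, assuming glog's default log directory of /tmp and the naming pattern above:

```python
# Sketch: find the most recent Caffe INFO log, assuming glog's default
# log directory (/tmp) and its usual naming scheme
# caffe.<hostname>.<username>.log.INFO.<date>-<time>.<pid>.
# glog also typically maintains a caffe.INFO symlink to the newest log.
import glob
import os

logs = glob.glob('/tmp/caffe.*.log.INFO.*')
if logs:
    print(max(logs, key=os.path.getmtime))  # most recently modified log file
else:
    print('No Caffe INFO logs found in /tmp')
```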
Hello,
I am using a Tesla C2050, which has compute capability 2.0. It reports an error if I train ImageNet with the default batch_size = 256, like #629.
So I reduced batch_size to 64 and correspondingly changed base_lr from 0.01 to 0.01414, stepsize from 100000 to 400000, and max_iter from 450000 to 1800000. I also changed the bias initialization from 1 to 0.1 for some layers in models/bvlc_reference_caffenet/train_val.prototxt, as suggested in #430; otherwise it does not learn anything (see the scaling sketch after this post).
But my result is not as good as #430: the loss stops decreasing and just oscillates after 20,000 iterations. I also tried the AlexNet model with the same parameter changes except base_lr = 0.02; the result was similar, if not worse.
Any idea what may cause this? Thanks.
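For reference, a sketch of the batch-size scaling arithmetic discussed above. The rules shown (keep the number of training epochs constant by scaling iteration counts with the batch-size ratio; shrink the learning rate linearly or by the square root of the ratio) are general heuristics, not the values the reporter chose, and which learning-rate rule works better is an empirical question.

```python
# Sketch of common heuristics for adjusting solver settings after reducing
# the batch size. These rules are general heuristics, not taken from this
# issue; the reporter's own values are listed in the post above.
old_batch, new_batch = 256, 64
ratio = old_batch / new_batch                 # 4.0

base_lr, stepsize, max_iter = 0.01, 100000, 450000   # original solver values

adjusted = {
    # scale iteration counts up so the net still sees the same number of images
    'stepsize': int(stepsize * ratio),        # 400000
    'max_iter': int(max_iter * ratio),        # 1800000
    # two common learning-rate rules for a smaller batch
    'base_lr_linear': base_lr / ratio,        # 0.0025
    'base_lr_sqrt': base_lr / ratio ** 0.5,   # 0.005
}
print(adjusted)
```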