Again: Training imagenet: loss does not decrease #3243

Closed
dereyly opened this issue Oct 23, 2015 · 4 comments
Comments


dereyly commented Oct 23, 2015

I have the same problem as #401, but a different story. Half a year ago I successfully trained the ImageNet Caffe net with CUDA 6.5 and cuDNN v2 (45,000 iterations, with the same result as the base model). Now I am trying to train it again and it fails: the loss does not decrease, it stays around 6.9, and accuracy = 0. The system and driver are the same.
Testing the old model is fine (expected accuracy).

I read through the related issues: shuffling is not the problem, the LMDB data is the same, and the prototxt and build are the same.

Before writing this question I rebuilt Caffe with CUDA 7.0, cuDNN v3, and the new driver 346.39, and I have the same problem.

I run tests for about 10k-20k iterations. My old snapshot had 16% accuracy at 10k iterations; now it is 0.
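For context, a loss stuck around 6.9 is the chance-level cross-entropy for 1000 ImageNet classes (-ln(1/1000) ≈ 6.908), i.e. the network is still guessing at random; a quick sanity check in Python:

```python
import math

# Cross-entropy loss of a uniform (random-guess) prediction over the
# 1000 ImageNet classes; a training loss pinned near this value means
# the network has not learned anything yet.
num_classes = 1000
chance_loss = -math.log(1.0 / num_classes)
print(chance_loss)  # ~6.9078
```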

@baiyancheng20

Hi, have you resolved the problem? I have run into this problem, too.


dereyly commented Jan 19, 2016

Hi. Yes. I use MSRA initialization in the convolution layers and the same Gaussian initialization in the fully connected layers.
Later I also tried LSUV: https://github.com/ducha-aiki/LSUVinit ("All you need is a good init"); it works better than MSRA.
Finally, batch normalization may solve this problem too.
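In Caffe these initializations are set through the weight_filler of each layer. A minimal sketch using the pycaffe NetSpec interface, assuming pycaffe is available; the layer names and sizes below are illustrative, not the exact net from this thread:

```python
# Sketch: MSRA filler in convolution layers, Gaussian filler in fully
# connected layers, as suggested above. Shapes and names are illustrative.
import caffe
from caffe import layers as L

n = caffe.NetSpec()
# Dummy input standing in for the real ImageNet LMDB data layer.
n.data = L.Input(shape=dict(dim=[1, 3, 227, 227]))
n.conv1 = L.Convolution(n.data, num_output=96, kernel_size=11, stride=4,
                        weight_filler=dict(type='msra'),          # MSRA / He init
                        bias_filler=dict(type='constant', value=0))
n.relu1 = L.ReLU(n.conv1, in_place=True)
n.fc6 = L.InnerProduct(n.relu1, num_output=4096,
                       weight_filler=dict(type='gaussian', std=0.005),  # Gaussian init
                       bias_filler=dict(type='constant', value=0.1))
print(n.to_proto())  # prints the equivalent prototxt with the fillers set
```

The same effect can be had by editing weight_filler { type: "msra" } directly in the training prototxt.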

@baiyancheng20

@dereyly thank you. I used the MSRA init, and it works. Have you re-trained the AlexNet? Did it achieve the same test accuracy (58% on val)?


dereyly commented Jan 20, 2016

Yes. I trained for 450k iterations (but you can stop earlier) and got the same result: top-1 = 58% and top-5 = 81%.
