
Cannot reproduce ilsvrc2012 validation results #1644

Closed
shaibagon opened this issue Dec 28, 2014 · 11 comments

Comments

@shaibagon
Member

Hi,
I built caffe-dev and tried the "out of the box" bvlc_googlenet model. The top-5 score I get on the ilsvrc2012 validation set is:
Test net output #8: loss3/top-5 = 0.831125
While the expected score (according to the wiki) is 0.89.
I suspect I did not prepare my validation set properly.
I used the script examples/imagenet/create_imagenet.sh to create the images lmdb, setting RESIZE_HEIGHT and RESIZE_WIDTH to 227.

I experience the same issue with the vgg 19-layer model.

What am I missing here?

@ducha-aiki
Contributor

Hi @shaibagon

RESIZE_HEIGHT and RESIZE_WIDTH to 227.

The default option is 256x256, so that might be the cause.
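
As far as I can tell, convert_imageset does a plain resize to the exact target size (no aspect-preserving crop), so the stored images get squashed to 256x256 and the data layer then takes its crops from that. A rough Python/PIL sketch of what I believe the equivalent resize looks like (the file names are just placeholders):

```python
# Rough equivalent of the resize applied when RESIZE_HEIGHT/RESIZE_WIDTH are set
# in create_imagenet.sh (assumption: plain resize, aspect ratio not preserved).
from PIL import Image

RESIZE_W, RESIZE_H = 256, 256  # create_imagenet.sh defaults

def resize_like_convert_imageset(src_path, dst_path):
    img = Image.open(src_path).convert("RGB")
    # Squash to exactly 256x256 regardless of the original aspect ratio.
    img.resize((RESIZE_W, RESIZE_H)).save(dst_path)

# resize_like_convert_imageset("ILSVRC2012_val_00000001.JPEG", "resized.jpg")
```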

@shaibagon
Member Author

@ducha-aiki - thank you for your reply.
I re-ran it with RESIZE_HEIGHT and RESIZE_WIDTH set to 256. It made no change to the test results; I still get top-5 ~83%.
What am I missing here?

The output I get from caffe test -model ./models/bvlc_googlenet/train_val.prototxt -weights ./models/bvlc_googlenet/bvlc_googlenet.caffemodel is:

I1229 08:59:31.476357 17712 caffe.cpp:174] Loss: 3.37957
I1229 08:59:31.476372 17712 caffe.cpp:186] loss1/loss1 = 2.48485 (* 0.3 = 0.745455 loss)
I1229 08:59:31.476384 17712 caffe.cpp:186] loss1/top-1 = 0.522318
I1229 08:59:31.476394 17712 caffe.cpp:186] loss1/top-5 = 0.750114
I1229 08:59:31.476408 17712 caffe.cpp:186] loss2/loss1 = 2.25783 (* 0.3 = 0.67735 loss)
I1229 08:59:31.476418 17712 caffe.cpp:186] loss2/top-1 = 0.589818
I1229 08:59:31.476428 17712 caffe.cpp:186] loss2/top-5 = 0.798568
I1229 08:59:31.476440 17712 caffe.cpp:186] loss3/loss3 = 1.95676 (* 1 = 1.95676 loss)
I1229 08:59:31.476451 17712 caffe.cpp:186] loss3/top-1 = 0.645795
I1229 08:59:31.476461 17712 caffe.cpp:186] loss3/top-5 = 0.83275

@ducha-aiki
Contributor

@shaibagon
That is weird. Here is my log:
https://gist.github.com/ducha-aiki/21f41b1887749b122ca7
Do you have the original imagenet 2012 val set without any preprocessing?

@shaibagon
Member Author

I downloaded the images using the image URLs, with no preprocessing. The images are stored on my device in their original size; only the examples/imagenet/create_imagenet.sh script resizes them to 256x256. Should I somehow crop the images to a 1:1 aspect ratio as preprocessing? What happens to the image aspect ratio during the resize?

@ducha-aiki
Contributor

No, that looks ok. Have you experienced the performance drop for bvlc_reference_caffenet as well?
As for the resizing via convert_imageset, I personally don't use it; instead I store the original images using #1239 and do on-the-fly resizing during train/val. But my results in the wiki do not differ from the BVLC reported performance, so that is not the cause.
Are you sure that you test all images, i.e. batch_size * num_iterations == 50 000? Sorry if that sounds silly, it is just a sanity check to be sure. Could you post a log file like I did?
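
Just arithmetic, but to make the check concrete, something like this (plain Python, no Caffe needed) gives the test_iter you would need for a given batch size:

```python
import math

def required_test_iter(num_images, batch_size):
    # Smallest test_iter such that batch_size * test_iter >= num_images.
    # Note: Caffe's data layer wraps around the lmdb, so the last batch may
    # re-read a few images from the beginning (slight double counting).
    return math.ceil(num_images / batch_size)

print(required_test_iter(50000, 50))  # 1000
print(required_test_iter(50000, 32))  # 1563
print(required_test_iter(44000, 32))  # 1375
```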

@shaibagon
Member Author

Due to device capacity issues I use batch size 32 at test time as well as training time.
Moreover, since I downloaded the validation set from image URLs, I do not have all 50K images, only about 44K of them. I do not suspect this is the cause of a ~5% drop in performance, though...

I just tested the reference caffenet model as well; I get top-1 accuracy of 52%, as opposed to the 57% reported here.

Something is fishy here...

@ducha-aiki
Contributor

Due to device capacity issues I use batch size 32 in test time as well as training time.

This cannot be a problem, unless you haven't proportionally increased the number of test iterations.

I do not suspect this is a cause of ~5% drop in performance, though...

It could be, indeed. You can post your val.txt somewhere (assuming your filenames are still like ILSVRC2012_val_000000*.JPEG) and I can check the performance on this list.

@shaibagon
Member Author

Regarding the number of test iterations - I changed it such that #iterations * batch_size = number of validation examples. Moreover, even when testing on a subset of the validation set, the performance is still roughly 83% (top-5 googlenet) or 52% (top-1 reference caffenet).
I posted my val.txt file in:
https://gist.github.com/shaibagon/4850c225bd7e19f87142

Thank you VERY MUCH for your help.

@ducha-aiki
Contributor

(bs=6)*(#iter=7331) = 43986 examples (you are unlucky to have such a badly factorized number :)
My bvlc_alexnet results are

accuracy = 0.569068
accuracy5 = 0.796894

Some problem with your images, I suppose. Try sorting them by size; maybe some of them were downloaded with errors and are ~1Kb or so. The test log is here:
https://gist.github.com/ducha-aiki/27bec59ec6e51c84e53e
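
A minimal sketch of the size check (the directory name is just a placeholder for wherever your val JPEGs live):

```python
# List the smallest validation files first: truncated or failed downloads
# usually show up here with sizes of a few KB or less.
import glob, os

val_dir = "/path/to/ilsvrc12_val"  # placeholder, adjust to your layout
files = sorted(glob.glob(os.path.join(val_dir, "ILSVRC2012_val_*.JPEG")),
               key=os.path.getsize)

for path in files[:20]:
    print(os.path.getsize(path), path)
```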

@shaibagon
Member Author

@ducha-aiki - I started sifting through my images. There might be some issues with downloading the individual URLs. I'll update if I come to any conclusion here.

@shaibagon
Member Author

@ducha-aiki - I found it! It turns out that downloading a non-existent flickr URL results in this picture (and not an error, as I would expect):
[image: ilsvrc2012_val_00000007 - the flickr placeholder picture]
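
In case it helps anyone else, a quick sketch along these lines finds them by grouping files by hash (directory name is a placeholder; assumes the placeholder picture is byte-identical for every dead URL):

```python
# Group validation images by MD5; a hash shared by many files is almost
# certainly the flickr "photo unavailable" placeholder (or some other error page).
import glob, hashlib, os
from collections import defaultdict

val_dir = "/path/to/ilsvrc12_val"  # placeholder, adjust to your layout
by_hash = defaultdict(list)

for path in glob.glob(os.path.join(val_dir, "ILSVRC2012_val_*.JPEG")):
    with open(path, "rb") as f:
        by_hash[hashlib.md5(f.read()).hexdigest()].append(path)

for digest, paths in sorted(by_hash.items(), key=lambda kv: -len(kv[1])):
    if len(paths) > 1:
        print(len(paths), digest, paths[:3])
```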

Clearing these from my set yields (for bvlc_googlenet):

I1229 14:47:19.535240 31260 caffe.cpp:174] Loss: 2.33781
I1229 14:47:19.535255 31260 caffe.cpp:186] loss1/loss1 = 1.91557 (* 0.3 = 0.57467 loss)
I1229 14:47:19.535266 31260 caffe.cpp:186] loss1/top-1 = 0.5525
I1229 14:47:19.535277 31260 caffe.cpp:186] loss1/top-5 = 0.803125
I1229 14:47:19.535290 31260 caffe.cpp:186] loss2/loss1 = 1.53597 (* 0.3 = 0.460792 loss)
I1229 14:47:19.535300 31260 caffe.cpp:186] loss2/top-1 = 0.625
I1229 14:47:19.535310 31260 caffe.cpp:186] loss2/top-5 = 0.850312
I1229 14:47:19.535321 31260 caffe.cpp:186] loss3/loss3 = 1.30235 (* 1 = 1.30235 loss)
I1229 14:47:19.535331 31260 caffe.cpp:186] loss3/top-1 = 0.68
I1229 14:47:19.535341 31260 caffe.cpp:186] loss3/top-5 = 0.88625

Which is close enough for me.

Thank you very much for your help and patience!
