
DIGITS image classification view does not correctly handle output of FCNs #1492

Open

AliaMYH opened this issue Mar 4, 2017 · 16 comments

@AliaMYH commented Mar 4, 2017

I've trained a network and its validation accuracy is very good, almost 100 percent. However, when I run 'Classify Many' using the dataset's val.txt, the accuracy returned is terrible: only one of the classes is predicted every time. I'm not exactly sure what the problem is.
My validation set is 20% of my training set; I used the split option in DIGITS.

[Two screenshots of the Classify Many results, dated Mar 4, 2017, were attached here.]

@AliaMYH (Author) commented Mar 4, 2017

I tried using the mean pixel option as recommended in #625, but the same thing happens.

Classify One gives correct results.

@AliaMYH (Author) commented Mar 4, 2017

@samansarraf According to your comment on #625, am I meant to preprocess all my data outside of DIGITS before training?

@samansarraf

@AliaMYH your understanding is correct. To solve the issue at the time, I preprocessed my data myself so that it was completely independent of the Classify Many preprocessing module; there must be something in that code that doesn't work in some cases. After I preprocessed the data outside DIGITS (including image resizing), Classify Many produced exactly the same results I was getting during training and validation. Don't give up! You are pretty close.
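
As a rough illustration of the offline preprocessing described above, a minimal sketch assuming OpenCV; the directory names and target size are placeholders and must match the dataset settings used in DIGITS:

```python
import glob
import os

import cv2  # assumed available; any image library would do

# Hypothetical paths and target size -- adjust to match the DIGITS dataset.
SRC_DIR = "raw_images"
DST_DIR = "preprocessed_images"
TARGET_SIZE = (256, 256)  # (width, height) the network was trained on

os.makedirs(DST_DIR, exist_ok=True)

for path in glob.glob(os.path.join(SRC_DIR, "*.jpg")):
    img = cv2.imread(path)
    # Resize every image ourselves so Classify Many's own preprocessing
    # has nothing left to do beyond passing the image through.
    img = cv2.resize(img, TARGET_SIZE, interpolation=cv2.INTER_LINEAR)
    cv2.imwrite(os.path.join(DST_DIR, os.path.basename(path)), img)
```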

@AliaMYH (Author) commented Mar 4, 2017

A curious thing, though, is that I used the exact same dataset and cropping with two different networks: one gave the above results, while the other worked perfectly. The one with the above results is a fine-tuned SqueezeNet; the one that gave proper results was a fine-tuned AlexNet, which I trained for comparison. I'm not exactly sure why this is.

Also, does this mean that only the displayed results are incorrect, or the entire training of the network, considering that the 'Classify One' results are correct? Is deploy.prototxt what is used during inference?

@samansarraf @gheinrich

@samansarraf

@AliaMYH No, the entire training (including validation) process is correct; only Classify Many has a preprocessing problem, in my opinion. I had the same kind of problem with GoogLeNet, so what I did was crop the data before training and use the cropped images for both training and the Classify Many module; then it worked.

@AliaMYH (Author) commented Mar 5, 2017

I agree as well. @gheinrich Can you confirm this?

@gheinrich (Contributor)

@AliaMYH are you saying that the standard AlexNet provided in DIGITS gives proper results whereas the implementation of SqueezeNet you're using doesn't? It might be a case of DIGITS being confused by the output of the network. Are you using a single SoftMax output in SqueezeNet?

As a side note, if small differences in pre-processing lead to wildly different accuracy, that's a sign of overfitting. You'd have to see how well the model generalizes to unseen samples.

@gheinrich (Contributor)

@AliaMYH you might want to verify the shape of the network output. It is probably a case of the network outputting one more dimension than DIGITS expects, which would confuse the Classify Many path.
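
One way to check this, as a sketch assuming pycaffe and file names like those in a DIGITS job directory:

```python
import caffe  # pycaffe, as shipped with the Caffe build DIGITS uses

# Hypothetical file names -- substitute the deploy.prototxt and snapshot
# from your own DIGITS job directory.
net = caffe.Net("deploy.prototxt", "snapshot_iter_1000.caffemodel", caffe.TEST)

# Output blob shapes are determined by the prototxt alone; no forward pass needed.
for name in net.outputs:
    print(name, net.blobs[name].data.shape)
# A classifier DIGITS understands prints e.g. softmax (1, 5);
# a fully convolutional net may instead print softmax (1, 5, 1, 1).
```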

@gheinrich (Contributor)

Any feedback please, @AliaMYH?

@SlipknotTN

My colleagues and I have noticed the same problem, @gheinrich: the SqueezeNet model (we tested v1.1) outputs one additional dimension, which causes the problem in the confusion matrix.
You can fix the problem at the inference.py level (b2062c6) or in classification/views.py (dbcb9ed); the first solution should be better, though it may be possible to find other solutions once we understand why SqueezeNet behaves this way.
We could review the problem and make a PR if you like.

@gheinrich (Contributor)

I had never looked at SqueezeNet before. The reason you get more dimensions is that this is a fully convolutional network, unlike the typical classification CNN (e.g. AlexNet), which has a fully convolutional feature extractor followed by a classifier made of fully-connected layers. SqueezeNet therefore produces a spatial output, while DIGITS expects a flattened probability distribution over classes. I think it's better to fix DIGITS in classification/views.py. We could, for example, collapse the dimensions that have a cardinality of one with np.squeeze. If you are willing to take this up @SlipknotTN it would be very much appreciated, thanks in advance!
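
For illustration, a minimal sketch of the suggested collapse with illustrative shapes; note that a bare np.squeeze would also drop a batch dimension of one, which is presumably one of the edge cases an actual fix has to guard against:

```python
import numpy as np

# A fully convolutional classifier such as SqueezeNet emits a spatial map,
# e.g. shape (1, 1000, 1, 1), where DIGITS expects a flat class distribution.
fcn_output = np.zeros((1, 1000, 1, 1))  # stand-in for the network output

print(np.squeeze(fcn_output).shape)  # (1000,) -- all singleton axes collapsed
```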

@gheinrich changed the title from "Classify Many gives much worse results than validation accuracy" to "DIGITS image classification view does not correctly handle output of FCNs" on Mar 14, 2017
@nollimahere commented Mar 29, 2017

Hi, I created an account just to make this comment, so I apologize for appearing out of the blue.

My peers and I work with SqueezeNet and AlexNet on Ubuntu 14.04 (package 'digits 4.0.0-1'), and I've found through my testing that the proposed change to inference.py in b2062c6 allows us to run the 'Classify Many' web routine without issues on either type of network.

If, however, we make the change proposed for classification/views.py in pull #1536, then we are not able to run 'Classify Many' on AlexNet or any of the other typical classification CNNs. Similarly, the changes committed in dbcb9ed also break the 'Classify Many' routine on AlexNet.

@SlipknotTN

Did you use the AlexNet definition present by default in DIGITS, or another definition? I'd like to try the exact version you use.
The proposed fixes operate at different levels: the change to inference.py fixes the output earlier, while the others act only when you run Classify Many. I made the PR following @gheinrich's suggestion, so I put the fix in classification/views.py, like dbcb9ed but with more error checks. I have tested the PR with SqueezeNet and GoogLeNet, but a model like AlexNet, with a final fully-connected layer, might break the fix.
Just to be precise: GoogLeNet works well even before the patch, SqueezeNet needs the patch, and I didn't test AlexNet in any form.
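
For illustration, a hedged sketch of the kind of guarded collapse being discussed: flatten only genuine singleton spatial axes so that a conventional fully-connected (N, C) output passes through untouched. The function name is hypothetical, not the actual PR code:

```python
import numpy as np

def normalize_scores(scores):
    """Collapse a 4-D FCN output of shape (N, C, 1, 1) to (N, C);
    leave a conventional fully-connected (N, C) output untouched."""
    if scores.ndim == 4 and scores.shape[2:] == (1, 1):
        return scores.reshape(scores.shape[0], scores.shape[1])
    return scores

print(normalize_scores(np.zeros((4, 10, 1, 1))).shape)  # SqueezeNet-style -> (4, 10)
print(normalize_scores(np.zeros((4, 10))).shape)        # AlexNet-style   -> (4, 10)
```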

@nollimahere

Most recently we have been using the vanilla AlexNet from DIGITS, and that is what I tested in our environment when I made that comment.

@SlipknotTN

Hmm... I have tested the PR with AlexNet from DIGITS and it works. Did you try the entire branch https://github.com/cynnyx/DIGITS/tree/fcn-fix-pr, or only port the fix to DIGITS 4?
Which error do you encounter? Are all the classifications assigned to the first column?

Meanwhile, we have added the same fix for the Top-N category function to the PR.

@nollimahere commented Apr 5, 2017

Okay: the issue I was having was not a relic of DIGITS 4. I included the 2nd commit made by @belalessandro in your pull #1536, and that 2nd commit correctly handles the 'Classify Many' web routine without any other changes needed in inference.py.

Previously we were still getting only a single class prediction in the confusion matrix. So, for what it's worth, I believe this commit works great, and I look forward to seeing it downstream in the Ubuntu repos.
