Multiclass Object detection prediction always returns the same class #3031
Comments
79 training images is a very small amount of data for training an object detector. This is particularly true if you have 38 classes. You're probably going to need a lot more training data to get an effective model, and this is likely the reason for the issues you are having. During training, what happens to the loss? Is it generally decreasing? Is it decreasing for a while and then going back up?
@TobyRoseman I am training with only 79 images because I'm debugging. I already trained with 1500 images, ~4500 bounding boxes, and 59 classes for about 12 hours, and the same thing happened, so I want to save time while debugging. Regarding the loss, it starts at ~14 and ends at ~1, so it is generally decreasing from start to end. Is any character forbidden from being used in a label?
Ok, that sounds good.
That sounds reasonable to me.
There are no forbidden characters for the labels.
Floats should be fine. A few questions:
@TobyRoseman So I ran some tests. I limited myself to 10 images and their labels and trained 3 models: one on TuriCreate 5.8, one on 6.0, and one on 6.1. This behavior seems to happen only on 6.x. However, on 5.8, if the iteration count is too high, training freezes after a while and nothing more is printed to the output, so I need to kill the process and lower max_iterations.
@YassineElbouchaibi - thanks for the update. I'm glad 5.8 is working for you. TuriCreate 6.0 was a major update. We moved the Object Detector (and all of our other deep learning toolkits) from MXNet to TensorFlow. I can't reproduce this issue. I used a two-class dataset and trained an object detector model on Ubuntu. Predictions from this model contain both classes.
I strongly think this issue only happens when you have 3 classes or more... I'll try testing tc v6.1 with a 3-class dataset I found and also with a 2-class dataset to see if my hypothesis is true. If that's the case, I'll leave a link to an interactive Python notebook for the reproduction.
@YassineElbouchaibi - My apologies for not responding sooner. I finally trained an object detection model, on Linux, with a dataset containing more than two classes. I agree there is a serious issue here. On Linux, I trained an object detection model using a dataset with 6 classes, 24 images, and 103 bounding boxes. After 2,000 iterations, I examined predictions on the training set. The model produces roughly the right number of bounding box predictions, but the labels for all predictions come from only two classes, with most being from just one class. I get similar results with validation data. On macOS, the same code and same dataset produce reasonable-looking results (i.e. class predictions have roughly the right distribution). We'll investigate this issue further.
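For anyone who wants to run the same check on their own model, here is a minimal sketch of tallying predicted labels to see whether they have collapsed onto one or two classes. It assumes the predictions are available as one list per image, each holding dicts with a `label` key; the helper name `label_distribution` and the sample data are hypothetical, not taken from this thread.

```python
from collections import Counter


def label_distribution(predictions):
    """Count how often each label appears across all predicted boxes.

    Assumption: `predictions` is an iterable with one entry per image,
    and each entry is a list of dicts that carry a 'label' key.
    """
    counts = Counter()
    for image_predictions in predictions:
        for box in image_predictions:
            counts[box["label"]] += 1
    return counts


# Hypothetical predictions for two images (illustrative values only).
example_predictions = [
    [
        {"label": "Broken glass", "confidence": 0.9},
        {"label": "Broken glass", "confidence": 0.6},
    ],
    [{"label": "Broken glass", "confidence": 0.8}],
]

counts = label_distribution(example_predictions)
# A model trained on many classes that only ever predicts one or two
# distinct labels matches the symptom described in this issue.
distinct_labels = len(counts)
```

Comparing `distinct_labels` with the number of classes the model was trained on gives a quick signal of whether a training run is affected.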
With more than two labels, this issue seems to replicate every time on Linux. This is not an issue on macOS with more than three labels. However, if you take a model trained on Linux and then predict on macOS, all predictions are for only two labels. This seems to be an issue with the training of our TensorFlow implementation.
Two quick updates here:
I've verified that this bug is present in TuriCreate 6.0 and that it was not present in 5.8 (the version before 6.0). Clearly this bug was introduced when migrating the MXNet implementation to TensorFlow. Looking at the TensorFlow implementation, and in particular the loss function, I don't fully understand all of it, but I don't see anything that would cause this issue. I'll start comparing our current loss function to the previous implementation.
Is this issue fixed now, after commit 85a8e44?
@MaddyThakker - this issue is fixed with that commit. However, we've identified another issue with our TensorFlow implementation of object detection. The performance has significantly degraded since our 6.1 release. I'm actively investigating that issue and will do a point release once it is resolved.
Thanks for the update @TobyRoseman. Would building TuriCreate from source include this fix?
That would fix the issue of only making predictions for two classes, but your accuracy will be quite poor.
This issue was fixed by #3160. We just released a point release (TuriCreate 6.2.2) with this fix. Upgrade your version of TuriCreate and you should be good to go.
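As a rough way to tell whether an installed version falls in the affected range, here is a small sketch. It assumes plain numeric version strings such as "6.1" or "6.2.2", and it assumes every release from 6.0 up to (but not including) 6.2.2 is affected; this thread only confirms 6.0 and 6.1 as affected and 5.8 as unaffected. The function names are hypothetical.

```python
def parse_version(version):
    """Turn a numeric version string like '6.2.2' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_affected(version):
    """Return True if the version lies in the assumed affected range.

    Assumption: the bug arrived in 6.0 and every release before 6.2.2
    carries it. Only 6.0 and 6.1 are confirmed in this thread.
    """
    return (6, 0) <= parse_version(version) < (6, 2, 2)


# The installed version can typically be read from the package itself,
# for example turicreate.__version__, and passed in as a string.
results = {v: is_affected(v) for v in ["5.8", "6.0", "6.1", "6.2.2"]}
```

If the check reports an affected version, upgrading with pip to 6.2.2 or later is the fix described above.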
Multiclass object detection prediction always returns the same class within a trained model, but the class differs after each training run. Let me clarify.
I have a dataset, and here is its head:
Here is one example of the annotations:
The corresponding (labeled) image is this one:
But after training an object detection model, here is what I get:
In fact, Broken glass is the only label I get:
...
So I checked model.classes, and this is what I got (the right thing):
I also checked model.summary (everything seems legitimate):
I would normally train for longer, but I limited the training because this problem happens even if I train for around 6-7 hours.
I work on Google Colab, and here is a bit of my code:
Here is the output I get before all the iterations:
In conclusion, everything seems normal, but I only get one class for all my predictions afterwards.
Thanks guys!