Tensorflow image loader image would cause different result #2
Comments
I've nothing to add, but I've seen similar results during testing, which has led me to use the original classifier with caffe. I haven't had the time to look through the code more thoroughly, but I'd suggest comparing the resize logic: https://github.com/yahoo/open_nsfw/blob/master/classify_nsfw.py#L19
Hey @zlin3000, I've put a lot of time into investigating this issue, to no avail... @delta9 and you might be right in suspecting different resize logic. Another reason might be different jpeg decoding mechanisms (see here and here). I haven't found the time to investigate this further, but I would love to solve this once and for all. I don't know when I'll get around to looking into it again, though. Help is always appreciated :)
@zlin3000 @delta9 @mdietrichstein Has anyone found a solution for the issue?
@hristorv Not yet, I'm afraid
I have fixed a bug in the model definition (e1ada8d) which definitely corrupted some classifications. It would be awesome if some of you could run your checks again and let me know if there are still major differences between the implementations. Thanks!
Hey @mdietrichstein, I just did some quick random tests and still found major differences (three attached images: 50 KB, 449 KB, 1.7 MB).
These are the NSFW scores for some images from ... reddit. At first I thought it had something to do with the image size, but sadly it's all over the place. Thank you so much for your work though!
Hey @delta9 Thanks for your help!
I've found out that tensorflow and caffe (original implementation) use different approaches with regard to padding when doing convolutions, pooling, etc. I've made some adaptations to the model and it looks like it delivers better results now when using the yahoo image loader (
I've spent some more time on this and have identified two serious problems:
- Padding issues when doing pooling/convolutions
- Replicating the original image loading and preprocessing procedure is hard

On top of that, their model is very sensitive to changes in e.g. the JPEG codec, quality level, etc. I don't think it's possible to perfectly replicate the whole process with plain tensorflow, due to different jpeg encoding/decoding and resize implementations/configurations between PIL, skimage and tensorflow. That being said, I was still able to adapt the tensorflow loading code in a way that makes the difference a lot smaller than before (at least for my tests). The biggest difference I've observed was about 0.02. @delta9 @zlin3000 It would be awesome if you could check out the new version and test if the results have improved for you too.
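To make the padding point above concrete, here is a minimal sketch of how the two frameworks size and pad a layer's output. It is not taken from this repository: the function names are hypothetical and the layer shapes (224 input with a 7x7 stride-2 convolution, 112 input with a 3x3 stride-2 pooling) are only illustrative. It assumes Caffe's usual rules (explicit symmetric padding, floor for convolutions, ceil for pooling) and TensorFlow's `SAME` rule (output is ceil(size / stride), with any odd padding pixel placed at the end).

```python
import math


def caffe_conv_out(size, kernel, stride, pad):
    """Caffe convolution: `pad` pixels on both sides, floor division."""
    return (size + 2 * pad - kernel) // stride + 1


def caffe_pool_out(size, kernel, stride, pad):
    """Caffe pooling: `pad` pixels on both sides, ceil division."""
    out = math.ceil((size + 2 * pad - kernel) / stride) + 1
    # Caffe drops the last window if it would start inside the padding.
    if pad > 0 and (out - 1) * stride >= size + pad:
        out -= 1
    return out


def tf_same(size, kernel, stride):
    """TensorFlow 'SAME': returns (output size, pad before, pad after)."""
    out = math.ceil(size / stride)
    total = max((out - 1) * stride + kernel - size, 0)
    before = total // 2  # the extra pixel, if any, goes at the end
    return out, before, total - before


if __name__ == "__main__":
    # Convolution: same output size, but Caffe pads (3, 3) while
    # TensorFlow pads unevenly, so every window is shifted by one pixel.
    print("caffe conv:", caffe_conv_out(224, 7, 2, 3), "pad (3, 3)")
    print("tf SAME conv:", tf_same(224, 7, 2))

    # Pooling without explicit padding: Caffe's ceil rule still produces
    # an extra output that hangs over the edge of the input.
    print("caffe pool:", caffe_pool_out(112, 3, 2, 0))
    print("tf SAME pool:", tf_same(112, 3, 2))
```

If the padding amounts differ, one way to reproduce Caffe's behaviour in tensorflow is to pad explicitly (for example with `tf.pad`) and then run the layer with `VALID` padding. Whether that matches what the fix in this repository does is an assumption on my part.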
Thanks for the detailed explanation! My use case would have been to use the converted model with Tensorflow Serving in conjunction with a mobile app backend to check user generated content on upload. I was hoping for higher performance over the original caffe script since invoking the python script using a wrapper each time has a lot of overhead.
So, if I want to use it with Tensorflow Serving, I would need to preprocess the images using the yahoo image loader and then send the data over for prediction?
That's correct. You could also try to use the improved tensorflow image loader and check if the results are good enough for your use case. I'm currently trying to get access to an NSFW dataset to evaluate both image loader implementations and get some real numbers on the differences between them.
Using the three datasets listed:
yahoo image loader gave the following results:
Original caffe yahoo NSFW gave the following results:
Overall, the difference of 4.3*10^-6 between the yahoo image loader implemented in tensorflow and the original caffe yahoo NSFW model is not significant on this dataset.
tensorflow image loader results: 3) porn images:
Hey @waheebyaqub! Thank you so much for posting your results here. May I ask which dataset you were using for your test? I'm planning to use this dataset for a detailed comparison in the future.
@mdietrichstein, I actually used the same data that you have linked, with some preprocessing on the porn frames.
So has this problem been solved?
@waheebyaqub Alright, thanks!
@liudanking If you use the yahoo image loader then yes, the issue is fixed.
@mdietrichstein Partially solved is still cool!
Not in the near future since I'm spending most of my time on a different project at the moment. I'd like to look into it once I have a bit more time though.
Hi guys, @mdietrichstein @waheebyaqub, is there any chance you can provide an alternative link to download the dataset? The one on google sites no longer seems to be working (the dataset download fails, although the site is up).
@mdietrichstein @waheebyaqub thanks for your suggestions. Could you please share the final results of this new tuning, and which is the improved version of the original Yahoo! weights of Also, are the score ranges posted above specific to this test, or can we consider them valid in general?
I randomly tested several images, and the difference is between 0.10 and 0.20.
In fact, I tested the code step by step, and found that the resize method might be the problem which causes this.
I also used opencv instead of PIL to do the resize, and the final result is similar to the tensorflow resize. Moreover, I compared the resize results of PIL and opencv, and they are quite different: for example, the max difference value in one image is about 25, and the RMSD is about 3.
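For anyone who wants to repeat the resize comparison above, the two numbers can be computed roughly like this. It is only a sketch: the function names are hypothetical, the pixel values in the example are made up, and it assumes both resized images have already been flattened into equally long sequences of pixel values (for example with `numpy.asarray(img).ravel()`).

```python
import math


def max_abs_diff(a, b):
    """Largest per-pixel absolute difference between two flattened images."""
    if len(a) != len(b):
        raise ValueError("images must contain the same number of values")
    return max(abs(x - y) for x, y in zip(a, b))


def rmsd(a, b):
    """Root-mean-square deviation between two flattened images."""
    if len(a) != len(b):
        raise ValueError("images must contain the same number of values")
    squared = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(squared / len(a))


if __name__ == "__main__":
    # Made-up pixel values standing in for the same image resized
    # by two different libraries.
    resized_a = [10, 20, 30, 40]
    resized_b = [10, 22, 27, 40]
    print("max difference:", max_abs_diff(resized_a, resized_b))
    print("RMSD:", rmsd(resized_a, resized_b))
```

Make sure both inputs use the same channel order and dtype before comparing them. opencv loads images as BGR by default while PIL uses RGB, and a mismatch there would inflate both numbers.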
Lastly, I read some articles which point out that adding noise to an image might cause a totally different result, even though a human cannot see the difference between the two images.
PS: thanks for this repository, which helped me save time; otherwise I might have needed to spend lots of time converting caffe to tensorflow. :)