Inception Score calculation #29
Comments
The first point is an important issue. For the third point, note that they do NOT intend to calculate the Inception Score (IS) over all 50,000 inputs at once. Rather, they split the 50,000 samples into 10 splits of 5,000 samples each, calculate the IS for each split, and return the average IS over the splits. So the code is correct.
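For readers following along, here is a minimal numpy sketch of that split-and-average procedure; the function and variable names (`inception_score_from_preds`, `preds`, `splits`) are illustrative, not the repository's actual code:

```python
import numpy as np

def inception_score_from_preds(preds, splits=10):
    """preds: (N, num_classes) array of softmax outputs; returns (mean, std) of IS over splits."""
    n = preds.shape[0]
    scores = []
    for i in range(splits):
        part = preds[i * n // splits:(i + 1) * n // splits]       # e.g. 5,000 of the 50,000 samples
        p_y = np.mean(part, axis=0, keepdims=True)                 # marginal p(y) estimated within the split
        kl = np.sum(part * (np.log(part) - np.log(p_y)), axis=1)   # KL(p(y|x) || p(y)) for each sample
        scores.append(np.exp(np.mean(kl)))                         # Inception Score of this split
    return np.mean(scores), np.std(scores)
```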
So, which version is correct?
Hi, I have rewritten the code for calculating the Inception Score, taking the first problem into consideration: https://github.com/tsc2017/inception-score As for the second problem, since the softmax function hardly ever outputs an exact 0 for a category, the conditional and marginal distributions of y are supported on all 1000 classes, so it is unlikely to hit a 0*log(0), log(∞), or divide-by-zero error, and I do not observe any numerical instability with either the old implementation or my new one. Lastly, since the Inception Score is approximated by a statistic of a sample, just make sure the sample size is big enough. The common use of 50,000 images in 10 splits seems acceptable. Taking the CIFAR-10 training set images as an example, the Inception Score is around 11.34 with 1 split and 11.31±0.08 with 10 splits.
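As a rough sanity check of the "no exact zeros" claim, here is a small synthetic experiment (fake logits, not actual Inception outputs); note that with extreme logit gaps a float32 softmax can still underflow to zero, so this only illustrates the typical case:

```python
import numpy as np

rng = np.random.default_rng(0)
logits = rng.normal(scale=5.0, size=(1000, 1000)).astype(np.float32)  # synthetic stand-in for network logits
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)                              # numerically stable softmax

p_y = probs.mean(axis=0)
kl = np.sum(probs * (np.log(probs) - np.log(p_y)), axis=1)
print(probs.min() > 0)          # True: every class gets a strictly positive probability
print(np.isfinite(kl).all())    # True: no nan/inf from 0*log(0) or log(0)
```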
Can anyone tell me where I can find material about the Inception Score?
@lipanpeng https://arxiv.org/abs/1801.01973 @xunhuang1995 The third point is valid, because splits of 5,000 samples might be too small to adequately represent 1,000 classes. And as they show in the paper, the IS changes depending on the size of the split.
The Inception Score calculation has 3 mistakes.
1. It uses an outdated Inception network that in fact outputs a 1008-vector of class probabilities (see the following GitHub issue):
Fix: See the link for the new Inception model.
2. It calculates the KL divergence directly using logs, which leads to numerical instabilities (it can output nan instead of inf). Instead, scipy.stats.entropy should be used.
Fix: Replace the direct log computation with something along the lines of the following:
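A sketch of a scipy.stats.entropy-based version, assuming `part` holds the (num_samples, num_classes) softmax outputs of one split (variable names are illustrative):

```python
import numpy as np
from scipy.stats import entropy

def split_kl_scores(part):
    """Per-sample KL(p(y|x) || p(y)) for one split, computed via scipy.stats.entropy."""
    p_y = np.mean(part, axis=0)                      # marginal class distribution of the split
    # entropy(pk, qk) returns the KL divergence sum(pk * log(pk / qk))
    return [entropy(part[i], p_y) for i in range(part.shape[0])]
```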
3. It calculates the mean of the exponential over splits rather than the exponential of the mean. Here is the code in inception_score.py which does this:
This is clearly problematic, as can easily be seen in a very simple case: for a random variable x ~ Bernoulli(0.5), E[e^x] = 0.5(e^0 + e^1), which is not equal to e^(0.5·0 + 0.5·1) = e^E[x]. The same effect shows up with a uniform random variable, where the split-mean over-estimates the exponential (Jensen's inequality).
Fix: Do not take the mean of the per-split exponentials; instead calculate the exponential of the mean of the KL divergence over all 50,000 inputs.
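A sketch of the corrected no-split calculation, together with a quick numeric check of the Bernoulli example above (the function name is illustrative):

```python
import numpy as np

# Jensen's inequality check for x ~ Bernoulli(0.5):
print(0.5 * (np.exp(0) + np.exp(1)))   # E[e^x]  ~= 1.859
print(np.exp(0.5 * 0 + 0.5 * 1))       # e^E[x]  ~= 1.649, strictly smaller

def inception_score_all_inputs(preds):
    """Exponential of the mean KL over all inputs, with no splitting. preds: (N, num_classes)."""
    p_y = np.mean(preds, axis=0, keepdims=True)
    kl = np.sum(preds * (np.log(preds) - np.log(p_y)), axis=1)
    return np.exp(np.mean(kl))
```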