scaling the evidence for low probability cases (originally titled "switching to log evidence") #78
Comments
We will want to make a function called "logOfUnloggedMean" which will do something like this: given an array called "a", it will take log(a) and return an approximate value for the log of the mean of a. That is, it will give us log(mean(a)) without using an exp in between. Note that what we need is different from mean(log(a)).
http://knuthlab.rit.albany.edu/talks/knuth-me15-tutorial---final.pdf
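As a starting point, here is a minimal sketch of what such a logOfUnloggedMean could look like (the implementation details are my own suggestion, not from the repo), using the usual max-shift so that none of the exponentials underflow:

```python
import numpy as np

def logOfUnloggedMean(log_a):
    """Given log(a) for an array a, return log(mean(a)) without ever forming a directly."""
    log_a = np.asarray(log_a, dtype=float)
    shift = np.max(log_a)  # factor out the largest log value
    # mean(a) = exp(shift) * mean(exp(log_a - shift)), so:
    return shift + np.log(np.mean(np.exp(log_a - shift)))
```

For example, logOfUnloggedMean(np.array([-1000.0, -1001.0])) gives roughly -1000.38, whereas np.log(np.mean(np.exp(np.array([-1000.0, -1001.0])))) underflows to -inf.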
This turns out not to be a big problem. We are only taking the mean of the vector, not doing any multiplication, so as long as there is any sufficiently large probability (even one) the mean will turn out okay. As for the cases where more than one model has only zero values, this post will describe a suggested solution. I think I have a solution. The evidence is only needed for model selection and design of experiments, correct? That means we could wait until all models and conditions are sampled before calculating it, as long as we retain the vectors for each "model". Then we can add a scaling factor inside the exponent to make sure that the mean won't end up as zero. The current line is:
self.evidence = np.mean(np.exp(self.post_burn_in_log_posteriors_un_normed_vec))/np.linalg.norm(self.post_burn_in_samples)
Then we could do this for each model/condition to get an arbitrary order-of-magnitude factor needed to push the values higher, and then go back and scale:
self.evidenceScaled = np.mean(np.exp(self.post_burn_in_log_posteriors_un_normed_vec + smallestOrderOfMagnitudeForLog)) / np.linalg.norm(self.post_burn_in_samples)
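A rough sketch of that scaling idea (the function and variable names here are hypothetical, and the shared shift plays the role of smallestOrderOfMagnitudeForLog): pick one shift from the largest un-normed log posterior seen across all models, add it inside every exponent, and compare the scaled evidences. Because every model gets the same factor, ratios between evidences are unchanged.

```python
import numpy as np

def scaled_evidences(log_posterior_vecs_by_model, samples_by_model):
    """Sketch: one shared shift for all models/conditions so that no mean underflows to zero."""
    # Shared shift, chosen from the largest un-normed log posterior across all models.
    shift = -max(np.max(vec) for vec in log_posterior_vecs_by_model)
    evidences_scaled = []
    for log_vec, samples in zip(log_posterior_vecs_by_model, samples_by_model):
        # Same structure as the existing evidence line, with the shift added inside the exponent.
        evidences_scaled.append(np.mean(np.exp(log_vec + shift)) / np.linalg.norm(samples))
    return evidences_scaled  # every entry is scaled by exp(shift), so ratios between models are preserved
```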
However, in order to prepare for this eventual need, I think it will be better if we move this calculation outside of the mcmc function. We should just keep those sampling vectors for each model's mcmc sampling, and then calculate the evidence external to the mcmc sampling function. This is also advantageous because, in the future, we'll have some "continue sampling" type features and parallel processing features that will make it desirable to have such computations outside of the mcmc function. Note: in Oct 2020, some of the calculating was moved outside of the mcmc function and parallel processing was implemented. So part of what was needed has been done.
We presently have a line in the code that looks like this (introduced in a fix as noted in #67):
self.evidence = np.mean(np.exp(self.post_burn_in_log_posteriors_un_normed_vec))/np.linalg.norm(self.post_burn_in_samples)  # another variety: np.sqrt(2*np.pi*np.std(self.post_burn_in_samples)**2)
It used to look like this:
evidence = np.mean(posteriors_un_normed_vec)*np.sqrt(2*np.pi*np.std(samples)**2)
We want to shift to using log evidence so that we don't need to use the exp in the numerator.
My original conception of the problem was like this (slightly incorrect, will be modified below):
We want numerator/denominator
we have log(numerator)
We’re doing e^(log(numerator))/denominator = numerator/denominator
We know that…
log(a/b) = log(a)-log(b)
so can’t we do this:
e^(log(numerator) - log(denominator)); this preserves the nonzeros for longer.
or better yet, we could just leave it like this:
log(numerator) - log(denominator) and change the variable name to log_evidence
Can’t we just keep log evidence rather than converting to evidence?
My idea of just converting to logs in a simple way is incorrect because of the mean.
We actually had:
mean(a)/b
then we converted to:
mean(e^log(a))/b
The mean makes things not as simple as log(a)-log(b).
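For reference, the standard way around this (the "log-sum-exp" shift described in the links below) is to factor out the largest log value before exponentiating. With M = max(log(a)) and the same notation as above:
log(mean(a)) = M + log(mean(e^(log(a) - M)))
Every exponent is then at most zero and at least one term is exactly 1, so the mean cannot underflow to zero even when the raw probabilities would.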
Others have had similar challenges:
https://math.stackexchange.com/questions/834913/how-to-calculate-arithmetic-mean-of-log-values
This is the closest I have found to a solution that would work for us (though it's not in python).
https://stackoverflow.com/questions/15160847/logarithmic-mean-of-vector
I'm sure there is some way to convert things into an estimate that will allow us to use the log(a)-log(b) type way of doing things. Perhaps somebody else can figure it out when they are working on a model discrimination case that requires this fix.
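For whoever picks this up: here is a hedged sketch (untested against the repo) of how the current evidence line could stay entirely in log space using scipy.special.logsumexp, which applies the same max-shift internally. The function name and standalone arguments are hypothetical; inside the class it would act on self.post_burn_in_log_posteriors_un_normed_vec and self.post_burn_in_samples.

```python
import numpy as np
from scipy.special import logsumexp

def log_evidence_from_log_posteriors(log_posteriors_un_normed_vec, post_burn_in_samples):
    # log(mean(exp(v))) computed without forming exp(v) naively:
    log_mean_posterior = logsumexp(log_posteriors_un_normed_vec) - np.log(len(log_posteriors_un_normed_vec))
    # log of the current denominator, np.linalg.norm(post_burn_in_samples):
    log_denominator = np.log(np.linalg.norm(post_burn_in_samples))
    return log_mean_posterior - log_denominator  # keep this as log_evidence; exponentiate only if needed
```

Model selection could then be done directly on differences of log_evidence values, so the exp never has to be taken at all.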