
Refactor SMC and properly compute marginal likelihood #3124

Merged: 8 commits, Aug 18, 2018

Conversation

@aloctavodia (Member) commented Jul 28, 2018

This PR introduces many changes to the SMC sampler. As a result the code is simpler (although many features are still missing compared to the current SMC), the marginal likelihood is computed properly, and I think this could also help with the implementation of SMC-ABC. The sampler should also be faster, although I did not run benchmarks.

Tests will certainly fail at this stage; I did not even try to check them. This code does not yet work with variables of shape larger than 1 :-)

I tested using a Beta-binomial model because it is simple and the marginal likelihood can also be computed analytically: for z successes in N Bernoulli trials under a Beta(a, b) prior, p(y) = B(a + z, b + N - z) / B(a, b).

In case someone wants to try:

import numpy as np
import pymc3 as pm
from scipy.special import betaln

data = np.repeat([1, 0], [50, 50])

def bern_beta(prior, data):
    # Analytic marginal likelihood: B(a + z, b + N - z) / B(a, b)
    a, b = prior
    z = np.sum(data)
    N = len(data)
    p_data = np.exp(betaln(a + z, b + N - z) - betaln(a, b))
    return p_data

marginals = []
traces = []

a_prior_0 = 2.
b_prior_0 = 5.
a_prior_1 = 10.
b_prior_1 = 10.
for alfa, beta in ((a_prior_0, b_prior_0), (a_prior_1, b_prior_1)):
    with pm.Model() as model:
        a = pm.Beta('a', alfa, beta)
        y = pm.Bernoulli('y', a, observed=data)
        trace = pm.sample(2000, step=pm.SMC(), random_seed=42)
        marginals.append(model.marginal_likelihood)
        traces.append(trace)
        trace = pm.sample(1000)  # NUTS run, for comparison
        traces.append(trace)

print(marginals[1] / marginals[0])
print(bern_beta([a_prior_1, b_prior_1], data) / bern_beta([a_prior_0, b_prior_0], data))

I get a Bayes factor of 3.35 (with SMC) and 3.40 (analytically). I tried different combinations of priors and the results look OK. I get larger variance when one of the priors is very different from the posterior and the other is very close (like (1, 10) and (50, 50)), but I think that is expected. Intuitively, if the prior and the posterior are very similar you will get very few stages (even just one), and hence the estimate should be noisier; and if the prior and the posterior are very different you have to take many intermediate steps, and if you do not take enough the estimate will also be noisy (but I am not sure about all this; I should check something I read once about the variance of the marginal likelihood estimated by SMC).
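For reference, here is a minimal sketch (not the PR's actual code; the function name and inputs are hypothetical) of the standard SMC evidence estimator being discussed: the marginal likelihood is the product over tempering stages of the mean incremental importance weight.

import numpy as np
from scipy.special import logsumexp

def smc_marginal_likelihood(stage_loglikes, betas):
    # stage_loglikes[s]: log-likelihood of each particle at the start of stage s
    # betas: tempering schedule with betas[0] == 0.0 and betas[-1] == 1.0
    log_ml = 0.0
    for loglike, b_old, b_new in zip(stage_loglikes, betas[:-1], betas[1:]):
        # log of the mean incremental weight exp((b_new - b_old) * loglike),
        # accumulated stably via log-sum-exp
        log_ml += logsumexp((b_new - b_old) * loglike) - np.log(len(loglike))
    return np.exp(log_ml)

Each factor is an importance-sampling estimate of the ratio of normalizing constants between consecutive stages, so fewer stages mean fewer but cruder factors, which matches the intuition above.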

This is the result of the previous model comparing the estimates from NUTS and SMC (I know it is a simple model, but it might not have worked :-) and the results are also close to the analytic posterior).

[figure: new_smc_vs_nuts]

The code is clunky and feedback is very much appreciated!


-def _initial_population(samples, chains, model, variables):
+# FIXME!!!!
+def _initial_population(samples, model, variables):
Member

You can use the sample_prior_predictive function here right?

Member Author


That's an option. Currently I need an array of samples, not a dictionary, but this is something I need to change anyway in order to work with multidimensional variables.
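For example, something along these lines (a hypothetical sketch, not this PR's code; the toy model and names are illustrative) would flatten the dictionary returned by sample_prior_predictive into a (samples, n_parameters) array:

import numpy as np
import pymc3 as pm

samples = 1000
with pm.Model() as model:
    a = pm.Normal('a', 0., 1.)
    b = pm.Normal('b', 0., 1., shape=3)
    prior = pm.sample_prior_predictive(samples=samples)

# prior maps variable names to arrays of shape (samples, *var_shape);
# flatten each free variable and stack the columns. Note that transformed
# variables (e.g. a Beta's log-odds) would need extra handling here.
init_population = np.column_stack(
    [prior[v.name].reshape(samples, -1) for v in model.free_RVs]
)  # shape (1000, 4)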

beta : float
    Tempering parameter of the current stage.
weights : numpy array
    Importance weights (floats).
Member


This needs updating (beta also appears twice above).

@twiecki (Member) commented Aug 10, 2018

[image]

My favorite kind of PR.

-# FIXME!!!!
-def _initial_population(samples, model, variables):
+def _initial_population(chains, model, variables):
Member


Any reason not to use sample_prior_predictive?

Member Author


Not really, I will change it.

@junpenglao junpenglao changed the title [WIP] refactor SMC and properly compute marginal likelihood Refactor SMC and properly compute marginal likelihood Aug 17, 2018
@junpenglao (Member)

Don't forget the release note.

@junpenglao (Member)

Q: I see the sampling alternating between fast stages and slow stages; why is that?

Sample initial stage: ...
Beta: 0.065125 Stage: 0
100%|██████████| 10000/10000 [00:02<00:00, 4300.95it/s]
Beta: 0.172089 Stage: 1
100%|██████████| 10000/10000 [00:15<00:00, 649.18it/s]
Beta: 0.340472 Stage: 2
100%|██████████| 10000/10000 [00:01<00:00, 5214.46it/s]
Beta: 0.540063 Stage: 3
100%|██████████| 10000/10000 [00:15<00:00, 663.20it/s]
Beta: 0.771252 Stage: 4
100%|██████████| 10000/10000 [00:01<00:00, 5239.14it/s]
Beta: 1.000000 Stage: 5
100%|██████████| 10000/10000 [00:19<00:00, 507.09it/s]

@junpenglao (Member)

Marginal likelihood estimation is way more accurate, but still different compared to the result from the bridge sampler (see the last few cells of the notebook).

@aloctavodia (Member Author)

The number of steps at each stage (and the proposal distribution) is tuned using the acceptance rate of the previous stage, thus not all stages take the same time. I guess the oscillations are the result of the tuning predicting an easier stage than the real one. I will keep testing, but it seems that the tuning effectively reduces the runtime without sacrificing the quality of the results.
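For context, here is a minimal sketch of one common tuning scheme of this kind (e.g. in the style of Muto and Beck, 2008); the exact rule and constants in this PR may differ. The proposal scale is adapted from the previous stage's acceptance rate, and the number of MCMC steps is chosen so that each particle accepts at least one move with high probability, which is why an easy (high-acceptance) stage is followed by a short, fast stage and a hard one by a long, slow stage.

import numpy as np

def tune_scaling(acc_rate):
    # Map the previous stage's acceptance rate to a proposal scale factor:
    # low acceptance shrinks the proposal, high acceptance widens it
    return (1.0 / 9 + 8.0 / 9 * acc_rate) ** 2

def tune_n_steps(acc_rate, p_target=0.99):
    # Pick enough MCMC steps per particle so that the probability of at
    # least one accepted move is roughly p_target
    acc_rate = np.clip(acc_rate, 1e-6, 1 - 1e-6)
    return 1 + int(np.log(1 - p_target) / np.log(1 - acc_rate))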

I have been comparing the marginal likelihood with the analytical result for the beta-binomial model; at this point it seems to me that any difference is just due to the SMC approximation (and the intrinsic difficulty of computing the marginal likelihood). Nevertheless, I am almost sure I read something about improving/stabilizing the computation of the marginal likelihood using SMC (I just need to remember where!). I would say SMC produces a marginal likelihood estimate within roughly plus/minus 20% of the true value; this is an impression from running a couple of examples. Maybe there is a way to report an estimate of the variance or a confidence interval (without running SMC several times!).

@aloctavodia (Member Author)

Ohhh, I will run your example too and see if there is something I can do to improve the code. I guess we can merge this at this point (I will add a warning message to the sampler) and then I/we can try to polish any rough edges before 3.6.

@junpenglao (Member)

Yep, that sounds like a good plan.
