
Issue with prior_predictive_sample and MvNormal #3758

Closed · dfm opened this issue Jan 4, 2020 · 0 comments · Fixed by #4305

dfm (Contributor) commented Jan 4, 2020
This definitely isn't a high priority issue, but I'd love to understand what's going on if anyone has ideas!

This might be an issue with my understanding of sample_prior_predictive, but the fact that Normal and MvNormal behave inconsistently suggests it's actually a bug. Basically, MvNormal doesn't appear to condition properly on the sampled variables it depends on (probably something to do with draw_values, but I don't understand what happens under the hood well enough to say exactly what).

In the following example:

import numpy as np
import pymc3 as pm

np.random.seed(42)
ndim = 50
with pm.Model() as model:
    a = pm.Normal("a", sd=100, shape=ndim)
    # b, c, and d should all be centered on the sampled values of a
    b = pm.Normal("b", mu=a, sd=1, shape=ndim)
    c = pm.MvNormal("c", mu=a, chol=np.linalg.cholesky(np.eye(ndim)), shape=ndim)
    d = pm.MvNormal("d", mu=a, cov=np.eye(ndim), shape=ndim)
    samples = pm.sample_prior_predictive(1000)

print(np.std(samples["a"]), np.std(samples["b"]), np.std(samples["c"]), np.std(samples["d"]))
print(np.std(samples["b"] - samples["a"]), np.std(samples["c"] - samples["a"]))

I get the following output:

100.01598664026606 100.01292464866555 99.96568032648382 nan
1.0016395711339057 141.20382229079829

In the first line, I'm surprised that the samples of d are all nan, since nothing seems wrong with the syntax, but the other results all look right. The real issue is the second line: I would expect both numbers to be of order 1 (the conditional sd), but the second one is about 141 ≈ sqrt(2) * 100, which is what you'd get from the difference of two independent draws with sd 100. In other words, the mean of the MvNormal is not being conditioned on the actual samples of a that were generated. The PGM looks fine:

[Figure: PGM of the model, with b, c, and d each conditioned on a]

So I expect that the issue is in the sampling, not the model specification.
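For reference, here is a plain-NumPy sketch of the two behaviors described above (mean conditioned on the reported draws of a versus an independent redraw of a); it reproduces both standard deviations. The variable names are illustrative only, not PyMC3 internals:

```python
import numpy as np

rng = np.random.default_rng(42)
ndim, nsamples = 50, 1000

# Prior draws of a with sd 100
a = rng.normal(0.0, 100.0, size=(nsamples, ndim))

# Expected behavior: c is drawn around the *same* samples of a
c_conditioned = a + rng.normal(0.0, 1.0, size=(nsamples, ndim))

# Observed (buggy) behavior: the mean of c comes from an
# independent redraw of a, not the reported samples
a_redrawn = rng.normal(0.0, 100.0, size=(nsamples, ndim))
c_redrawn = a_redrawn + rng.normal(0.0, 1.0, size=(nsamples, ndim))

print(np.std(c_conditioned - a))  # ~1
print(np.std(c_redrawn - a))      # ~sqrt(2) * 100, i.e. ~141
```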

Let me know if you have any ideas about what's going on here!

Versions and main components

  • PyMC3 Version: GitHub master (3.8)
  • Theano Version: 1.0.4
  • Python Version: 3.7.5
  • Operating system: Mac
  • How did you install PyMC3: pip
@Sayam753 mentioned this issue Nov 2, 2020
twiecki added a commit that referenced this issue Dec 11, 2020
* Made sample_shape same across all contexts, thereby resolves #3758

* Pass the failing test

* Worked on suggestions

* Used to_tuple for size

* Given a mention in release notes

* Update RELEASE-NOTES.md

Co-authored-by: Thomas Wiecki <thomas.wiecki@gmail.com>