Make forward sampling functions return InferenceData
#5073
Conversation
This is all done and ready for (final?) review 🥳

This is weird: all the tests for ...
Codecov Report

```diff
@@            Coverage Diff             @@
##             main    #5073      +/-   ##
==========================================
- Coverage   78.41%   75.22%    -3.19%
==========================================
  Files         130       87      -43
  Lines       24461    14134   -10327
==========================================
- Hits        19182    10633    -8549
+ Misses       5279     3501    -1778
```
I managed to fix all the tests, except those related to SMC.
SMC uses prior predictive sampling internally. It should be a simple matter of setting `return_inferencedata` to `False` there.
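For illustration, a minimal standalone sketch of that opt-out, assuming the v4 `pymc` import name and the `samples` keyword; the `return_inferencedata` flag is the one mentioned just above, and exact names may differ in the final API:

```python
import pymc as pm  # assumption: v4 import name

with pm.Model() as model:
    x = pm.Normal("x", 0.0, 1.0)

    # Opting out of the new InferenceData default keeps the old
    # dict-of-arrays return, which is what SMC's internal call expects.
    prior = pm.sample_prior_predictive(samples=1000, return_inferencedata=False)

assert isinstance(prior, dict)
assert prior["x"].shape == (1000,)
```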
You have a lot of unrelated changes due to your pre-commit. This seems to always happen with your setup :P (and you should revert them...)
Aaaaah, that's why!
I didn't see any unrelated changes -- this is just Black doing its thing through pre-commit. And I did change quite a lot of lines (mainly tests; we use forward sampling a lot 😅).
Look at ...
Good move to add this capability!
However, I'd prefer if we didn't add an unnecessary dimension to the prior.
With these changes, what's the recommended workflow if I want to combine the MCMC, prior and posterior traces into the same `InferenceData`? Is there an `idata.concat()` or `idata.swallow()` method or something like that?
```diff
-    size=100000,
-    alpha=0.05,
-    fails=20,
+    dist, paramdomains, valuedomain=Domain([0]), ref_rand=None, size=100000, alpha=0.05, fails=20
```
@AlexAndorra looks like you're rolling with an outdated pre-commit configuration. Maybe a `pre-commit uninstall` + `pre-commit install` is enough to get you updated?
I don't think so: my pymc venv is brand new and I pulled from `main`. I think the issue is that PyCharm is running a Black from another environment than my pymc env. Is there an easy way to revert those changes though (some of them are not in dedicated commits)?
I don't know of any automated way to revert them.
Next time, try to stage selected ranges/lines consciously; then it's easier to spot this before committing?
This is weird: now I'm sure that the Black run on this file is from pre-commit, and it still gives me diffs compared to `main`. Doesn't that mean that there was a problem with this even before my PR?
assert gen["phi"].shape == (draws,) | ||
assert gen["y"].shape == (draws, n) | ||
assert "thetas" in gen | ||
assert gen.prior["phi"].shape == (1, draws) |
What's that additional prior dimension? `"chain"` doesn't make much sense for prior predictive...
I think it's necessary for the next step: merging the prior pred object automagically with the trace and the post pred samples. `xarray` needs this dimension to match everything, as the two other objects have it (speaking under the control of @aloctavodia and @OriolAbril).
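As a standalone illustration of that (chain, draw) convention, here is a small sketch; it uses ArviZ directly rather than this PR's conversion code, so treat the exact promotion behavior as an assumption:

```python
import numpy as np
import arviz as az

draws = 500
# A prior draw of a scalar variable, as a flat array of samples.
phi = np.random.default_rng(0).normal(size=draws)

# ArviZ stores every group with leading (chain, draw) dimensions, so the
# flat array is promoted to shape (1, draws) -- the extra "chain" dim that
# lets the prior group align with the posterior group when they are merged.
idata = az.from_dict(prior={"phi": phi})
print(idata.prior["phi"].dims)   # ('chain', 'draw')
print(idata.prior["phi"].shape)  # (1, 500)
```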
@ricardoV94: Yes, I saw them, it's just Black. But I think I know what's happening: PyCharm is probably running a Black from another venv than my pymc env! It seems hard to revert only those changes though, doesn't it? Maybe easier in another PR?
@michaelosthege: I use that:

```python
with model:
    idata = pm.sample()
    idata.extend(pm.sample_prior_predictive())
    idata.extend(pm.sample_posterior_predictive(idata))
```

There may be something even better, but that already works like a (py)charm 🤩
It seems the bad changes are (all) in the ...

Ooh damn, I thought the problem was coming from ... in ...
Ok, I just did that and think I fixed all the tests @ricardoV94 and @michaelosthege. So, if tests pass, everything should be ready to merge 🤞
```diff
 )
 def test_missing(data):

     with Model() as model:
         x = Normal("x", 1, 1)
         with pytest.warns(ImputationWarning):
-            y = Normal("y", x, 1, observed=data)
+            _ = Normal("y", x, 1, observed=data)
```
Why this change?
Ha ha, just my own neurosis when I see a named but unused variable 😅
Looks good, let's see if the CI passes :D
The diff looks fine now.
I'm still not a fan of having an unnecessary `"chain"` dimension in prior predictive. `InferenceData` can handle `constant_data` without `"chain"`/`"draw"` dims just fine. But if that's a matter of what `InferenceData` supports, we should discuss that separately.
CI passes, CI passes 🍾 (the failing docs are expected). Agreed @michaelosthege. If this makes you feel any better, this is already the current behavior: a ...

😬
To add a bit of extra info to Alex's comment: the sampling and extends are "commutative". That is, this would also work:
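The snippet itself didn't survive the copy; based on the surrounding comments it was presumably something along these lines (an assumption, not the original code):

```python
with model:
    # Prior predictive first, then MCMC, then posterior predictive --
    # each call's InferenceData is folded into the same idata object.
    idata = pm.sample_prior_predictive()
    idata.extend(pm.sample())
    idata.extend(pm.sample_posterior_predictive(idata))
```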
This PR adds the possibility to return forward samples directly as `InferenceData` objects, and sets this behavior as the default. These objects then just need to be extended onto the original trace (which is already an `InferenceData` object by default in 4.0), which simplifies the workflow.

I implemented this for `sample_prior_predictive` and you can already review -- I would like to get your approval on the logic before going to `sample_posterior_predictive` and `sample_posterior_predictive_w`, as the implementation will be very similar.

Functions to update so they return an `InferenceData` object by default:

- `sample_prior_predictive`
- `sample_posterior_predictive`
- `sample_posterior_predictive_w`
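To make the new default concrete, here is a small usage sketch; the `samples` keyword, the `pm` import name, and the exact group names are assumptions based on this thread and the v4 API of the time:

```python
import pymc as pm  # assumption: v4 import name

with pm.Model() as model:
    mu = pm.Normal("mu", 0.0, 1.0)
    obs = pm.Normal("obs", mu, 1.0, observed=[0.2, -0.1, 0.4])

    # With this PR, forward sampling returns an InferenceData by default.
    prior_idata = pm.sample_prior_predictive(samples=500)

# Unobserved draws land in the "prior" group (observed ones in
# "prior_predictive"), laid out with the (chain, draw) dims discussed above.
print(prior_idata.groups())
print(prior_idata.prior["mu"].shape)  # (1, 500)
```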