Convert random variables to value variables so `pm.sample(var_names)` works correctly (#7284)
Conversation
@ricardoV94 in order for this to match what is done in the following snippet:

```python
# Get value variables for the trace
if var_names is not None:
    value_vars = []
    transformed_rvs = []
    for rv in model.unobserved_RVs:
        if rv.name in var_names:
            value_var = model.rvs_to_values[rv]
            transform = model.rvs_to_transforms[rv]
            if transform is not None:
                transformed_rvs.append(rv)
            value_vars.append(value_var)
    transformed_value_vars = model.replace_rvs_by_values(transformed_rvs)
    trace_vars = value_vars + transformed_value_vars
    assert len(trace_vars) == len(var_names), "Not all var_names were found in the model"
```

However, an assertion error is raised when there are transformed variables, because each of them adds two elements to `trace_vars`. Do you think the modification already pushed in the PR is the correct one, or do we need to somehow explicitly account for the transformations?
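To make the double-counting concrete, here is a minimal pure-Python sketch of that selection logic. The dictionaries are hypothetical stand-ins for the model's `rvs_to_values` / `rvs_to_transforms` mappings (keyed by name for simplicity, not real PyMC objects):

```python
# Hypothetical stand-ins for the model mappings: "kappa" is
# log-transformed by default, "b_temp" is not.
rvs_to_values = {"b_temp": "b_temp", "kappa": "kappa_log__"}
rvs_to_transforms = {"b_temp": None, "kappa": "log"}

def trace_vars(var_names, *, append_unconditionally):
    value_vars, transformed = [], []
    for rv, value_var in rvs_to_values.items():
        if rv in var_names:
            if rvs_to_transforms[rv] is not None:
                transformed.append(value_var)
                if not append_unconditionally:
                    continue  # fixed logic: a transformed RV is counted once
            value_vars.append(value_var)
    return value_vars + transformed

# Buggy variant: each transformed RV contributes two entries, so
# len(trace_vars) == len(var_names) fails.
print(len(trace_vars(["b_temp", "kappa"], append_unconditionally=True)))   # 3
# Fixed variant: one entry per requested name.
print(len(trace_vars(["b_temp", "kappa"], append_unconditionally=False)))  # 2
```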
It seems the current state is doing the right thing. See the following example:

```python
import arviz as az
import numpy as np
import pymc as pm

batch = np.array(
    [
        1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5,
        6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 10, 10, 10
    ]
)
temp = np.array(
    [
        205, 275, 345, 407, 218, 273, 347, 212, 272, 340, 235, 300, 365,
        410, 307, 367, 395, 267, 360, 402, 235, 275, 358, 416, 285, 365,
        444, 351, 424, 365, 379, 428
    ]
)
y = np.array(
    [
        0.122, 0.223, 0.347, 0.457, 0.08 , 0.131, 0.266, 0.074, 0.182,
        0.304, 0.069, 0.152, 0.26 , 0.336, 0.144, 0.268, 0.349, 0.1 ,
        0.248, 0.317, 0.028, 0.064, 0.161, 0.278, 0.05 , 0.176, 0.321,
        0.14 , 0.232, 0.085, 0.147, 0.18
    ]
)

batch_values, batch_idx = np.unique(batch, return_inverse=True)
coords = {"batch": batch_values}

with pm.Model(coords=coords) as model:
    b_batch = pm.Normal("b_batch", dims="batch")
    b_temp = pm.Normal("b_temp")
    mu = pm.Deterministic("mu", pm.math.invlogit(b_batch[batch_idx] + b_temp * temp))
    kappa = pm.Gamma("kappa", alpha=2, beta=2)
    alpha = mu * kappa
    beta = (1 - mu) * kappa
    pm.Beta("y", alpha=alpha, beta=beta, observed=y)

with model:
    idata_1 = pm.sample(random_seed=1234)
    idata_2 = pm.sample(var_names=["b_batch", "b_temp", "kappa"], random_seed=1234)

az.plot_forest([idata_1, idata_2], var_names=["b_batch"])
az.plot_forest([idata_1, idata_2], var_names=["b_temp"])
az.plot_forest([idata_1, idata_2], var_names=["kappa"])
```
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```diff
@@            Coverage Diff            @@
##             main    #7284     +/-  ##
=========================================
+ Coverage   91.67%   92.33%   +0.65%
=========================================
  Files         102      102
  Lines       17017    17018       +1
=========================================
+ Hits        15600    15713     +113
+ Misses       1417     1305     -112
```
Good catch!
This one warranted a regression test.
Description

This PR converts the random variables to value variables when `var_names` is not `None` in `pm.sample()`. Before this PR, using `var_names` resulted in sampling from the prior. The problem is better explained in the linked issue.

Related Issue

Closes #7258
Type of change
📚 Documentation preview 📚: https://pymc--7284.org.readthedocs.build/en/7284/