Important Distribution-to-RandomVariable logic changes #4463
Can the Metropolis acceptance ratio be calculated between proposals with different shapes?

Yes, but we can't delve into that here.
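For context, here is a sketch of the standard symmetric-proposal Metropolis acceptance test. The point it illustrates is that the acceptance ratio only compares two scalar logp values, so nothing in the ratio itself depends on the proposals sharing a shape. The function name and signature are illustrative, not PyMC API.

```python
import math
import random

def metropolis_accept(logp_proposal, logp_current, u=None):
    """Symmetric-proposal Metropolis test: accept iff log(u) < logp diff."""
    u = random.random() if u is None else u
    return math.log(u) < (logp_proposal - logp_current)
```

Since `log(u) <= 0`, a proposal with a higher logp than the current point is always accepted, regardless of the shapes involved in computing either logp.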
I am checking the v4 branch and this looks like an issue. When generating logpt graphs for model variables, it seems that random variables end up in the graph:

```python
from aesara.printing import debugprint as dprint
import pymc3 as pm

with pm.Model() as m:
    x = pm.Uniform('x', lower=0, upper=1)
    y = pm.Uniform('y', lower=0, upper=x)

dprint(m.logpt)
```

```
Sum{acc_dtype=float64} [id A] '__logp'
 |MakeVector{dtype='float64'} [id B] ''
 |Sum{acc_dtype=float64} [id C] ''
 | |Sum{acc_dtype=float64} [id D] ''
 | |Elemwise{mul,no_inplace} [id E] '__logp_x'
 | |Elemwise{switch,no_inplace} [id F] ''
 | | |Elemwise{mul,no_inplace} [id G] ''
 | | | |Elemwise{mul,no_inplace} [id H] ''
 | | | | |TensorConstant{1} [id I]
 | | | | |Elemwise{mul,no_inplace} [id J] ''
 | | | | |TensorConstant{1} [id K]
 | | | | |Elemwise{ge,no_inplace} [id L] ''
 | | | | |x [id M]
 | | | | |TensorConstant{0.0} [id N]
 | | | |Elemwise{mul,no_inplace} [id O] ''
 | | | |TensorConstant{1} [id P]
 | | | |Elemwise{le,no_inplace} [id Q] ''
 | | | |x [id M]
 | | | |TensorConstant{1.0} [id R]
 | | |Elemwise{neg,no_inplace} [id S] ''
 | | | |Elemwise{log,no_inplace} [id T] ''
 | | | |Elemwise{sub,no_inplace} [id U] ''
 | | | |TensorConstant{1.0} [id R]
 | | | |TensorConstant{0.0} [id N]
 | | |TensorConstant{-inf} [id V]
 | |TensorConstant{1.0} [id W]
 |Sum{acc_dtype=float64} [id X] ''
 |Sum{acc_dtype=float64} [id Y] ''
 |Elemwise{mul,no_inplace} [id Z] '__logp_y'
 |Elemwise{switch,no_inplace} [id BA] ''
 | |Elemwise{mul,no_inplace} [id BB] ''
 | | |Elemwise{mul,no_inplace} [id BC] ''
 | | | |TensorConstant{1} [id BD]
 | | | |Elemwise{mul,no_inplace} [id BE] ''
 | | | |TensorConstant{1} [id BF]
 | | | |Elemwise{ge,no_inplace} [id BG] ''
 | | | |y [id BH]
 | | | |TensorConstant{0.0} [id BI]
 | | |Elemwise{mul,no_inplace} [id BJ] ''
 | | |TensorConstant{1} [id BK]
 | | |Elemwise{le,no_inplace} [id BL] ''
 | | |y [id BH]
 | | |uniform_rv.1 [id BM] 'x'
 | | |RandomStateSharedVariable(<RandomState(MT19937) at 0x7FF7A7954C40>) [id BN]
 | | |TensorConstant{[]} [id BO]
 | | |TensorConstant{11} [id BP]
 | | |TensorConstant{0.0} [id N]
 | | |TensorConstant{1.0} [id R]
 | |Elemwise{neg,no_inplace} [id BQ] ''
 | | |Elemwise{log,no_inplace} [id BR] ''
 | | |Elemwise{sub,no_inplace} [id BS] ''
 | | |uniform_rv.1 [id BM] 'x'
 | | |TensorConstant{0.0} [id BI]
 | |TensorConstant{-inf} [id BT]
 |TensorConstant{1.0} [id BU]
```

Note the `uniform_rv.1` node (with its `RandomStateSharedVariable`) standing in for `x` in `y`'s logp term. As a consequence, `m.logp` evaluations are stochastic:

```python
>>> m.logp({x: .9, y: .3})
array(0.26530097)
>>> m.logp({x: .9, y: .3})
array(0.59186006)
```
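The failure mode above can be illustrated without PyMC or Aesara. This is a minimal NumPy stand-in, not actual PyMC code: a leftover random node in the logp graph behaves like a closure over shared RNG state, so identical inputs yield different logp values on each call.

```python
import numpy as np

# Stands in for the RandomStateSharedVariable baked into the graph.
rng = np.random.RandomState(1234)

def logp_y(y):
    # The fresh draw here plays the role of the `uniform_rv.1` node that
    # should have been the *value* of `x` instead.
    x_draw = rng.uniform(0.0, 1.0)
    # logp of y ~ Uniform(0, x): -log(x) on the support, -inf outside it.
    return -np.log(x_draw) if 0.0 <= y <= x_draw else -np.inf

first = logp_y(0.3)
second = logp_y(0.3)  # same input, (almost surely) a different result
```

Fixing the graph amounts to replacing that random node with the value variable for `x`, after which repeated evaluations at the same point agree.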
The random variables definitely shouldn't be in there, but I think this case is covered by some of my unfinished changes.
I pushed a change recently that should've fixed that issue in the log-likelihoods.
Another issue: the model logp is being constructed in terms of the untransformed variables.

master branch:

```python
with pm.Model() as m:
    x = pm.Uniform('x', 0, 1)

m.logp({'x': -1})  # TypeError: Missing required input: x_interval__ ~ TransformedDistribution
m.logp({'x_interval__': -1})
# array(-1.62652338)
```

V4 branch:

```python
with pm.Model() as m:
    x = pm.Uniform('x', 0, 1)

m.logp({'x': -1})
# array(-inf)
m.logp({'x_interval__': -1})  # TypeError: Missing required input: x
```

Which means NUTS is pretty much hopeless at the moment:

```python
with m:
    trace = pm.sample(compute_convergence_checks=False)
```
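For reference, here is a sketch of where master's value comes from, assuming PyMC3's default interval (logit) transform for `Uniform(0, 1)`: on the transformed space, `logp(z) = logp_uniform(sigmoid(z)) + log|dx/dz|`, where the Jacobian of the inverse transform is `dx/dz = x * (1 - x)`.

```python
import math

z = -1.0                              # the value passed as x_interval__
x = 1.0 / (1.0 + math.exp(-z))        # inverse interval transform (sigmoid)
log_jac = math.log(x * (1.0 - x))     # log-Jacobian of the inverse transform
logp = math.log(1.0) + log_jac        # Uniform(0, 1) density is 1 on [0, 1]
# logp ≈ -1.62652338, matching the master-branch output above
```

In other words, master evaluates the logp on the transformed (unbounded) space with the Jacobian correction, which is exactly what samplers like NUTS need; the V4 behavior shown above does neither.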
By the way, it would be really nice to keep generating the logp (under another method name) in terms of untransformed variables. That way the graphs could be used directly for other things that do not require / benefit from transformed variables (e.g., grid approximation).
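As an illustration of that use case, here is a hypothetical grid approximation carried out directly on the untransformed [0, 1] space. The log-posterior used here (a Binomial likelihood with 3 successes in 4 trials under a flat prior) is an assumed stand-in, not anything from the thread.

```python
import numpy as np

# Grid over the untransformed parameter space; no interval transform needed.
grid = np.linspace(0.001, 0.999, 999)

# Unnormalized log-posterior: 3 * log(theta) + log(1 - theta).
log_post = 3 * np.log(grid) + np.log(1 - grid)

# Exponentiate stably and normalize numerically over the grid.
post = np.exp(log_post - log_post.max())
post /= post.sum() * (grid[1] - grid[0])

mode = grid[np.argmax(post)]  # analytic posterior mode is 3/4
```

A logp graph defined on the transformed space would force the extra step of mapping the grid through the inverse transform and undoing the Jacobian term, which is why an untransformed variant of the method is convenient here.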
Ah, yes, I was still in the process of determining how to handle transformations. I started by setting things up so that everything would work exactly as it currently does (i.e., always use the transformed variables when producing log-likelihoods), but, since it's just as easy to apply transforms to an existing log-likelihood graph, and doing so provides much more opportunity and flexibility, I stopped short of that. I'll add the transform stuff shortly.
I checked the bullet points that I know were already addressed, are stale, or have specific issues referring to them. We should revisit the open bullet points and check whether they still need to be worked on.
Closing this, as all points have been addressed, either because they are stale, solved, or because we have more specific reminder issues. |
PyMC's `Distribution` classes are being converted to `RandomVariable`s in the `v4` branch. A lot of core changes have already been made, but a handful of important logic changes are still required in order to reinstate multiple PyMC features. This issue lists some points in the codebase where these changes are needed.

First, for anyone who's interested in helping out with the `v4` branch, searching for the string `XXX:` will generally reveal parts of the logic that have been disabled and need to be refactored.

Here is a potentially outdated list of those parts, accompanied by short summaries of the problem(s) and the work involved:

- `pymc3.model.Model.register_rv`
  - I don't know what the `dict`-as-observed-data thing is about, so I couldn't finish refactoring this.
- `pymc3.distributions.transforms.TransformedDistribution`
  - The `forward_val` methods use `draw_values`, which has been removed. I think these methods can be removed entirely, because they only appear to be used by `pymc3.sampling.sample_prior_predictive`, and I don't think that logic is relevant any longer.
- `pymc3.gp.gp.[Marginal, MarginalKron]`
  - These use `draw_values`, and I think they only need to be replaced by the creation and use of `theano.function`. For example, `draw_values([mu, cov], point=point)` roughly translates to something like `theano.function([model[n] for n in point.keys()], [mu, cov])(*point.values())`, but we wouldn't want to compile that function every time a sample is needed, so we would need to do `self.mu_cov_fn = theano.function([model[n] for n in point.keys()], [mu, cov])` somewhere (e.g. the constructor) and reuse `self.mu_cov_fn` in those methods.
- `pymc3.parallel_sampling._Process`
- `pymc3.variational.opvi.Group`
  - Uses `DictToArrayBijection` in a peculiar way and requires explicit shape information to do it. Just like all the other changes of this sort, we need to move the creation of the bijection(s) to the places where actual concrete samples are generated (and said bijection(s) are actually used/needed). I don't know enough about this code to do that, so someone who has worked on this should take a look.
- `pymc3.sampling.sample_posterior_predictive_w`
  - Related to the `sample_[prior|posterior]_predictive` functions; see `sample_posterior_predictive_w` for V4 #4807.
- `pymc3.step_methods.elliptical_slice.EllipticalSlice.astep`
  - Uses `draw_values` in a way that needs to be replaced with a `theano.function`.
- `pymc3.step_methods.metropolis`
  - Uses the `dsize` field on the random variables to create a NumPy array that, in turn, is used to initialize the proposal distributions. Instead, we should use the initial sample values for each variable to do this, and perhaps update the proposal distributions when/if new samples are drawn that have new shapes.
  - There's also a use of `draw_values` that looks like it might be straightforward to fix.
- `pymc3.step_methods.hmc.base_hmc.BaseHMC.__init__`
- `pymc3.step_methods.gibbs.ElemwiseCategorical`
  - Uses the `dshape` property, which can always be replaced by using the shapes of the initial sample point (but shouldn't, if avoidable).
- `pymc3.step_methods.sgmcmc.BaseStochasticGradient`
  - Uses the `dshape` and `dsize` properties, which can always be replaced by using the shapes of the initial sample point (but shouldn't, if avoidable).
- `pymc3.data`
- `pymc3.model_graph.ModelGraph`
  - Uses `dshape`; however, this one might not need to be changed, since the whole class could be replaced by the sample-space graphs provided by a `Model` object. The only unique feature I can immediately see is the Graphviz plate notation. Theano graphs, like the sample-space graphs, can already be converted to Graphviz Digraphs using functionality that's been in Theano for a while, but the plate notation may be a new thing that requires changes/additions.

Originally posted by @brandonwillard in #4440 (comment)