Invalid logp expression if args aren't explicit in DensityDist #5155

Closed
@aseyboldt

Description of your problem

On the latest PyMC version (777622a), I get incorrect logp graphs that contain sampled random variables when the logp of a DensityDist closes over a model variable:

import numpy as np
import pymc as pm

with pm.Model() as model:
    a = pm.Normal("a")
    pm.DensityDist("b", logp=lambda x: (x - a) ** 2, observed=np.array(3.))

The model's logp function is now non-deterministic: the logp of b uses a sampled version of a instead of the value from a's value variable. You can see this in the aesara graph (the normal_rv op should not appear in it):

import aesara

aesara.dprint(model.logpt)
Elemwise{add,no_inplace} [id A] '__logp'   
 |Elemwise{add,no_inplace} [id B] ''   
 | |Sum{acc_dtype=float64} [id C] ''   
 | | |TensorConstant{[]} [id D]
 | |Sum{acc_dtype=float64} [id E] ''   
 |   |Elemwise{mul,no_inplace} [id F] ''   
 |     |Assert{msg='sigma > 0'} [id G] 'a_logprob'   
 |     | |Elemwise{sub,no_inplace} [id H] ''   
 |     | | |Elemwise{sub,no_inplace} [id I] ''   
 |     | | | |Elemwise{mul,no_inplace} [id J] ''   
 |     | | | | |TensorConstant{-0.5} [id K]
 |     | | | | |Elemwise{pow,no_inplace} [id L] ''   
 |     | | | |   |Elemwise{true_div,no_inplace} [id M] ''   
 |     | | | |   | |Elemwise{sub,no_inplace} [id N] ''   
 |     | | | |   | | |a [id O]
 |     | | | |   | | |TensorConstant{0} [id P]
 |     | | | |   | |TensorConstant{1.0} [id Q]
 |     | | | |   |TensorConstant{2} [id R]
 |     | | | |Elemwise{log,no_inplace} [id S] ''   
 |     | | |   |TensorConstant{2.5066282746310002} [id T]
 |     | | |Elemwise{log,no_inplace} [id U] ''   
 |     | |   |TensorConstant{1.0} [id Q]
 |     | |All [id V] ''   
 |     |   |Elemwise{gt,no_inplace} [id W] ''   
 |     |     |TensorConstant{1.0} [id Q]
 |     |     |TensorConstant{0.0} [id X]
 |     |TensorConstant{1.0} [id Y]
 |Sum{acc_dtype=float64} [id Z] ''   
   |Elemwise{mul,no_inplace} [id BA] ''   
     |Elemwise{pow,no_inplace} [id BB] 'b_logprob'   
     | |Elemwise{sub,no_inplace} [id BC] ''   
     | | |TensorConstant{3.0} [id BD]
     | | |normal_rv{0, (0, 0), floatX, False}.1 [id BE] 'a'   
     | |   |RandomStateSharedVariable(<RandomState(MT19937) at 0x7F660912BC40>) [id BF]
     | |   |TensorConstant{[]} [id BG]
     | |   |TensorConstant{11} [id BH]
     | |   |TensorConstant{0} [id BI]
     | |   |TensorConstant{1.0} [id BJ]
     | |TensorConstant{2} [id BK]
     |TensorConstant{1.0} [id BL]

This can be worked around in user code by passing the parameter to the DensityDist explicitly, so that it knows about it:

with pm.Model() as model2:
    a = pm.Normal("a")
    pm.DensityDist("b", a, logp=lambda x, a_val: (x - a_val) ** 2, observed=np.array(3.))

Tracking down a bug like this in an actual model took @ferrine and me a couple of hours. It would help a lot if we could either change pm.logp so that this just works, or raise an error when rv ops remain in a logp graph.
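
To make the second option concrete, here is a rough sketch of such a check on a toy graph. It is not Aesara or PyMC code: `Node`, `ancestors`, and `assert_no_random_nodes` are hypothetical names made up for this illustration, and a real check would have to walk the actual Aesara graph and look for random variable ops there. The sketch only shows the idea of walking every input of the logp output and raising if a sampling node is still reachable.

```python
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass(frozen=True)
class Node:
    """Toy graph node (hypothetical, not an Aesara class)."""
    op: str
    inputs: Tuple["Node", ...] = ()
    is_random: bool = False  # True for nodes that draw a sample


def ancestors(root: Node) -> Iterator[Node]:
    """Yield root and every node reachable through its inputs, each once."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        yield node
        stack.extend(node.inputs)


def assert_no_random_nodes(logp: Node) -> None:
    """Raise ValueError if the logp graph still depends on a sampling node."""
    leftover = [n.op for n in ancestors(logp) if n.is_random]
    if leftover:
        raise ValueError(f"logp graph contains random nodes: {leftover}")


# Toy stand-ins for the two models in this issue.
a_rv = Node("normal_rv", is_random=True)  # the sampled variable `a`
a_value = Node("a_value")                 # the value variable for `a`
observed = Node("constant_3.0")

# Closure case: the logp expression was built from the random node itself.
bad_logp = Node("pow", (Node("sub", (observed, a_rv)),))
# Explicit-argument case: the logp expression uses the value variable.
good_logp = Node("pow", (Node("sub", (observed, a_value)),))
```

In this toy setup, `assert_no_random_nodes(good_logp)` passes silently, while `assert_no_random_nodes(bad_logp)` raises a `ValueError` naming `normal_rv`, which is the kind of early failure that would have saved us the debugging time.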

(cc @ricardoV94)
