Interface for building logprob graphs in AePPL #193

rlouf · 2022-10-30T20:14:54Z

rlouf
Oct 30, 2022
Maintainer

TL;DR

To fit the mathematical terminology more closely, and because of the ambiguities caused by transforming the graph in the function that builds the logprob graph, I suggest the following interface:

import aeppl
import aesara.tensor as at

srng = at.random.RandomStream(0)
x_rv = srng.normal(0, 1.)
y_rv = srng.normal(x_rv, 1.)

x_vv, y_vv = x_rv.clone(), y_rv.clone()
logprob, value_variables = aeppl.joint_logprob({x_rv: x_vv, y_rv: y_vv})
logprobs, value_variables = aeppl.conditional_logprob({x_rv: x_vv, y_rv: y_vv})

That would, however, not resolve all rewrite-related ambiguities, and I will open a separate discussion on the rewrite interface.

Joint and conditional probabilities

One of AePPL’s core functionalities is to produce graphs that compute log-probabilities for Aesara graphs that contain RandomVariables. These graphs correspond to well-defined mathematical objects, and it is important that the semantics in AePPL follow the definitions of these objects closely.

If we let $\mathcal{G}$ be a probabilistic graphical model (PGM) over the variables $X_1, \dots, X_N$, we say that the joint distribution $P(X_1, \dots, X_N)$ factorizes according to $\mathcal{G}$ if $P$ can be expressed as the product:

$$ P(X_1, \dots, X_N) = \prod_{i} P(X_i|Pa_{X_i}^{\mathcal{G}}) $$

where $P(X_i|Pa_{X_i}^{\mathcal{G}})$ are conditional probability distributions, and $Pa_{X_i}^{\mathcal{G}}$ denotes the parents of $X_i$ in $\mathcal{G}$.
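Taking logarithms turns this product into a sum, which is the form a log-probability graph works with:

$$ \log P(X_1, \dots, X_N) = \sum_{i} \log P(X_i|Pa_{X_i}^{\mathcal{G}}) $$

Each term of the sum is one conditional log-probability, and the sum itself is the joint log-probability.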

Aesara can be used to build (directed) PGMs by building a graph that contains RandomVariables:

import aesara.tensor as at

x_rv = at.random.normal(0, name="x")
y_rv = at.random.normal(x_rv, name="y")

In applications we are generally interested in computing the two quantities defined above: the joint probability $P(X_1=x_1, \dots, X_N=x_N)$ and the conditional probabilities $P(X_i=x_i| .)$. To compute the graph of the joint log-probability it is natural to define, following AePPL’s current syntax:

x_vv = x_rv.clone()
y_vv = y_rv.clone()

logprob = aeppl.joint_logprob({x_rv: x_vv, y_rv: y_vv})

which returns a scalar. joint_factorized_logprob currently returns the conditional log-probabilities, and the following interface would follow the mathematical terminology more closely:

x_vv = x_rv.clone()
y_vv = y_rv.clone()

logprobs = aeppl.conditional_logprob({x_rv: x_vv, y_rv: y_vv})

which returns a dictionary that maps x_vv and y_vv to the graphs that compute their respective conditional log-probabilities. "Factorized logprob" refers to a product and, without reading the documentation, one would expect joint_factorized_logprob to return a graph that represents the product of the conditional probabilities (that is, the sum of the conditional log-probabilities).

The sum argument to joint_logprob should be removed, keeping only the behavior of sum=True.
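To make the distinction concrete, here is a minimal plain-Python sketch for the two-variable model above, x ~ Normal(0, 1) and y ~ Normal(x, 1). It does not use AePPL or Aesara, and the helper names `normal_logpdf`, `conditional_logprobs` and `joint_logprob_value` are hypothetical. It only illustrates that the joint log-probability is the sum of the conditional terms:

```python
import math


def normal_logpdf(value, loc, scale):
    """Log-density of Normal(loc, scale) evaluated at `value`."""
    z = (value - loc) / scale
    return -0.5 * z * z - math.log(scale) - 0.5 * math.log(2.0 * math.pi)


def conditional_logprobs(x_val, y_val):
    """One log-probability term per variable (hypothetical helper, not AePPL)."""
    return {
        "x": normal_logpdf(x_val, 0.0, 1.0),    # log P(X = x)
        "y": normal_logpdf(y_val, x_val, 1.0),  # log P(Y = y | X = x)
    }


def joint_logprob_value(x_val, y_val):
    """Joint log-probability: the sum of the conditional terms, a single scalar."""
    return sum(conditional_logprobs(x_val, y_val).values())
```

The dictionary plays the role of what conditional_logprob would return, and the scalar plays the role of what joint_logprob would return, except that AePPL builds symbolic graphs rather than numbers.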

Transformations

AePPL allows us to specify transformations for the RandomVariables in a graph; these are applied before constructing the logprob graph:

import aesara.tensor as at

srng = at.random.RandomStream(0)

mu_rv = srng.normal(1.)
sigma_rv = srng.halfcauchy(1.)
Y_rv = srng.normal(mu_rv, sigma_rv)

Many MCMC algorithms, notably HMC, perform better when parameters are distributed on the real line. However, the $\operatorname{C}^+$ distribution is defined on $\mathbb{R}^+$. To use HMC we would thus typically apply a log transformation to sigma_rv and condition on the result.
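The change of variables behind such a log transformation can be sketched in plain Python. This does not use AePPL, the function names are hypothetical, and the half-Cauchy is assumed to have location 0 and scale 1. If $z = \log \sigma$, the density in the transformed space picks up the log-Jacobian term $\log |d\sigma / dz| = z$:

```python
import math


def halfcauchy_logpdf(sigma):
    """Log-density of a half-Cauchy (location 0, scale 1), supported on sigma >= 0."""
    if sigma < 0:
        return -math.inf
    return math.log(2.0) - math.log(math.pi) - math.log1p(sigma * sigma)


def halfcauchy_logpdf_log_space(z):
    """Log-density of z = log(sigma); z ranges over the whole real line."""
    sigma = math.exp(z)
    log_jacobian = z  # log |d sigma / d z| = log(exp(z)) = z
    return halfcauchy_logpdf(sigma) + log_jacobian
```

A sampler can then propose any real value of z, and the constraint sigma >= 0 is satisfied automatically by the back-transformation.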

AePPL provides rewrite primitives that allow us to work in the transformed space without modifying our model. The goal is to be able to pass variables that take values in the transformed space to compute the model’s joint logprob, so outside callers need not be aware of the constraints on the support of the conditional distributions:

import aesara.tensor as at

import aeppl
from aeppl.transforms import TransformValuesRewrite, LogTransform

srng = at.random.RandomStream(0)

mu_rv = srng.normal(0, 1)
sigma_rv = srng.halfcauchy(1)
Y_rv = srng.normal(mu_rv, sigma_rv)

mu_vv = mu_rv.clone()
sigma_vv = sigma_rv.clone()
Y_vv = Y_rv.clone()

transforms_op = TransformValuesRewrite(
    {sigma_vv: LogTransform()}
)
logprob = aeppl.joint_logprob(
    {Y_rv: Y_vv, sigma_rv: sigma_vv, mu_rv: mu_vv},
    extra_rewrites=transforms_op
)

However, this interface is ambiguous: we defined sigma_vv = sigma_rv.clone() in the original space, while after the transformation sigma_vv is supposed to take values in the transformed space. We can hide this ambiguity by having conditional_logprob clone and return the measurable variables:

logprobs, value_variables = conditional_logprob(Y_rv, sigma_rv, mu_rv, extra_rewrites=transforms_op)
print(value_variables)
# {Y_rv: Y_vv, sigma_rv: sigma_vv, mu_rv: mu_vv}
print(logprobs.keys())
# [Y_rv, sigma_rv, mu_rv]

We can always pass observations directly to joint_logprob using keyword arguments:

logprob, value_variables = joint_logprob(sigma_rv, mu_rv, Y_rv=Y_val, extra_rewrites=transforms_op)

Although we could make it work easily, I don't particularly like an interface where random variables and extra parameters can both be passed as keyword arguments. It is not a big improvement over the confusion we were trying to solve. As explained in #195, I think that the ambiguity of the current interface can only be resolved if we refactor the transforms interface. This should be the way forward once the name changes mentioned above are made.
