Inference with Aesara #88
Currently, the way the Aesara ecosystem works is by first specifying the necessary random variables—at least when they don't already exist in Aesara or AePPL, like the PERT distribution. In Aesara, random variables are usually represented by sub-classes of the `RandomVariable` `Op`. For example:

from functools import reduce
import aesara
import aesara.tensor as at
from aesara.tensor.random.op import RandomVariable
class PertRV(RandomVariable):
"""A PERT-distributed random variable.
See https://en.wikipedia.org/wiki/PERT_distribution
"""
name = "pert"
ndim_supp = 0
ndims_params = [0, 0, 0]
dtype = "floatX"
_print_name = ("PERT", "\\operatorname{PERT}")
def __call__(self, a, b, c, size=None, **kwargs):
return super().__call__(a, b, c, size=size, **kwargs)
@classmethod
def rng_fn(cls, *args, **kwargs):
# TODO: Choose a sampling approach
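        # One possibility, following the transformed-beta view used later in this
        # post, is to draw beta samples and rescale them to [a, c], e.g. something
        # like `a + (c - a) * rng.beta(alpha, beta, size=size)`.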
raise NotImplementedError()
# Create an instance of the `Op`
pert = PertRV()

Next, a log-density needs to be associated with the new random variable. Here's a questionably implemented example:

import numpy as np

from aeppl.logprob import _logprob, betaln
@_logprob.register(PertRV)
def pert_logprob(op, values, *inputs, **kwargs):
(x,) = values
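    # The first three inputs of a `RandomVariable` node are the RNG, the size,
    # and the dtype; the distribution parameters follow, hence `inputs[3:]`.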
a, b, c = inputs[3:]
# a < b < c and a <= x <= c
dist_constraints = (
at.all(at.gt(b, a)),
at.all(at.gt(c, b)),
at.all(at.le(a, x)),
at.all(at.le(x, c)),
)
constraint_cond = reduce(at.bitwise_and, dist_constraints)
alpha = 1 + 4 * (b - a) / (c - a)
beta = 1 + 4 * (c - b) / (c - a)
res = (
(alpha - 1) * at.log(x - a)
+ (beta - 1) * at.log(c - x)
- betaln(alpha, beta)
- (alpha + beta - 1) * at.log(c - a)
)
    return at.switch(constraint_cond, res, -np.inf)

After that, we should have everything we need to define a statistical model using an Aesara graph. Here's one such model with an extremely arbitrary choice of priors:

import numpy as np
srng = at.random.RandomStream(904243)
# Priors for the parameters of the PERT model
A_rv = srng.uniform(0, 1.0, name="A")
B_rv = srng.uniform(0, 1.0, name="B")
C_rv = srng.uniform(0, 1.0, name="C")
# Create some "observations"
rng = np.random.default_rng(2302)
y_obs = at.as_tensor(rng.uniform(0, 2, size=(10,)))
# This will be our "observed" variable.
# FYI: `srng.gen` is used so that the random variables are seeded deterministically
# by `srng`.
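# Using A, A + B, and A + B + C as the PERT's a, b, and c parameters ensures
# a < b < c, which the constraints in `pert_logprob` require.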
Y_rv = srng.gen(pert, A_rv, A_rv + B_rv, A_rv + B_rv + C_rv, size=y_obs.shape, name="Y")

From here, one can construct all the necessary log-likelihoods with AePPL and use them with one's sampling package of choice, and one's target language of choice (e.g. Python, Numba, JAX). Here's an example of how one can produce the requisite log-likelihoods used by a sampler:

from aeppl import joint_logprob
from aeppl.transforms import TransformValuesRewrite, DEFAULT_TRANSFORM
# This will apply transforms to the `RandomVariable`s that help
# with HMC sampling.
values_to_transforms = {A_rv: DEFAULT_TRANSFORM, B_rv: DEFAULT_TRANSFORM, C_rv: DEFAULT_TRANSFORM}
transforms_op = TransformValuesRewrite(values_to_transforms)
loglik, value_vars = joint_logprob(A_rv, B_rv, C_rv, realized={Y_rv: y_obs}, extra_rewrites=transforms_op)
# You can see the unsimplified log-likelihood with the following:
# aesara.dprint(loglik)
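
The resulting log-density graph can be compiled into an ordinary callable that a sampler (or an optimizer) can evaluate. A minimal sketch, assuming `value_vars` holds the value variables for `A_rv`, `B_rv`, and `C_rv` in that order:

logp_fn = aesara.function(list(value_vars), loglik)

# Thanks to the transforms above, these inputs live on the unconstrained scale,
# so any real values are admissible.
logp_fn(-0.3, 0.1, 0.4)
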
HMC/NUTS sampling isn't the only estimation approach one can use. @rlouf provides a great example of MAP estimation in this Gist. Naturally, one should consider implementing HMC-assisting transforms for the new `PertRV`, too.

Luckily, the Aesara modeling ecosystem is designed to support numerous alternative approaches, especially ones that leverage and extend existing implementations. For instance, we might notice that the PERT distribution is a transformed beta distribution and use that to generate samples and utilize existing transforms for the beta distribution. To illustrate:

def pert(srng, a, b, c, **kwargs):
r"""Construct a PERT distributed graph."""
alpha = 1 + 4 * (b - a) / (c - a)
beta = 1 + 4 * (c - b) / (c - a)
X_rv = srng.beta(alpha, beta, **kwargs)
    z = a + (c - a) * X_rv
return z
# We can draw "predictive" samples from this version of a PERT-distributed
# random variable!
pert(srng, 0.0, 1.0, 2.0, size=(3,)).eval()
# => an array of three draws on the support [0, 2]
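
AePPL can often recover a log-density directly from a graph like this one, since it can account for the affine transform applied to the underlying beta variable. A rough sketch (the names `Z_rv` and `z_vv` are illustrative, and whether the default rewrites handle this case out of the box depends on the AePPL version):

Z_rv = pert(srng, 0.0, 1.0, 2.0, size=(3,))
Z_rv.name = "Z"

# `z_vv` is the value variable associated with `Z_rv` in the derived log-density.
z_logp, (z_vv,) = joint_logprob(Z_rv)
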
The transform mechanisms in AePPL can be used to derive a log-likelihood for shifted and scaled random variables like the one above. As time goes on, more and more built-in support for alternative representations of random variables will be added, and it will become less and less necessary to write custom `RandomVariable`s and log-densities like the ones above.

In general, AePPL was designed so that one can create their own user-facing Python PPL, or construct models in an extremely customizable, reusable, and automatable way. PyMC version 4 is basically a wrapper around Aesara and AePPL that streamlines the setup of a NUTS sampler with Stan-like defaults, but one can always specify these things on their own.

Overall, we want the Aesara ecosystem to better "democratize" the specification and construction of statistical models, and we've already laid all the groundwork. We're currently in the process of updating AeMCMC so that it constructs NUTS samplers with Stan-like defaults, as well. That, and a few standard user-facing helper functions (e.g. for posterior predictive sampling), will make the Aesara ecosystem more immediately accessible to casual users. We already have tools like this in our production work at Ampersand, and we only need to refactor a few of them for general use, so expect this situation to change rapidly.

Currently, Gibbs sampler steps and NUTS kernels can be constructed using AeMCMC, but, as I mentioned, the Stan-like defaults haven't been added yet (see #80, #86). Here's an outline of how it can be used right now:

import aemcmc
# Associate the observed variable Y_rv with its observed values
obs_rvs_to_values = {Y_rv: y_obs}
sample_vars = [A_rv, B_rv, C_rv]
sample_steps, updates, initial_values, nuts_parameters = aemcmc.construct_sampler(
obs_rvs_to_values, srng
)
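# `sample_steps` maps each random variable to the graph for its next posterior
# sample, `updates` holds the RNG/shared-variable updates for `aesara.function`,
# `initial_values` maps each variable to its initial-value input, and
# `nuts_parameters` exposes the NUTS inputs (the step size and mass matrix below).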
# Initial values for the A, B, and C model parameters
inputs = [initial_values[rv] for rv in sample_vars]
# NUTS step size and mass matrix values
inputs += list(nuts_parameters.values())
# Posterior A, B, and C values
outputs = [sample_steps[rv] for rv in sample_vars]
# For simplicity, we'll compile the NUTS kernel and sketch a Python sampling loop
posterior_sampler_fn = aesara.function(inputs, outputs, updates=updates)
step_size = ...
mass_matrix = np.array([...])
samples = []
prev_samples = (...)
for i in range(...):
new_samples = posterior_sampler_fn(*prev_samples, step_size, mass_matrix)
samples.append(new_samples)
    prev_samples = new_samples

Anyway, I hope some of this conveyed the way that the Aesara projects work together and our plans for simplifying/packaging their combined use. I, or another Aesara member or contributor, can follow this up with a complete NUTS example soon, if some things still aren't clear. You may need to fill in more details, though, because I'm not familiar enough with Bayesian PERT models to devise a useful MWE at the moment. In the meantime, if you want more information about what we're doing, or any of the things mentioned here, feel free to ask!
Dear all,
I would like to know if it is possible to do inference with Aesara. I know it was possible with PyMC3, but I can't find a way to do it with Aesara.
For example, I would like to fit a PERT distribution to a dataset; more precisely, to get the min, mode, max, and lambda parameters that best represent the input data.
Is this possible? If so, could you show me a piece of code?
Thank you,
Sincerely,