Invert deterministic operations from measured variables #119
Replies: 4 comments 5 replies
-
The expectations for this library are that both examples should be supported, so there's no need to question the relevance of the idea.

Side note: regarding these specific examples, the easiest way to support them is this non-generalized-transform-based approach, which maintains a simplicity that generic transforms can't. More specifically, it's much easier to build further rewrites that operate on or expect log-normals, because it's much easier to identify a log-normal when it's just a log-normal node in the graph.

Ultimately, we want to support such rewrites at least as much as the more generic transform approach. It should be possible for people to work with simple graphs that map, as directly as is reasonably possible, to a high-level representation of a model.
-
Just to be clear, the problem is not with performing the replacement itself. The difficulty is that this involves acting on the value variables directly (at least, that's what we are doing so far for TransformedRVs), and we may not know which value variables are ultimately associated with intermediate IR variables.
-
I am not completely sure about the design of `construct_sampler`:

```python
import aesara.tensor as at
from aemcmc.basic import construct_sampler

srng = at.random.RandomStream(0)
Y_rv = my_model(srng)
y_vv = Y_rv.clone()

sampling_steps, updates, initial_values = construct_sampler(srng, {Y_rv: y_vv})
print(sampling_steps.model)
# Prints the model graph, for instance
```

The NUTS sampler currently operates on the original graph, and AePPL handles the transforms at the value-variable level, so the transformation is not immediately obvious to callers unless they inspect the logprob graph.

Let's give an example of `joint_logprob` with transforms on a simple model:

```python
import aesara.tensor as at
import aeppl
from aeppl.transforms import LogTransform, TransformValuesRewrites

srng = at.random.RandomStream(0)
mu_at = at.scalar('mu')
sigma_rv = at.random.lognormal(1.)
Y_rv = at.random.normal(mu_at, sigma_rv)

sigma_vv = sigma_rv.clone()
y_vv = Y_rv.clone()

rewrites = TransformValuesRewrites({sigma_vv: LogTransform()})
logprob = aeppl.joint_logprob(
    {Y_rv: y_vv, sigma_rv: sigma_vv}, extra_rewrites=rewrites
)
```

I would like instead to be able to transform the original graph to the following, and condition on `sigma_rv_tr`:

```python
srng = at.random.RandomStream(0)
mu_at = at.scalar('mu')
sigma_rv = at.random.lognormal(1.)
sigma_rv_tr = at.log(sigma_rv)
Y_rv = at.random.normal(mu_at, sigma_rv)

sigma_vv_tr = sigma_rv_tr.clone()
y_vv = Y_rv.clone()

logprob = aeppl.joint_logprob({Y_rv: y_vv, sigma_rv_tr: sigma_vv_tr})
sigma_vv = at.exp(sigma_vv_tr)
```

So NUTS can operate on this space, and then returns samples in the original space:

```python
print(sampling_steps[sigma_rv].model)
# Model sampled
#
# mu_at = at.scalar('mu')
# sigma_rv = at.random.lognormal(1.)
# sigma_rv_tr = at.log(sigma_rv)             <-- conditioned on
# Y_rv = at.random.normal(mu_at, sigma_rv)   <-- conditioned on
#
# Returns
#
# sigma_vv = at.exp(sigma_vv_tr)
```

More fundamentally, I don't see any reason why the bijections needed by algorithms in the HMC family should have a special status among graph rewrites. In the end, these transforms should be implemented as a function that returns the transformed graph, the value variables to condition on, and the value variables in the original space. It is fine to me if …
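The correctness of conditioning on the log-transformed variable rests on a change-of-variables identity: the density induced on `t = log(sigma)` is the log-normal density evaluated at `exp(t)` plus the log-Jacobian `t`, which recovers exactly the normal density that NUTS would sample. A minimal pure-Python check (the helper names are mine, not AePPL API):

```python
import math

def normal_logpdf(t, mu, sigma):
    """Log-density of a Normal(mu, sigma) at t."""
    return (-math.log(sigma) - 0.5 * math.log(2 * math.pi)
            - (t - mu) ** 2 / (2 * sigma ** 2))

def lognormal_logpdf(s, mu, sigma):
    """Log-density of a LogNormal(mu, sigma) at s > 0."""
    return normal_logpdf(math.log(s), mu, sigma) - math.log(s)

# Conditioning on t = log(sigma) instead of sigma: the induced density
# picks up the log-Jacobian of sigma = exp(t), i.e. a "+ t" term, and
# the result is exactly the Normal density on the transformed space.
mu, s = 1.0, 1.0
for t in (-1.0, 0.3, 2.0):
    induced = lognormal_logpdf(math.exp(t), mu, s) + t
    direct = normal_logpdf(t, mu, s)
    assert abs(induced - direct) < 1e-12
```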
-
Epistemic status: I am not sure about the usefulness of this beyond variable transforms, which we already have a solution for, so this may very well fall into the cute-but-perhaps-useless category.
When AePPL is asked to derive a logprob graph for a deterministic one-to-one operation applied to some upstream random/measurable variable, we have to create a graph that reverses these deterministic operations (adding extra logprob and Jacobian terms if needed):
This specific case is not implemented yet, but would be after #26
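The inversion recipe can be sketched generically: for `z = f(x)` with `f` deterministic and one-to-one, the logprob of `z` is the base logprob evaluated at `f⁻¹(z)` plus the log-Jacobian of the inverse map. A small sketch with illustrative helper names and an affine transform as the worked example (none of this is AePPL API):

```python
import math

def normal_logpdf(x, mu, sigma):
    """Log-density of a Normal(mu, sigma) at x."""
    return (-math.log(sigma) - 0.5 * math.log(2 * math.pi)
            - (x - mu) ** 2 / (2 * sigma ** 2))

def logprob_of_transform(v, inverse, log_abs_det_inv_jac, base_logpdf):
    """logp_z(v) for z = f(x): evaluate the base density at f^{-1}(v)
    and add the log-Jacobian of the inverse map."""
    return base_logpdf(inverse(v)) + log_abs_det_inv_jac(v)

# z = 2*x + 3 with x ~ Normal(0, 1), so z ~ Normal(3, 2).
logp = lambda v: logprob_of_transform(
    v,
    inverse=lambda v: (v - 3.0) / 2.0,
    log_abs_det_inv_jac=lambda v: -math.log(2.0),  # |d(f^-1)/dv| = 1/2
    base_logpdf=lambda x: normal_logpdf(x, 0.0, 1.0),
)
for v in (-1.0, 3.0, 7.5):
    assert abs(logp(v) - normal_logpdf(v, 3.0, 2.0)) < 1e-12
```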
Other logprob terms, like that of `z`, are then allowed to depend on `y`. However, they are not allowed to depend on `x` (which in the logprob term of `z` should be replaced by `exp(y_vv)`), or on any other intermediate variables between `y` and `x` if there were more. For instance, the following is not possible:

That type of logp graph can currently be achieved via the `TransformValuesOpt`, but that feels like a clunky add-on. If the above were allowed, we would have a very natural way of defining variable transformations in AePPL. For more complex transformations we could still use specialized `Op`s that specify what the inverse graph should be:
And the transformation would still be part of the generative graph, without the need for the extra rewrite.
I wonder how feasible this would be, and whether it opens the door for more interesting features or just falls in the "cute" category.
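One way to picture the feasibility question: if each deterministic `Op` in the chain knows its inverse and its log-Jacobian, a rewrite can walk the chain backwards, assembling both the inverse graph and the accumulated correction terms. A toy sketch over plain floats rather than Aesara graphs (the mini-IR and all names are hypothetical):

```python
import math

# Hypothetical registry: each invertible op maps to (inverse, log|d inv/dv|).
INVERSES = {
    "exp": (math.log, lambda v: -math.log(v)),            # x = log(v)
    "scale2": (lambda v: v / 2.0, lambda v: -math.log(2.0)),  # x = v / 2
}

def chain_logpdf(v, ops, base_logpdf):
    """Invert a chain of deterministic ops applied to a base variable,
    accumulating the log-Jacobian of each inverse step along the way."""
    log_jac = 0.0
    for op in reversed(ops):
        inv, log_abs_det = INVERSES[op]
        log_jac += log_abs_det(v)  # Jacobian at this step's input
        v = inv(v)
    return base_logpdf(v) + log_jac

def normal_logpdf(x, mu, sigma):
    return (-math.log(sigma) - 0.5 * math.log(2 * math.pi)
            - (x - mu) ** 2 / (2 * sigma ** 2))

def lognormal_logpdf(v, mu, sigma):
    return normal_logpdf(math.log(v), mu, sigma) - math.log(v)

# z = 2 * exp(x) with x ~ Normal(0, 1); then log z ~ Normal(log 2, 1),
# i.e. z ~ LogNormal(log 2, 1), which gives us a closed form to check.
for v in (0.5, 2.0, 6.0):
    got = chain_logpdf(v, ["exp", "scale2"],
                       lambda x: normal_logpdf(x, 0.0, 1.0))
    assert abs(got - lognormal_logpdf(v, math.log(2.0), 1.0)) < 1e-12
```

The point of the sketch is that the inversion is purely mechanical once each op declares its inverse, which is what would make it a regular graph rewrite rather than a special-cased transform.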