Restrict `BroadcastTo` lifting of `RandomVariable`s #71
While looking into this, I realized that we can use this naive broadcasting as we always have, but that we should replace the underlying The idea is that we're only using these In other words, we have equivalence "modulo" a log-probability function (and an accompanying value variable transform). To be responsible about such a rewrite, we shouldn't return Actually, it might make sense to include the value variables as inputs to this new This idea is very similar to the old cc: @ricardoV94 @kc611 |
It is equivalent to the final logprob returned by aeppl, but that final logprob is not consistent with the original graph. It would double count terms. Maybe we need an explicit

```python
import aesara.tensor as at
import numpy as np

rv = at.broadcast_to(at.random.normal(0, 1), (10,))
vv = rv.clone()
# rv logp should be
at.switch(
    # check all values are equal (at.eq is the elementwise comparison)
    at.all(at.eq(vv, vv[0])),
    # normal_logprob is a placeholder for the scalar normal log-density
    normal_logprob(vv[0], 0, 1),
    -np.inf,
)
```

But that looks really awkward. Also, there is probably no sampler that could ever propose valid values, so we might be better off just rejecting them altogether.
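To make the proposal above concrete, here is a minimal NumPy sketch of that "all values equal, else `-inf`" log-probability. It is only an illustration: `normal_logpdf` and `broadcasted_normal_logp` are hypothetical helper names, NumPy stands in for Aesara, and a non-empty value array is assumed.

```python
import numpy as np


def normal_logpdf(y, mu=0.0, sigma=1.0):
    # Elementwise log-density of a normal distribution (hypothetical helper).
    y = np.asarray(y, dtype=float)
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2


def broadcasted_normal_logp(values, mu=0.0, sigma=1.0):
    # Log-probability of a single normal variate broadcast to an array:
    # the values are only valid if every entry is the same number.
    values = np.asarray(values, dtype=float)
    first = values.flat[0]  # assumes a non-empty array
    if np.all(values == first):
        return float(normal_logpdf(first, mu, sigma))
    return -np.inf
```

Any value array with two distinct entries gets `-inf`, which is why a generic sampler would essentially never propose a valid point.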
Under our log-probability mapping, a graph representing an array containing a broadcasted random variable would map to an array of said random variable's scalar log-probability broadcasted, so we want the result to have multiple terms (i.e. be an array). Are you referring to another issue? Broadly, I'm talking about the implementation of a general log-probability mapping for graphs representing broadcasted random variables. More specifically, the topic is the relevant parsing and representation of these domain elements.
+1 for this idea (or at least the general direction of it). Always having a value variable attached to this newly built RV derivative will ensure that such a replacement only takes place when the RV is used for log-likelihood graph generation and not for sampling later on. Here I'm assuming that this should work for values which are provided by the users and not sampled from the RV itself. If it is the latter, then there might be some inconsistencies depending upon how we handle them.
I think what @ricardoV94 is referring to here is the same case, i.e. the broadcasting of the value variable itself in cases when the values aren't explicitly provided but are sampled randomly from the RV.
If I understand you correctly, it means we will treat the two RVs as the same:

```python
import aesara.tensor as at

rv1 = at.broadcast_to(at.random.normal(0, 1), (10,))
rv2 = at.random.normal(0, 1, size=10)
vv1 = rv1.clone()
vv2 = rv2.clone()
```

That simplifies things quite a lot for other derived RVs, but we should decide that explicitly. It means we use the original aesara graph as something that somewhat loosely defines a log-probability graph and not as a true generative graph that we carefully invert to obtain the corresponding probability graph. And then your point about the
The actual equivalence is logprob(broadcast_to(Y, s), broadcast_to(y, s)) == broadcast_to(logprob(Y, y), s) for I'm talking about implementing that equivalence relation above. In practice, we start with the expression What I'm saying is that we can use an intermediate representation in the space of More specifically, I'm talking about expanding My second point is about modeling |
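The equivalence stated in this comment can be checked numerically for a simple case. The sketch below uses NumPy in place of Aesara, with a hypothetical `normal_logpdf` helper playing the role of `logprob` for a standard normal `Y`. It shows the relation for one concrete value and shape only, and does not reflect how AePPL implements it.

```python
import numpy as np


def normal_logpdf(y, mu=0.0, sigma=1.0):
    # Elementwise normal log-density; stands in for logprob(Y, y).
    y = np.asarray(y, dtype=float)
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((y - mu) / sigma) ** 2


s = (10,)  # broadcast shape
y = 0.3    # scalar value of the underlying variate

# logprob(broadcast_to(Y, s), broadcast_to(y, s))
lhs = normal_logpdf(np.broadcast_to(y, s))
# broadcast_to(logprob(Y, y), s)
rhs = np.broadcast_to(normal_logpdf(y), s)
```

Both sides are arrays of shape `s` holding the same scalar log-density repeated, which is the "array of terms" reading of the mapping discussed earlier in the thread.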
I don't see what that achieves just yet. Seems like that already happens by the automatic broadcasting of the logprob terms when My question was about the case where I am not entirely sure if we are talking about the same thing though. I have no idea about the explicit inclusion of value variables in the graph. I say we give it a try and see if it makes our lives easier. |
Aside from clarifying/understanding what it is we're actually doing and what we intend to do, using the types that are true to the system(s) they model avoids confusion, design quagmires, over-engineering, etc. In other words, it makes it possible to continually simplify a process instead of complicating it over time (e.g. through modularity and the like). For instance, if we sort these things out at a high level like we are, we can probably find a redesign that recasts all the operations in terms of simple(r) local and global rewrites that lie entirely within the Aesara At a lower level, using the correct types generally implies that type-related implementation details are addressed, such as type/instance equivalence (e.g.
Yes, it should, and that's part of the reason why a change to the basic rewrite logic in
The premise is that we want the broadcasting, because those are the elements we're handling with the |
Just to follow this chain of thought a little more, if we use the old Most of our custom rewrites could be easily rewritten to handle the extra For example, the Fortunately, with some simple stand-alone We could also replace |
Let's give it a try and see how it evolves?
I'll repurpose this PR to that effect.
This PR restricts the cases in which `BroadcastTo` `Op`s will be lifted through `RandomVariable` `Op`s via `naive_bcast_rv_lift`.

Simply put, an expression like `at.broadcast_to(at.random.normal(0, 1), (10,))` should not be lifted, since it would result in ten independent random variates, instead of a single variate that's broadcasted to the shape `(10,)`. Lifting should only happen when the shape of the `RandomVariable` already matches the broadcasted shape (e.g. via `size` and/or one of the distribution parameters), or when the broadcasting only introduces broadcastable dimensions (i.e. adds extra dimensions of length one).

This PR is, in part, an answer to an issue mentioned in Part 2 of this comment: #51 (reply in thread).

Currently, this draft PR only introduces some tests for `naive_bcast_rv_lift` that state what needs to be changed/implemented.
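The distinction drawn in the description can be illustrated with a short NumPy sketch. Note that `lift_is_safe` is a hypothetical helper written for this example, not part of AePPL or this PR. It assumes static shape tuples (Aesara shapes are generally symbolic), and it is just one reading of the two conditions stated above.

```python
import numpy as np

rng = np.random.default_rng(0)

# A single draw broadcast to shape (10,): every entry is the same variate.
single = np.broadcast_to(rng.normal(0, 1), (10,))

# What an unrestricted lift would amount to: ten independent draws.
independent = rng.normal(0, 1, size=10)


def lift_is_safe(rv_shape, bcast_shape):
    # Hypothetical check: lifting is allowed only when the broadcast leaves
    # the RV's shape unchanged apart from new leading dimensions of length one.
    rv_shape = tuple(rv_shape)
    bcast_shape = tuple(bcast_shape)
    if len(bcast_shape) < len(rv_shape):
        return False
    padded = (1,) * (len(bcast_shape) - len(rv_shape)) + rv_shape
    return padded == bcast_shape
```

For example, `lift_is_safe((10,), (10,))` and `lift_is_safe((3,), (1, 3))` hold, while `lift_is_safe((), (10,))` does not, matching the scalar normal case the description says should not be lifted.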