Suggestions for documentation and features #94
Comments
Hey @John-Boik! Many thanks! Your feedback is invaluable. It is important for us to get such descriptive comments from someone outside of our lab. We indeed think of the custom nodes and rules as one of the selling points for |
Related to my previous post, and perhaps a topic deserving of documentation, suppose that I have a custom node for a function that looks like On a related topic, perhaps deserving of additional documentation, in test_in.jl for the delta node, unscented, the function
And inputs are defined as:
Assume this rule is for the backward message What is What do the two inputs to JointNormal refer to? I assume the first refers to In the next test, the tuple of tuples is ((1,), (2,), ())). Other tests use ((0,), (0,)) and ((1,), (1,)). Again, what do these refer to? In short, how does the signature contain messages While these topics might be good to address in documentation, I'm also hoping to learn the answers here. |
Hey @John-Boik, thank you for your questions! I'll do my best to provide clear answers.
Regarding obtaining q_ins within the rule body, you can compute it manually. Delta nodes have special behavior and do not follow the default arguments specification for performance reasons. You can refer to this code example here. If you wish to access q_ins in your rule, you can incorporate similar code that computes q_ins within your rule.
Our inference backed instead of This approach may work because of the specific assumptions and characteristics of the unscented method with Gaussian distributions. In delta node rules, we precompute q_ins and then marginalize all arguments except one and divide to compute backward messages. The k = 2 refers to the "index" of the edge for which we are computing the message, e.g., in_2. Here we get to your last question:
In general, a user should never interact with the
It does not. It contains a joint marginal over |
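The "marginalize, then divide" step described above for delta node backward messages can be illustrated for the univariate Gaussian case. This is a minimal sketch in plain Julia, not RxInfer code: `gaussian_divide` and the `(mean, var)` named tuples are hypothetical stand-ins, and it assumes the marginal of one input has already been extracted from q_ins.

```julia
# Univariate Gaussians are stored as (mean, variance) named tuples.
# `gaussian_divide` is a hypothetical helper, not an RxInfer function.
function gaussian_divide(q, m_fwd)
    # Work in natural parameters: precision and precision-weighted mean.
    w_q, w_f = 1 / q.var, 1 / m_fwd.var
    w_b = w_q - w_f                      # precision of the backward message
    w_b > 0 || error("division yields an improper Gaussian")
    xi_b = q.mean * w_q - m_fwd.mean * w_f
    return (mean = xi_b / w_b, var = 1 / w_b)
end

q_x   = (mean = 1.0, var = 0.5)   # marginal of one input, taken from q_ins
m_fwd = (mean = 0.0, var = 1.0)   # inbound (forward) message on that edge
m_bwd = gaussian_divide(q_x, m_fwd)   # (mean = 2.0, var = 1.0)
```

Multiplying `m_fwd` by `m_bwd` recovers `q_x`, which is the consistency check one would expect from a marginal and the two messages on the same edge.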
Hi @John-Boik! Another follow-up on your suggestions. As for specifics:
Thank you once again for your valuable feedback, and we look forward to making these improvements. |
Thanks @bvdmitri for answering my questions. That helped quite a bit. But I think there is still a bit more to the story. I did not understand until I read your reply that the But, including all this in a custom node might not be as simple as you suggest, if I understood you correctly. You had said that the Consider a custom node for the simple function In the case of a delta node, the model statement could look like:
For the delta node, the order of computation is:
It appears that a simple application of code similar to that of So, is there a way to calculate Another option would be to send Yet another option would be to include in the custom node structure items like Or, maybe I have missed something and the answer is right in front of me. Any suggestions would be appreciated. Perhaps you are wondering why I would want to calculate |
You're right, my bad. I missed that. But your next question helps:
I assume you have the following node specification (or something similar):
function fx_add end
@node typeof(fx_add) Deterministic [ out, x1, x2 ]
So if you simply want to get a message on the same edge from the forward rule you simply write it like:
This specification explicitly states, that you require inbound message on the The same applies to other interfaces, so if you want messages on the same edges for all interfaces you can write
x3 ~ fx_add(x1, x2) where {
    pipeline = RequireMessage(
        out = NormalMeanPrecision(0.0, 0.01),
        x1 = NormalMeanPrecision(0.0, 0.01),
        x2 = NormalMeanPrecision(0.0, 0.01),
    )
}
The
x3 ~ fx_add(x1, x2) where {
    pipeline = RequireMessage(out, x1, x2)
}
which says that you want inbound messages on all edges when computing outbound messages, but without any explicit initialisation (will probably require
You are right and that is the reason why delta nodes have a custom layout for rules, but if you only want to experiment I would advise you to recompute the marginals in all rules, which is easier for now.
P.S. Note that there is currently a small bug in the model specification library with
P.S.S. This is the code I used to quickly test the
using RxInfer
function fx_add end
@node typeof(fx_add) Deterministic [ out, x1, x2 ]
@model function try_fx_add()
    y = datavar(Float64)
    # Avoid using interfaces names for the variables due
    # to the bug in GraphPPL, I put `_`
    x1_ ~ NormalMeanVariance(0.0, 1.0)
    x2_ ~ NormalMeanVariance(0.0, 1.0)
    out_ ~ fx_add(x1_, x2_) where { pipeline = RequireMessage(out, x1, x2) }
    y ~ NormalMeanVariance(out_, 1.0)
end
inference(
    model = try_fx_add(),
    data = (y = 1.0, )
)
and I get the expected error with a missing rule
Note that |
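For readers following along, the arithmetic that the missing rules for an addition node like fx_add would have to perform is short when all messages are univariate Gaussians. The sketch below is plain Julia with no RxInfer calls; `forward_out` and `backward_x1` are hypothetical helper names, messages are `(mean, var)` named tuples, and the inputs are assumed independent. It does not show the actual rule syntax.

```julia
# Messages are (mean, variance) named tuples; helper names are hypothetical.

# Forward message toward `out`, for out = x1 + x2 with independent inputs.
forward_out(m_x1, m_x2) = (mean = m_x1.mean + m_x2.mean, var = m_x1.var + m_x2.var)

# Backward message toward x1, using x1 = out - x2.
backward_x1(m_out, m_x2) = (mean = m_out.mean - m_x2.mean, var = m_out.var + m_x2.var)

m_x1  = (mean = 0.0, var = 1.0)   # matches the NormalMeanVariance(0.0, 1.0) priors
m_x2  = (mean = 0.0, var = 1.0)
m_out = (mean = 1.0, var = 1.0)   # message from y = 1.0 observed with unit variance

fwd = forward_out(m_x1, m_x2)     # (mean = 0.0, var = 2.0)
bwd = backward_x1(m_out, m_x2)    # (mean = 1.0, var = 2.0)
```

The backward message toward x2 follows by symmetry, swapping the roles of x1 and x2.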
Thanks @bvdmitri, that was very helpful. But I think a tweak is still needed. In the rules for calculating the backward messages, say, I include code below for a complete working example. In it, I calculate the marginals for
I do have a related question. Suppose that you wanted to use the delta method and have an inverse function for only one of the inputs, say,
And I have a comment. The search function for the RxInfer documentation seems to have some issues. For example, search for the term "inv" and you will get a list of links to pages, but for some of those pages the term "inv" only occurs in the left navigation panel, not in the actual page on the right. I've noticed this for a variety of search terms, and it makes it more difficult to find what you are looking for. Perhaps this is an issue that is beyond your control to fix. But if you can fix it, that would be helpful.
|
My bad, @bvdmitri, I misunderstood the meaning of a pipeline with multiple messages, and my tweak was not necessary (although it does save a little computation). A working version using a pipeline with multiple messages is below. For your convenience, I repeat my question here. Suppose that you wanted to use a delta node and have an inverse function for only one of the inputs, say, x1 in my example. How would you specify that you have only one of the inverses? I'm guessing something like DeltaMeta(method = Linearization(), inverse = (x1_inv, nothing)). Would that be correct? Thanks again for your help.
|
Hey @John-Boik, sorry for the late reply, I was on vacation. Specifying only one of the inverses is not implemented (yet), though the way you guessed it is exactly how we have planned to implement this functionality. |
Initial draft of the custom nodes and rules is available in the documentation: https://biaslab.github.io/RxInfer.jl/stable/manuals/custom-node/ |
I have done a (reasonably) deep dive into learning RxInfer over the past several weeks and would like to offer some suggestions regarding documentation and features. I write this both from my recent experience and from my guess as to how some other practitioners might wish to use (and learn) RxInfer, and where they might get stuck. I imagine that you (the RxInfer team) want the package to be used by a wide audience. Some material below is in reference to my post on discourse and the associated request to open an issue here. That post is about documentation for the score function and @average_energy macro.
First, some kudos are in order. RxInfer is quite impressive and it represents a large and important step forward from ForneyLab (which I have slight acquaintance with). I understand that RxInfer is a work in progress and that you likely have a variety of short- and long-term plans for documentation and features. So if I don't offer anything new here, please take this post as confirmation of your current direction. This post is lengthy, and if I'm guided to split it into certain portions for separate issues, I'm happy to do so. And I'm happy to contribute in some way towards the suggestions below, but I'm not sure what would be most helpful.
Big Picture
I suspect that you might be aiming to eventually implement a large set of nodes, associated rules, and approximation methods so that users can easily build models by combining building blocks via short snippets of code. If so, this would of course be useful. But in the meantime, and perhaps always, some users (like myself) will be interested in relying heavily on custom nodes. Interesting real-world problems might tend to require custom nodes, and custom nodes provide a means to tinker and learn about a given model, about RxInfer, and about new methods and approaches. You already allow for the creation of custom nodes, but I'm suggesting that creating custom nodes might actually be the bread and butter for some users. In my learning curve, almost all of my time was spent getting custom nodes and associated rules to work correctly.
I understand that part of the upcoming documentation will include more details on creating custom nodes. I'd like to suggest what might be additions to what you have planned. While the examples you now offer (e.g., mountain car problem) are useful, and more would be helpful, what would have been most useful to me in learning RxInfer is a large set of simple, self-contained custom nodes, complete with message passing rules, marginal rules, and score/free energy functions. By simple, I mean a series of simple linear and nonlinear operators, such as [+, -, *, /, log(x), x^2] combined with a series of approximation methods, such as: [no approximation, linearization, CVI, importance, RTS]. Some nodes would involve single operators and others (composite nodes) would involve multiple operators, for example: [+ and * and log and CVI]. Another good example would be a node that returns a Boolean variable for p(x>z) < eta, where x is a random variable and z and eta are scalars. Some custom nodes would be deterministic, and some stochastic. The code for each node would exist in a single file. It would not need its own notebook page or lengthy introduction (but liberal code comments are always helpful). Each node would probably need a simple (bare minimum) model and inference implementation, so that one can check that the node actually works and returns a free energy value, etc. On that note, some implementations could use the inference() function, while others use the update!() function, just for variety.
Obviously, RxInfer already includes code for nodes like + and matrix multiplication. For these and others, a user can figure things out by digging through your code base. My suggestion is to have, for each custom node example, one self-contained file that includes a complete implementation, cut down to the bare minimum, simplest code, including all rules, score methods (for free energy calculation), and so on. To make things really simple, and eliminate most multiple dispatch (as a user might do when prototyping for a specific problem), each node can be tailored for a specific set of message types (Gaussian, for example). An exception here might be use of variable families (e.g., UnivariateNormalDistributionsFamily) in rule signatures, which can cut down on the number of required rules.
I'm not suggesting that someone would use these nodes in their model, but rather they would serve as examples of how to piece together different building blocks, quick and dirty, to build (increasingly complex) custom nodes. Importantly, they would also serve as examples of how to get the syntax right so that the code runs. I'm suggesting that all the pieces, including calls to approximation methods, reside within the rules of a custom node, not called via a meta specification to code elsewhere. As a user, for example, I might want to try out both linearization and importance sampling in the rule of a node, or implement my own approximation method, and halt the code within the rule to examine function outputs. Efficiency and code elegance are not the goals here.
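To make the Boolean p(x>z) < eta example concrete, here is a minimal sketch of the quantity such a node would compute, using a plain Monte Carlo estimate for a Gaussian x. It is written in plain Julia with only the standard library; `prob_exceeds` and `chance_ok` are hypothetical names, it is not an RxInfer node or rule, and the choice of a sample-based estimate is an assumption made for illustration.

```julia
using Random

# Hypothetical helper: Monte Carlo estimate of p(x > z) for x ~ N(mu, sigma^2).
function prob_exceeds(mu, sigma, z; n = 100_000, rng = Random.default_rng())
    samples = mu .+ sigma .* randn(rng, n)
    return count(>(z), samples) / n
end

# Boolean output of the suggested node: true when p(x > z) < eta.
chance_ok(mu, sigma, z, eta; kwargs...) = prob_exceeds(mu, sigma, z; kwargs...) < eta

rng = MersenneTwister(1)
chance_ok(0.0, 1.0, 3.0, 0.05; rng = rng)   # true: p(x > 3) is about 0.0013
chance_ok(0.0, 1.0, 0.0, 0.05; rng = rng)   # false: p(x > 0) is about 0.5
```

Turning this into a custom node would additionally require message passing rules and a score function, which is exactly the kind of complete example the suggestion above asks for.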
Specifics
Below I provide a list of specific questions, comments, and observations. I'm not necessarily looking for answers to the questions, but offer them as examples of the kinds of questions I have had and that new users might have. Perhaps you could address them in the documentation.
Gamma(shape=x,rate=x) in the Gamma mixture example throws errors.
inference() function vs. calls to update!(). Any specific examples? It seems that with the new callbacks, you can do anything with the inference() function that you can with the update!() function.
x_prior ~ Normal(...) followed by x0 = x_prior when setting an initial prior? Why can't this occur in one call, x0 ~ Normal(...), where x0 is then updated in every step of a for loop?
stochastic vs. deterministic for defining node types? For example, if I use importance sampling for passing messages through a custom + node, is the node stochastic or deterministic? Internally, how are the two cases handled differently?
@model my_model(; mu) or via use of datavars in the model (and associated use of the data parameter in the inference() function). For example, if a model contains the code x ~ Normal(mu, 2.), one can send mu to the model either way. Under what circumstances must some data be sent via datavars and others be sent via the signature call?
-- Require initialization of marginals, and some nodes that require initialization of marginals and messages.
-- Have in the model some complex if-then conditional statements. This could be paired with the p(x>z) < eta custom node I suggested above.
-- Implement a distribution from Distributions.jl (e.g., Poisson)
-- Have a rule that calls other rules and/or nodes and/or models.
Finally, it would be good to include a full example that is described by narrative text. Consider the following, for example.
The questions are, do they meet that night for dinner? Does anyone call? Do they meet on a later night for dinner? Does Jack have to alter his errands?
The specific probabilities of event occurrences are important for the outcome, but not important for the concept. They can be anything reasonable. Some interesting aspects of the story are that both characters undergo an errand process independent of the other. Those processes involve chance constraints and optimal control, and for Jack they occur twice. And the story line involves several Boolean decision variables. Plus, it stems from a narrative rather than mathematical form. The story could be set up within one model. But could it also be formed as, say, three separate models? Could each errand branch be processed as a sequence of rules within rules? How many ways could this problem be conceived? (Related, how many ways can branch and recursive programming be implemented in RxInfer?)
The narrative aspect is important here. The idea is that RxInfer might be suitable for modeling a person's belief system, as that person is depicted in a story. As such, there is a meta process of: Narrative --> Factor graph --> Simulation --> Results. The example would serve as a warm up.