Skip to content

psycharo-zz/factor-graph

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

factor-graph

This project's goal is to create a MATLAB/C++ framework for inference on Forney-style factor graphs.

The latest package can be downloaded here.

Currently only gaussian distributions are supported.

More in the Wiki

Installation

Just unpack the .zip and run installFFG.m. This will build the framework and add ffg to matlab path. Should work on Matlab 2012b under Windows and Linux.

Factor graphs intro

Factor graph is a specific type of graphical model used to represent factorizations of functions. They were originally developed to solve coding and signal processing problems. However, their area of application goes beyound that: such well known probabalistical graphical models as Bayesian and Markov networks can be represented as factor graphs (by Hammersley–Clifford theorem). The algorithms used with factor graphs use the concept of messages, or summaries, that are passed along graph's edges, and usually are refered to as summary propagation algorithms. We are focusing on a specific notation for factor graphs, called Forney-Style factor graphs (FFG). Below we give a short intro into the concept. For more details, see Loeliger2004

Let's start with an example. Let f(w,x,y,z) = g(w,x,y) h(y,z) be a factorization of a function f(w,x,y,z) of several variables. The FFG expressing this factorization will look as follows:

            |
           x|
            |
     w    +-+-+    y    +---+    z
 ---------+ g +---------+ h +---------
          +---+         +---+

As you can see, FFG consists of nodes (g and h), edges (y) and half-edges (w, x and z). It can be constructed of a function factorization according to the following rules:

  • for each factor, there is a node in the graph
  • for each variable appearing in two factors, there is an edge in the graph
  • for each variable appearing in one factor, there is a half-edge in the graph

Note that factorizations are restricted to those with no more than two factors sharing a variable. This restriction can be easily resolved though, by introducing extra variables and an equality constraint. Let's assume that we have factorization f(x,y,z) = g(x,y)h(y,z)l(y). The variable y is shared by three factors g,h and l. Let's introduce variables y' and y'', and equality constraint y = y' = y''. The factor that enables this equality constraint is f=(y,y',y'')=d(y - y')d(y - y''), where d is Dirac's delta function. The FFG of the modified factorization will then look as follows:

   x   +---+    y    +---+    y'   +---+   z
-------+ g +---------+ = +---------+ h +-------
       +---+         +-+-+         +---+
                       |
                       |y''
                       |
                       |
                     +-+-+
                     | l |
                     +---+

Another useful constraint factor is the zero-sum constraint factor: f+(x,y,z) = d(x - y - z), that expresses the constraint x = y + z. This, of course, can be generalized further.

Factor graphs can be used to express a variety of models. A real world yet simple example would be a linear-space model, a model widely used in signal processing:

X_t = A X_{t−1} + B U_t
Y_t = C X_t + W_t

Here Y_t can be thought of as an observation (measurement) that is made on a step t; X_t and X_{t-1} are unobserved random variables on steps tandt-1` correspondinly; U_t is the process noise, W_t is the observation noise (both assumed to be drawn from Gaussian distribution); and A, B, C are parameters of the model (in the simplest case, all identities).

To construct the graphical model in FFG notation, we can make use of the equality constaint and zero-sum constraint factors defined above, and also introduce a new factor to express matrix multiplication. Similarly, f_A(x, y) = d(y - A x) will ensure constraint y = A x. The whole model in FFG notation will then look as follows:

                      U_t
                       |
                       |
                       v
                     +---+
                     | B |
                     +-+-+
                       |
                       v
           +---+     +---+     +---+
 X_{t-1}-->| A +---->| + +---->| = +--->X_t
           +---+     +---+     +-+-+
                                 |
                                 v
                               +---+
                               | C |
                               +-+-+
                                 |
                                 v
                    +---+      +---+
                    |W_t+----->| + |
                    +---+      +-+-+
                                 |
                                 v
                                Y_t

Example, Kalman Filtering

We will see now how one can define a factor graph model and run sum-propagation algorithm in Matlab using the ffg framework. We will use a specific type of linear-space model, known as Kalman Filter as an example (latest code for example is available in kalmanScheduleExample:

X_t = X_{t-1} + U_t
Y_t = X_t + W_t

Let's first see how the FFG for a single timeslice will look like. In the framework, for convenience, we substitute each half-edge with an edge and a special kind of node, to which we will refer to as evidence node. The factor graph for this model will be (for a timeslice t, letters are used to refer them in the code later):

            +---+ 
            |U_t| 
            +-+-+ 
              | 
              v 
+-------+   +---+    +---+    +---+
|X_{t-1}+-->|+,a+--->|=,e+--->|X_t| 
+-------+   +---+    +-+-+    +---+ 
		       | 
		       v
             +---+   +---+ 
             |W_t+-->|+,b| 
             +---+   +-+-+ 
                       | 
                       v 
                     +---+ 
                     |Y_t| 
                     +---+ 

The network in ffg framework is represented by a class ffg.Network. Let's create one:

nwk = ffg.Network;

Now, we need to add nodes and edges to this network. There are quite a few node types already available in the framework, including evidence nodes (ffg.EvidenceNode), equality constaints nodes (ffg.EqualityNode) and zero-sum constraint nodes (ffg.AddNode), all of them ancestor of a single base class ffg.FactorNode. Let's first create the nodes themselves:

% hidden variable, X_{t-1}
xin = ffg.EvidenceNode;
% hidden variable, X_t
xout = ffg.EvidenceNode;
% observation, corresponds to a half-edge y
y = ffg.EvidenceNode;
% equality node, connects a, b, xout
e = ffg.EqualityNode;
% process noise, U_t
u = ffg.EvidenceNode;
% observation noise, W_t
w = ffg.EvidenceNode;
% zero-sum node, connects xin + u = e
a = ffg.AddNode;
% zero-sum node, connects e + w = y
b = ffg.AddNode;

Every node has a unique identifier assigned when it is created, its value can be retrieved by the method id().

Now we have the nodes, and we can actually connect them according to the model. This is done via the method addEdge(src, dst) of ffg.Network. Note, that by default each connection is directed. For each call nwk.addEdge(src,dst), underneath we are adding an outgoing connection to src and incoming to dst:

% connecting nodes with each other
nwk.addEdge(xin, a); 
nwk.addEdge(u, a);
nwk.addEdge(a, e); 
nwk.addEdge(e, xout); 
nwk.addEdge(e, b);
nwk.addEdge(w, b); 
nwk.addEdge(b, y); 

Some nodes, such as ffg.EvidenceNode and ffg.EqualityNode do not distinguish between incoming and outgoing connections, so, in this particular case nwk.addEdge(e, xout) and nwk.addEdge(xout, e) are equivalent.

It is also possible, that a node might require other type of connection, rather than incoming or outgoing. In this case, we can specify the type of the connection by passing two more parameters to ffg.addEdge:

ffg.addEdge(src,dst,type_for_src, type_for_dst) 

Where type_for_src is a string specifying the type of the connection for the node src (which, of course, should support this type of connection), and similarly type_for_dst. If the parameters are empty, the types are assumed to be 'outgoing' for src and 'incoming' for dst.

That finishes the construction of the network. To have a look at it, we can use the function ffg.drawNetwork(nwk).

Now we can use sum-propagation algorithm to do the actual filtering (in other words, estimating unobserved variables). Here, we have two choices:

  • use non-loopy summary propagation algorithm (works only for networks with no loops)
  • specify our own schedule of message propagation

Here we will consider only the latter one. The schedule is simply a cell array of pairs of nodes; it can be specified for the network via setSchedule method:

% creating message-passing schedule
nwk.setSchedule({xin, a; ...
                   u, a; ...
                   a, e; ...
                   y, b; ...
                   w, b; ...
                   b, e; ...
                   e, xout});

Before this schedule can be applied, we will want to initialise our model by setting some dummy values to evidence nodes (half-edges) - this is required by the summary propagation algorithm. To set an initial value to an instance of ffg.EvidenceNode, there is a method receive(msg), that takes a message. Underneath, it just stores the message inside the nodes and sends it when the sum propagation algorithm is run. In this particular example, we need to initialise nodes u, w, and xout (also xin and y but they should receive updates on every iteration). Gaussian messages can be with ffg.gaussVarianceMessage(mean,var) or ffg.gaussPrecisionMessage(mean,precision). For this example we will only need messages with variances:

% specifying distribution params         
sd = 10.0;
sd2 = sd*sd;
% initialising X_{t-1}
xout.receive(ffg.gaussVarianceMessage(1+randn()*sd, sd2));
% process noise, setting it to constant
u.receive(ffg.gaussVarianceMessage(1, 0));
% observation noise
w.receive(ffg.gaussVarianceMessage(0, sd2));

To see the list of messages for a node, one can use a method messages, that returns a cell array of messages. For instance for a node w, we w.messages{1} will give the following result:

ans = 
    from: -1
    type: 'VARIANCE'
    mean: 0
     var: 100

Here, from specifies the sender of the message (-1 means that there is no sender, i.e. it corresponds to the half-edge), type is the type of the message (currently it is only 'VARIANCE' or 'PRECISION'), mean and var are specific for gaussians.

Finally, we can do the filtering. In this example, we use the current iteration number with the added noise as input data, and run sum propagation for a thousand times. On every step we initialise xin with an estimation of the previous step (dummy message for the first iteration), and y by observing the new evidence (that we generated). To run a single iteration of the sum propagation algorithm on the network, we call makeStep() method of ffg.Network. To retrieve the message from xout we could, again, use the method messages, but than we would have to search through the cell array - ffg.EvidenceNode has a special method - evidence() that returns a message that it received from its connection (that was specified by nwk.addEdge).

% the number of iterations to run sum-propagation
N_ITERATIONS = 1000;
% place to store results for each 
result = zeros(N_ITERATIONS, 2);    
samples = zeros(N_ITERATIONS, 1);
msg = ffg.gaussVarianceMessage(randn()*sd, sd2);
for i = 1:N_ITERATIONS
    samples(i) = i+randn()*sd;

    % setting X_{t-1} to the estimation on the previous step
    xin.receive(msg);
    % receiving subsequent observation
    y.receive(ffg.gaussVarianceMessage(samples(i), 0));
    nwk.makeStep();         
    msg = xout.evidence();
    
    result(i,:) = [msg.mean, msg.var];
end

TODO

  • Gradient descent
  • EM algorithm
  • Variational Bayes
  • ? discrete variables