
Constructing a model

Shai Carmi edited this page Oct 17, 2018 · 25 revisions

This page details how to create your own models.

Defining the HMM subclass and the constructor

The first step is to define a subclass for the desired factorial HMM. First, set a member variable describing the names of the hidden Markov chains and their sizes. This is done as in this example:

class MyFactorialHMM(FactorialHMMParentClass):
    hidden_indices = I = Indices([['bit', 2], ['trit', 3], ['quit', 4]])

The parent class FactorialHMMParentClass must be either FactorialHMMDiscreteObserved or FactorialHMMGeneralObserved, depending on whether the observations are discrete or general (e.g., continuous), respectively. Most of the methods to implement are common to both types of HMMs, but some are type-specific (see below). The variable hidden_indices should contain the names and sizes of the hidden Markov chains. In the example above, we defined a factorial HMM with three Markov chains, with 2, 3, and 4 states, respectively. The Indices class assigns sequential indices to the fields in order of appearance, so that later in the code they can be referred to as, e.g., self.I.trit.
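The sequential-indexing behavior can be illustrated with a minimal stand-in (this is not the library's Indices class, just a sketch of the behavior described above):

```python
# Minimal stand-in for illustration only -- not the library's Indices class.
# Fields receive sequential indices (0, 1, 2, ...) in order of appearance.
class SimpleIndices:
    def __init__(self, fields):
        self.field_names = [name for name, size in fields]
        self.field_sizes = [size for name, size in fields]
        for i, (name, size) in enumerate(fields):
            setattr(self, name, i)

I = SimpleIndices([['bit', 2], ['trit', 3], ['quit', 4]])
# I.bit is 0, I.trit is 1, I.quit is 2; I.field_sizes is [2, 3, 4]
```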

The signature of the constructor is:

def __init__(self, params, n_steps, calculate_on_init=True):

where:

  • params - an arbitrary object saved as self.params. It is not used internally, which makes it convenient for passing parameters to your own methods.
  • n_steps - the number of steps (e.g., the number of observations)
  • calculate_on_init - set to False if you want to perform the model building explicitly yourself. (You probably don't.)

Methods to implement in all models

The following methods must be implemented for models with both discrete and general observations:

Setting the transition matrix

The signature is

def SetTransitionMatricesTensor(self):

This method calculates and saves the transition matrices of all the hidden Markov chains, per step. This is done by repeatedly calling SetTransitionSubmatrix(self, n_step, field, submatrix), where submatrix is set as the transition matrix between steps n_step and n_step+1 of the Markov chain corresponding to field. It is highly recommended to use the named value for the field, e.g., I.bit (see above). Note that in submatrix, element (i,j) (i.e., row i, column j) is the transition probability from state j to state i. Each column must therefore sum to 1.

An example:

def SetTransitionMatricesTensor(self):
    I = self.I
    p1 = self.params['p1']
    p2 = self.params['p2']

    for n in range(self.n_steps - 1):
        self.SetTransitionSubmatrix(n, I.bit, [[1-p1, p1], [p1, 1-p1]])
        self.SetTransitionSubmatrix(n, I.trit,
            [[1-p1,      p1*(1-p2), 1/3],
             [p1*p2,     1-p1,      1/3],
             [p1*(1-p2), p1*p2,     1/3]])
        self.SetTransitionSubmatrix(n, I.quit, np.identity(4))
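Since each column of a submatrix must sum to 1, it is worth verifying your matrices before running the model. A quick numpy sanity check for the trit matrix above (p1 and p2 are assumed example values):

```python
import numpy as np

p1, p2 = 0.1, 0.3  # assumed example parameter values
trit = np.array([[1 - p1,        p1 * (1 - p2), 1/3],
                 [p1 * p2,       1 - p1,        1/3],
                 [p1 * (1 - p2), p1 * p2,       1/3]])
# Column j is the distribution of the next state given current state j,
# so every column must sum to 1.
assert np.allclose(trit.sum(axis=0), 1.0)
```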

Setting the initial hidden state

This sets the probability distribution of the initial state of each of the Markov chains. This is done by repeatedly calling SetInitialHiddenSubstate, whose signature is SetInitialHiddenSubstate(self, field, substate). This sets the initial distribution of the chain field to the probability vector substate. For example:

def SetInitialHiddenState(self):
    I = self.I

    self.SetInitialHiddenSubstate(I.bit, [0.8, 0.2])
    self.SetInitialHiddenSubstate(I.trit, [0.2, 0.4, 0.4])
    self.SetInitialHiddenSubstate(I.quit, [0.25, 0.25, 0.25, 0.25])

Additional methods to implement for discrete observations

For discrete observations, use the FactorialHMMDiscreteObserved class. You need to implement the following.

Observed fields

The model supports observations composed of several discrete-state fields, each of a possibly different size. These are initialized similarly to the hidden states (preferably after them), e.g.:

class MyFactorialHMM(FactorialHMMDiscreteObserved):
    hidden_indices = I = Indices([['bit', 2], ['trit', 3], ['quit', 4]])
    observed_indices = J = Indices([['bit_a', 2], ['bit_b', 2], ['digit', 10]])

The distribution of the observed state conditional on the hidden state

The missing ingredient is the distribution of the observed state conditional on the hidden state. This distribution no longer decomposes across chains. It is specified by repeated calls to SetObservedGivenHiddenSubmatrix, with signature SetObservedGivenHiddenSubmatrix(self, n_step, obs_given_hidden). Here, n_step is the step for this distribution (that is, the model supports nonhomogeneous conditional distributions), and obs_given_hidden is a multidimensional array, defined as follows. The first indices correspond to the hidden substates, and the remaining indices correspond to the observed fields. For example, the entry obs_given_hidden[h1, h2, h3, o1, o2, o3] holds the probability Pr(o1, o2, o3 | h1, h2, h3). A full construction can be, for example:

def SetObservedGivenHidden(self):
    obs_given_hidden = np.ones(self.hidden_indices.field_sizes + self.observed_indices.field_sizes)

    random_state = np.random.RandomState(0)
    for st in self.all_hidden_states:
        R = random_state.rand(*self.observed_indices.field_sizes)
        R /= R.sum()
        obs_given_hidden[tuple(st) + (Ellipsis,)] = R

    for n_step in range(self.n_steps):
        self.SetObservedGivenHiddenSubmatrix(n_step, obs_given_hidden)

This example creates a random distribution matrix and uses the same matrix at every step, but it is a good template for your own code.
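The normalization requirement (for every hidden state, the probabilities over all observed states must sum to 1) can be checked with a self-contained numpy sketch that mimics the construction above, using itertools.product in place of self.all_hidden_states; the sizes are the assumed example sizes:

```python
import itertools
import numpy as np

hidden_sizes = [2, 3, 4]     # sizes of bit, trit, quit (assumed example)
observed_sizes = [2, 2, 10]  # sizes of bit_a, bit_b, digit

obs_given_hidden = np.ones(hidden_sizes + observed_sizes)
random_state = np.random.RandomState(0)
for st in itertools.product(*[range(k) for k in hidden_sizes]):
    R = random_state.rand(*observed_sizes)
    R /= R.sum()  # normalize to a proper distribution over observations
    obs_given_hidden[st + (Ellipsis,)] = R

# For each hidden state, probabilities over all observed states sum to 1.
obs_axes = tuple(range(len(hidden_sizes), obs_given_hidden.ndim))
assert np.allclose(obs_given_hidden.sum(axis=obs_axes), 1.0)
```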

Additional methods to implement for general observations

For general (not necessarily discrete) observations, use the FactorialHMMGeneralObserved class. You need to implement the following.

Observed field(s)

Define the observed fields in addition to the hidden fields, e.g.:

class GaussianFactorialHMM(FactorialHMMGeneralObserved):
    hidden_indices = I = Indices([['bit_a', 2], ['bit_b', 2]])
    observed_indices = J = ['x1','x2']

In the following, we focus on an example where the emission distributions are Normal. For a hidden state (h1,h2) (h1=1,2; h2=1,2) and an observed field j (j=1,2), the observation is drawn from a Normal distribution with unit variance and mean mu(h1,h2,j) = r(h1,h2,j) + m, where r(h1,h2,j) is drawn at random during initialization (below), and m is a parameter.

Constructor

If you need to override the constructor to perform additional initialization, it can be done as follows:

def __init__(self, params, n_steps, calculate_on_init=True):
    # Call base class constructor
    super().__init__(params, n_steps, calculate_on_init)
    m = self.params['m']
    # Draw means: one mean for each combination of hidden state and observation
    self.mus = m + np.random.RandomState().rand(*(self.hidden_indices.field_sizes + [self.n_observed_states]))

The distribution of the observed state conditional on the hidden state

Here, we specify the (joint) probability density function (PDF) of the observation(s) given the hidden states. We need to provide a function that, given an observation z, calculates the PDF of z for each possible hidden state. In the example, the two observations are independent, each drawn from a Normal distribution with mean mu(h1,h2,j) (where h1,h2 are the hidden states and j is the index of the observation). Their joint PDF is the product of the Normal PDFs.

# The observed state will be a list of two observations: [x1,x2]
def GetObservedGivenHidden(self, observed_state, n_step):
    a = np.ones(self.hidden_indices.field_sizes)
    for st in self.all_hidden_states:
        a[tuple(st)] = np.prod(scipy.stats.norm(loc=self.mus[tuple(st) + (Ellipsis,)]).pdf(observed_state))
    return a

This returns a multidimensional array a such that a[h1, h2] is the PDF p(x1, x2 | h1, h2). In this example, the emission probabilities are the same for all time steps, but they can be made step-dependent by using the n_step argument.
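The same computation can be sketched standalone with plain numpy (no class context), writing out the unit-variance Normal PDF explicitly; the sizes, the value m = 1.0, and the example observation are assumptions for illustration:

```python
import itertools
import numpy as np

hidden_sizes = [2, 2]  # bit_a, bit_b
n_observed = 2         # x1, x2
m = 1.0                # assumed parameter value
rng = np.random.RandomState(0)
mus = m + rng.rand(*(hidden_sizes + [n_observed]))  # mu(h1, h2, j)

def norm_pdf(x, mu):
    # PDF of a Normal distribution with unit variance
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)

observed_state = np.array([1.2, 0.7])  # an example observation [x1, x2]
a = np.ones(hidden_sizes)
for st in itertools.product(*[range(k) for k in hidden_sizes]):
    # Joint density = product of the two independent Normal PDFs
    a[st] = np.prod(norm_pdf(observed_state, mus[st + (Ellipsis,)]))
# a[h1, h2] now holds the density p(x1, x2 | h1, h2)
```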

Drawing an observation

In principle, once the emission probabilities (the joint PDF of the observations given all possible hidden states) have been specified as above, they could be used to draw observations given the hidden states automatically. However, this is not implemented, and the user must provide a function that draws observations given a hidden state. This is done by implementing DrawObservedGivenHidden(self, hidden_state, n_step, random_state), where hidden_state specifies the hidden state, n_step is the step, and random_state is a np.random.RandomState instance used for the random draws, for reproducibility. The function returns the drawn observed state(s). In our example, for a hidden state (h1,h2), we draw a pair of Normal variables with means mu(h1,h2,1) and mu(h1,h2,2).

def DrawObservedGivenHidden(self, hidden_state, n_step, random_state):
    return scipy.stats.norm(loc=self.mus[tuple(hidden_state) + (Ellipsis,)]).rvs(size=len(self.observed_indices), random_state=random_state)

A class implementing a simple Factorial HMM

To allow users with relatively little experience to get started with the package, we also implemented a class for Factorial HMMs that have the following simplifying assumptions:

  1. The state space is of the same size for all sub-chains.
  2. The transition probabilities are the same for all time steps.
  3. Observations are discrete.

The class is called FullDiscreteFactorialHMM. Its constructor is:

def __init__(self, params, n_steps, calculate_on_init=True):

where:

  • params - this is a dictionary that contains the parameters of the HMM (see below).
  • n_steps and calculate_on_init are as above.

The HMM parameters to be set are as follows:

  • hidden_alphabet_size: The number of states per chain (K)
  • n_hidden_chains: The number of chains (M)
  • observed_alphabet_size: The alphabet size of the observations
  • n_observed_chains: The dimension of the observations
  • initial_hidden_state: The distribution of the initial states
  • transition_matrices: The transition probability matrices
  • obs_given_hidden: The matrices for the emission probabilities

To illustrate this class, consider the following example.

random_state = np.random.RandomState(0)
def MyFullDiscreteFactorialHMM(n_steps):
    K = 2
    M = 4
    D = 3
    params = {
        'hidden_alphabet_size': K,
        'n_hidden_chains': M,
        'observed_alphabet_size': D,
        'n_observed_chains': 1,
    }

    params['initial_hidden_state'] = np.zeros((M, K))   
    params['transition_matrices'] = np.zeros((M, K, K)) 
    params['obs_given_hidden'] = np.zeros([K] * M + [D])

    for i in range(M):
        p1, p2 = random_state.rand(2)
        # An example of a 2x2 general transition matrix
        params['transition_matrices'][i, :, :] =  [[1-p1, p2], [p1, 1-p2]]
        # Uniform initial probabilities        
        params['initial_hidden_state'][i, :] = [1/K] * K
    
    for st in itertools.product(*[range(K)] * M):
        R = random_state.rand(D)
        R /= R.sum()
        # In this example, random emission probabilities
        params['obs_given_hidden'][tuple(st) + (Ellipsis,)] = R

    return FullDiscreteFactorialHMM(params=params, n_steps=n_steps, calculate_on_init=True)