Constructing a model
This page details how to create your own models.
The first step is to define a subclass for the desired factorial HMM. Begin by setting a member variable describing the names and sizes of the hidden Markov chains, as in this example:
```python
class MyFactorialHMM(FactorialHMMParentClass):
    hidden_indices = I = Indices([['bit', 2], ['trit', 3], ['quit', 4]])
```
The parent class `FactorialHMMParentClass` must be either `FactorialHMMDiscreteObserved` or `FactorialHMMGeneralObserved`, depending on whether the observations are discrete or general (e.g., continuous), respectively. Most of the methods to implement are common to both types of HMMs, but some are type-specific (see below). The variable `hidden_indices` should contain the names and sizes of the Markov chains. In the above example, we defined a factorial HMM with three Markov chains, with 2, 3 and 4 states, respectively. The `Indices` class assigns sequential indices to the fields in order of appearance, so that later in the code they can be referred to as, e.g., `self.I.trit`.
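For instance, a minimal sketch of how the named indices can be referred to later in the code (the concrete values in the comments are assumptions, based on the sequential assignment just described):

```python
# Inside a method of MyFactorialHMM:
I = self.I
field = I.trit                           # sequential index of the 'trit' chain (presumably 1, if assignment is 0-based)
sizes = self.hidden_indices.field_sizes  # sizes of all hidden chains, here [2, 3, 4]
```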
The signature of the constructor is:
```python
def __init__(self, params, n_steps, calculate_on_init=True):
```
where:

- `params` - an object saved as `self.params`, but not used internally. This is convenient when implementing your own methods.
- `n_steps` - the number of steps (e.g., the number of observations).
- `calculate_on_init` - set to `False` if you want to perform the model building explicitly yourself. (You probably don't.)
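For example, a minimal (hypothetical) instantiation; here `params` is a plain dictionary, whose entries are consumed by the example methods shown below:

```python
# 'p1' and 'p2' are the parameters used by the example methods below
my_hmm = MyFactorialHMM(params={'p1': 0.1, 'p2': 0.3}, n_steps=100)
```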
The following methods must be implemented for both discrete and general observations:
The first method's signature is `SetTransitionMatricesTensor(self)`.
This method calculates and saves the transition matrices of all the hidden Markov chains, per step. This is done by repeated calls to `SetTransitionSubmatrix(self, n_step, field, submatrix)`, where the matrix `submatrix` is set as the transition between steps `n_step` and `n_step+1` of the Markov chain corresponding to `field`. It is highly recommended to use the named value for the field, e.g., `I.bit` (see above). Note that in `submatrix`, element (i,j) (i.e., row i, column j) is the transition probability from state j to state i, so the sum of each column must be 1.
An example:
```python
def SetTransitionMatricesTensor(self):
    I = self.I
    p1 = self.params['p1']
    p2 = self.params['p2']

    for n in range(self.n_steps - 1):
        self.SetTransitionSubmatrix(n, I.bit, [[1-p1, p1], [p1, 1-p1]])
        self.SetTransitionSubmatrix(n, I.trit, [[1-p1, p1*(1-p2), 1/3],
                                                [p1*p2, 1-p1, 1/3],
                                                [p1*(1-p2), p1*p2, 1/3]])
        self.SetTransitionSubmatrix(n, I.quit, np.identity(4))
```
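As a quick sanity check (not part of the required interface), you can verify that a submatrix is column-stochastic before passing it in; here with hypothetical values for `p1` and `p2`:

```python
import numpy as np

p1, p2 = 0.1, 0.3  # hypothetical parameter values
trit_submatrix = np.array([[1-p1, p1*(1-p2), 1/3],
                           [p1*p2, 1-p1, 1/3],
                           [p1*(1-p2), p1*p2, 1/3]])

# Column j holds the outgoing transition distribution of state j, so each column must sum to 1
assert np.allclose(trit_submatrix.sum(axis=0), 1.0)
```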
The next method, `SetInitialHiddenState(self)`, sets the probability distribution of the initial state of each Markov chain. This is done by repeated calls to `SetInitialHiddenSubstate(self, field, substate)`, which sets the initial distribution of `field` to the vector `substate`. For example:
```python
def SetInitialHiddenState(self):
    I = self.I
    self.SetInitialHiddenSubstate(I.bit, [0.8, 0.2])
    self.SetInitialHiddenSubstate(I.trit, [0.2, 0.4, 0.4])
    self.SetInitialHiddenSubstate(I.quit, [0.25, 0.25, 0.25, 0.25])
```
For discrete observations, use the `FactorialHMMDiscreteObserved` class. You need to implement the following. The model supports an observation composed of several discrete-state fields, each of a possibly different size. These are initialized similarly to the hidden states (preferably after them), e.g.:
```python
class MyFactorialHMM(FactorialHMMDiscreteObserved):
    hidden_indices = I = Indices([['bit', 2], ['trit', 3], ['quit', 4]])
    observed_indices = J = Indices([['bit_a', 2], ['bit_b', 2], ['digit', 10]])
```
The missing ingredient is specifying the distribution of the observed state conditional on the hidden state. This no longer decomposes across chains. It is specified by repeated calls to `SetObservedGivenHiddenSubmatrix(self, n_step, obs_given_hidden)`. Here, `n_step` is the step for this distribution (that is, the model supports nonhomogeneous conditional distributions), and `obs_given_hidden` is a multidimensional array, defined as follows: the first indices correspond to the hidden substates, and the remaining indices correspond to the observations. For example, the entry `obs_given_hidden[h1, h2, h3, o1, o2, o3]` is the probability Pr(o1, o2, o3 | h1, h2, h3). The full construction, for example, can be:
```python
def SetObservedGivenHidden(self):
    # One table of conditional observation probabilities per combination of hidden substates
    obs_given_hidden = np.ones(self.hidden_indices.field_sizes + self.observed_indices.field_sizes)

    random_state = np.random.RandomState(0)
    for st in self.all_hidden_states:
        R = random_state.rand(*self.observed_indices.field_sizes)
        R /= R.sum()
        # tuple(st) selects the block of observation probabilities for this hidden state
        obs_given_hidden[tuple(st)] = R

    # Use the same conditional distribution at every step
    for n_step in range(self.n_steps):
        self.SetObservedGivenHiddenSubmatrix(n_step, obs_given_hidden)
```
This creates a random distribution matrix and uses the same matrix at every step, but it is a good template for your own code.
For general (not necessarily discrete) observations, use the `FactorialHMMGeneralObserved` class. You need to implement the following. Define the observed fields in addition to the hidden fields, e.g.:
```python
class GaussianFactorialHMM(FactorialHMMGeneralObserved):
    hidden_indices = I = Indices([['bit_a', 2], ['bit_b', 2]])
    observed_indices = J = ['x1', 'x2']
```
In the following, we focus on an example where the emission probabilities are Normal. For a hidden state (h1, h2) (h1 = 1, 2; h2 = 1, 2) and an observed field j (j = 1, 2), the observation is drawn from a Normal distribution with unit variance and mean mu(h1, h2, j) = r(h1, h2, j) + m, where r(h1, h2, j) is drawn at random during initialization (below) and m is a parameter.
If you need to override the constructor to perform some initialization, it can be done as follows:
```python
def __init__(self, params, n_steps, calculate_on_init=True):
    # Call base class constructor
    super().__init__(params, n_steps, calculate_on_init)

    m = self.params['m']

    # Draw means: one mean for each combination of hidden state and observed field
    # (the RandomState here is unseeded; pass a seed for reproducibility)
    self.mus = m + np.random.RandomState().rand(*(self.hidden_indices.field_sizes + [self.n_observed_states]))
```
Here, we specify the (joint) probability density function (PDF) of the observation(s) given the hidden states. We need to provide a function that, given an observation z, calculates the PDF of z for each possible hidden state. In the example, the two observations are independent, each drawn from a Normal distribution with mean mu(h1, h2, j) (where h1, h2 are the hidden states and j is the index of the observed field); their joint PDF is the product of the two Normal PDFs.
```python
# The observed state is a list of two observations: [x1, x2]
def GetObservedGivenHidden(self, observed_state, n_step):
    a = np.ones(self.hidden_indices.field_sizes)
    for st in self.all_hidden_states:
        # Product of the independent Normal PDFs, one per observed field
        a[tuple(st)] = np.prod(scipy.stats.norm(loc=self.mus[tuple(st)]).pdf(observed_state))
    return a
```
This returns a multidimensional array `a` such that `a[h1, h2]` is the PDF P(x1, x2 | h1, h2). In this example, the emission probabilities are the same for all time steps, but they can be made time-dependent using the `n_step` argument.
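For instance, a minimal sketch of a time-dependent variant, assuming a hypothetical array `self.mus_per_step` of shape `[n_steps] + hidden field sizes + [number of observed fields]`, prepared in the constructor:

```python
def GetObservedGivenHidden(self, observed_state, n_step):
    a = np.ones(self.hidden_indices.field_sizes)
    for st in self.all_hidden_states:
        # The means now depend on the time step as well as on the hidden state
        a[tuple(st)] = np.prod(scipy.stats.norm(loc=self.mus_per_step[n_step][tuple(st)]).pdf(observed_state))
    return a
```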
In principle, once the emission probabilities (the joint PDF of the observations given all possible hidden states) have been specified as above, the PDF could be used automatically to draw observations given the hidden states. However, this is not implemented, and the user must provide a function that draws observations given a hidden state. This is done by implementing `DrawObservedGivenHidden(self, hidden_state, n_step, random_state)`, where `hidden_state` specifies the hidden state and `random_state` is a `np.random.RandomState` instance to be used for random draws, for reproducibility. The function returns the drawn observed state(s). In our example, for a hidden state (h1, h2), we draw a pair of Normal variables with means mu(h1, h2, 1) and mu(h1, h2, 2):
```python
def DrawObservedGivenHidden(self, hidden_state, n_step, random_state):
    # One Normal draw per observed field, with means matching this hidden state
    return scipy.stats.norm(loc=self.mus[tuple(hidden_state)]).rvs(size=len(self.observed_indices), random_state=random_state)
```
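A minimal instantiation sketch for this example (the value of `m` is hypothetical):

```python
# 'm' is the mean offset used by the constructor override above
model = GaussianFactorialHMM(params={'m': 1.0}, n_steps=50)
```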
To allow users with relatively little experience to get started with the package, we also implemented a class for factorial HMMs satisfying the following simplifying assumptions:
- The state space is of the same size for all sub-chains.
- The transition probabilities are the same for all time steps.
- Observations are discrete.
The class is called `FullDiscreteFactorialHMM`. Its constructor is:

```python
def __init__(self, params, n_steps, calculate_on_init=True):
```
where:

- `params` - a dictionary that contains the parameters of the HMM (see below).
- `n_steps` and `calculate_on_init` - as above.
The HMM parameters to be set are as follows:

- `hidden_alphabet_size`: The number of states per chain (K)
- `n_hidden_chains`: The number of chains (M)
- `observed_alphabet_size`: The alphabet size of the observations (D)
- `n_observed_chains`: The dimension of the observations
- `initial_hidden_state`: The distribution of the initial states
- `transition_matrices`: The transition probability matrices
- `obs_given_hidden`: The matrices of the emission probabilities
To illustrate this class, consider the following example.
```python
import itertools

import numpy as np

# FullDiscreteFactorialHMM is provided by the FactorialHMM package

random_state = np.random.RandomState(0)

def MyFullDiscreteFactorialHMM(n_steps):
    K = 2  # states per chain
    M = 4  # number of hidden chains
    D = 3  # observed alphabet size

    params = {
        'hidden_alphabet_size': K,
        'n_hidden_chains': M,
        'observed_alphabet_size': D,
        'n_observed_chains': 1,
    }

    params['initial_hidden_state'] = np.zeros((M, K))
    params['transition_matrices'] = np.zeros((M, K, K))
    params['obs_given_hidden'] = np.zeros([K] * M + [D])

    for i in range(M):
        p1, p2 = random_state.rand(2)
        # An example of a 2x2 general transition matrix
        params['transition_matrices'][i, :, :] = [[1-p1, p2], [p1, 1-p2]]
        # Uniform initial probabilities
        params['initial_hidden_state'][i, :] = [1/K] * K

    for st in itertools.product(*[range(K)] * M):
        R = random_state.rand(D)
        R /= R.sum()
        # In this example, random emission probabilities
        params['obs_given_hidden'][st] = R

    return FullDiscreteFactorialHMM(params=params, n_steps=n_steps, calculate_on_init=True)
```
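Finally, a minimal usage sketch, building the model via the helper above:

```python
# Build a full discrete factorial HMM with 100 time steps
hmm = MyFullDiscreteFactorialHMM(n_steps=100)
```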