
SSM for SIR-Model #54

Closed
vwiela opened this issue Jun 8, 2022 · 4 comments

vwiela commented Jun 8, 2022

Dear all,

My goal is to do parameter inference for the SIR-model, which I described with the following SDE system:

$$ \begin{cases} dS(t) = -\alpha S(t)I(t)\,dt+\frac{1}{\sqrt{100}}\sqrt{\alpha S(t)I(t)}\,dB_1(t) \\ dI(t) = \left(\alpha S(t)I(t)-\gamma I(t)\right)dt-\frac{1}{\sqrt{100}}\sqrt{\alpha S(t)I(t)}\,dB_1(t)+\frac{1}{\sqrt{100}}\sqrt{\gamma I(t)}\,dB_2(t) \end{cases} $$

For $X=(S,I)^T$ this can be written as

$$ dX(t) = \mu(X(t))dt+\sigma(X(t))dB(t) $$

with

$$ \mu(X)=\left(\begin{array}{c} -\alpha X[1]X[2] \\ \alpha X[1]X[2]-\gamma X[2] \end{array}\right) \text{ and } \sigma(X)=\frac{1}{\sqrt{100}}\left(\begin{array}{cc} \sqrt{\alpha X[1]X[2]} & 0\\ -\sqrt{\alpha X[1]X[2]} & \sqrt{\gamma X[2]} \end{array}\right) $$

The symmetric product of the diffusion coefficient would then be the positive definite matrix

$$ \Sigma(X)=\frac{1}{100}\left(\begin{array}{cc} \alpha X[1]X[2] & -\alpha X[1]X[2]\\ -\alpha X[1]X[2] & \alpha X[1]X[2]+\gamma X[2] \end{array}\right) $$
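
As a quick check, multiplying out $\sigma(X)\sigma(X)^T$ recovers exactly this matrix:

$$ \sigma(X)\sigma(X)^T=\frac{1}{100}\left(\begin{array}{cc} \sqrt{\alpha X[1]X[2]} & 0\\ -\sqrt{\alpha X[1]X[2]} & \sqrt{\gamma X[2]} \end{array}\right)\left(\begin{array}{cc} \sqrt{\alpha X[1]X[2]} & -\sqrt{\alpha X[1]X[2]}\\ 0 & \sqrt{\gamma X[2]} \end{array}\right)=\Sigma(X) $$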

I use a simple Euler-scheme discretisation to make it accessible as an autoregressive process of order 1, so that I can specify it in terms of distributions:

$$ S_t = S_{t-1}-\beta S_{t-1}I_{t-1}\,dt+\frac{1}{\sqrt{N}}\sqrt{\beta S_{t-1}I_{t-1}}\sqrt{dt}\,Z_1 $$

$$ I_t = I_{t-1}+(\beta S_{t-1}I_{t-1}-\gamma I_{t-1})\,dt-\frac{1}{\sqrt{N}}\sqrt{\beta S_{t-1}I_{t-1}}\sqrt{dt}\,Z_1+\frac{1}{\sqrt{N}}\sqrt{\gamma I_{t-1}}\sqrt{dt}\,Z_2 $$

with $Z_1, Z_2\sim\mathcal{N}(0,1)$ independent.
Equivalently, one can write $X=(S,I)^T$ as multivariate normally distributed:

$$
X_t\sim \text{MultiNormal}\left(X_{t-1}+\mu(X_{t-1})\,dt,\ \Sigma(X_{t-1})\,dt\right)
$$

where the second argument is the covariance matrix.
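
For concreteness, here is a minimal NumPy sketch of one such Euler step (`euler_step` is an illustrative helper, not from any library):

import numpy as np

rng = np.random.default_rng(0)

def euler_step(S, I, beta, gamma, dt, N=100):
    # One Euler-Maruyama step of the SIR diffusion above.
    z1, z2 = rng.standard_normal(2)
    b1 = np.sqrt(beta * S * I * dt / N) * z1  # infection noise, shared by S and I
    b2 = np.sqrt(gamma * I * dt / N) * z2     # recovery noise, I only
    S_new = S - beta * S * I * dt + b1
    I_new = I + (beta * S * I - gamma * I) * dt - b1 + b2
    return S_new, I_new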

Since particle MCMC methods would be my favourite choice for inference, I tried to implement this model as an SSM and ended up with:

import numpy as np
from particles import distributions as dists
from particles import state_space_models as ssm

def multi_dist(xp, beta, gamma, dt):  # xp means X_{t-1}; defines the BM parts
    d = {'B1': dists.Normal(loc=0, scale=(1/10)*np.sqrt(beta*xp['S']*xp['I']*dt)),
         'B2': dists.Normal(loc=0, scale=(1/10)*np.sqrt(gamma*xp['I']*dt)),
         'S': dists.Cond(lambda x: dists.Dirac(xp['S'] - beta*xp['S']*xp['I']*dt + x['B1'])),
         'I': dists.Cond(lambda x: dists.Dirac(xp['I'] + (beta*xp['S']*xp['I'] - gamma*xp['I'])*dt - x['B1'] + x['B2']))}
    return dists.StructDist(d)

class SIRModel(ssm.StateSpaceModel):
    default_params = {'beta': 0.08,
                      'gamma': 0.02,
                      'dt': 0.1,
                      'sigma': 0.01}

    def PX0(self):
        return multi_dist({'S': 0.96, 'I': 0.04}, self.beta, self.gamma, self.dt)
    def PX(self, t, xp):
        return multi_dist(xp, self.beta, self.gamma, self.dt)
    def PY(self, t, xp, x):
        return dists.Normal(loc=x['I'], scale=self.sigma)  # whatever

SIR = SIRModel()
results, observations = SIR.simulate(3000)
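
Filtering the simulated data then takes only a few more lines; a minimal sketch using a bootstrap particle filter (the standard route in particles; N=500 particles is an arbitrary choice):

import particles
from particles import state_space_models as ssm

fk = ssm.Bootstrap(ssm=SIR, data=observations)  # bootstrap Feynman-Kac model
pf = particles.SMC(fk=fk, N=500)                # SMC with 500 particles
pf.run()
print(pf.logLt)  # estimate of the log-likelihood of the observations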

This approach worked (at least for the simulation; I did not try the inference yet), but it increases the dimension of the state vector $x$ by the number of transitions in the system.
Since my overall goal is to perform inference for multivariate models with a far larger number of compartments and transitions, I would like to avoid this.

So my question is whether there is a smarter way to implement this multivariate process as a state-space model, based on multivariate normal distributions or something else.

Thanks for any advice
Vincent

nchopin (Owner) commented Jun 9, 2022

Hi,
your implementation looks correct to me. Personally, I would not define distributions for the intermediate quantities $B_1$ and $B_2$ (unless these quantities are of interest to you for some reason). Since both $S(t)$ and $I(t)$ are Gaussian conditional on $S(t-1)$ and $I(t-1)$, why not define the model directly like this?

def multi_dist(xp, beta, gamma, dt):
    d = {'S': dists.Normal(loc=xp['S']-beta*xp['S']*xp['I']*dt,
                            scale=(1/10)*np.sqrt(beta*xp['S']*xp['I']*dt)), 
         'I': dists.Normal(loc=xp['I'] + (beta*xp['S']*xp['I'] - gamma*xp['I']) * dt,
                            scale=(1/10)*np.sqrt(beta*xp['S']*xp['I']*dt)) } 
    return dists.StructDist(d)


# rest unchanged

Then you get particles of dimension two (instead of 4).

As for the multivariate extension, I guess there are two possibilities:

  1. If your model is such that the different components (like your $S(t)$ and $I(t)$) depend on each other, then you have to do as above, i.e. define a state vector $X_t$ with all the necessary components, etc.
  2. If your model is such that you can separate the components into independent groups (but the groups have parameters in common, say), then the likelihood of your data factorises over these groups (again because they are independent), and you may use a distinct particle filter for each factor of the likelihood and use the product of these factors inside a PMCMC algorithm, as in the sketch after this list. This would require extra work; basically you would have to adapt the current implementation of PMMH, but you should get much better performance (for a given CPU budget) in case your model has this specific structure.
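
A rough sketch of option 2 (the names `group_models` and `group_data` are hypothetical placeholders): estimate each likelihood factor with its own bootstrap filter and sum the log-estimates:

import particles
from particles import state_space_models as ssm

def total_loglik(theta, group_models, group_data, Nx=200):
    # The groups are independent, so the full likelihood is the product of
    # the per-group factors, i.e. the log-likelihood is the sum of the logs.
    total = 0.0
    for cls, data in zip(group_models, group_data):
        fk = ssm.Bootstrap(ssm=cls(**theta), data=data)
        pf = particles.SMC(fk=fk, N=Nx)
        pf.run()
        total += pf.logLt  # log-likelihood estimate for this group
    return total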

Hope this helps

ps: why do you have a $\beta$ inside the square root of the scale of $B_1$, btw?

vwiela (Author) commented Jun 9, 2022

Thanks for your answer.
I have several questions, though:

  1. Regarding your proposed code: $S$ and $I$ are drawn independently there, so where does the shared $B_1$ term (the correlation between $S$ and $I$) come in?
  2. There is an MvNormal distribution implemented; would it be possible to use that with a state-dependent covariance matrix?

And a new question (solved):
Above I assumed that the observational data is just the proportion of infected people in the population, i.e. the process plus some additive Gaussian noise due to measurement error.
But how could one deal with multi-dimensional observations at each time point (e.g. we observe both S and I, which are again the states of $x$ plus some additive noise)? I guess the data we pass must be something like an array of structured arrays, such that at each time point `data[t]` is a structured array? Is that correct, and is there a nice way to create such input from a given dataframe, where each column corresponds to the measurements of one compartment and each row to the measurements at one time point?

Edit: a single structured array of length equal to the number of time points, with column names referring to the compartments, is what worked.
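
For instance, starting from a pandas DataFrame `df` with columns 'S' and 'I' (the values below are made up), something like this produces that array:

import pandas as pd

df = pd.DataFrame({'S': [0.95, 0.94, 0.92], 'I': [0.05, 0.06, 0.07]})  # made-up data
data = df.to_records(index=False)  # structured record array, one row per time point
print(data[0]['I'])  # access the 'I' measurement at time point 0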

And regarding the $\beta$ inside the square root: the scale is just the diffusion coefficient, which basically specifies the correlation structure, and the square root appears when deriving the SDE system as a diffusion approximation to the underlying Markov jump process.
(My notation in my original post uses $\alpha$ in the equations and $\beta$ in the implementation, sorry for that.)
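
More explicitly (sketching the standard recipe, with $K=100$ playing the role of the population size): if transition $j$ has rate $a_j(X)$ and stoichiometry vector $s_j$, the diffusion approximation reads

$$ dX(t) = \sum_j s_j\, a_j(X(t))\,dt + \frac{1}{\sqrt{K}}\sum_j s_j \sqrt{a_j(X(t))}\,dB_j(t) $$

For the SIR model, $s_1=(-1,1)^T$ with $a_1(X)=\beta X[1]X[2]$ (infection) and $s_2=(0,-1)^T$ with $a_2(X)=\gamma X[2]$ (recovery), which is where the rate $\beta X[1]X[2]$ under the square root comes from.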

nchopin (Owner) commented Jun 10, 2022

Ah, sorry, my bad. One way to fix my code:

def multi_dist(xp, beta, gamma, dt):  # xp means X_{t-1}
    locS = xp['S'] - beta*xp['S']*xp['I']*dt
    locI = xp['I'] + (beta*xp['S']*xp['I'] - gamma*xp['I'])*dt
    scaleS = (1/10)*np.sqrt(beta*xp['S']*xp['I']*dt)  # infection noise (B1)
    scaleI = (1/10)*np.sqrt(gamma*xp['I']*dt)         # recovery noise (B2)
    d = {'S': dists.Normal(loc=locS, scale=scaleS),
         'I': dists.Cond(lambda x: dists.Normal(loc=locI - x['S'] + locS,
                                                scale=scaleI))}
    return dists.StructDist(d)

where basically I replaced $B_1$ by $S$ minus its expectation.

The current version of MvNormal does not allow for a covariance matrix that varies across the particles. I implemented this for a colleague, but I don't like it so much because there is a loop over $n$, so it may be slow:


import numpy as np
from numpy import random
from scipy import stats
from particles import distributions as dists

class Generalized_MV_Normal(dists.ProbDist):
    """Multivariate normal, with dim >= 2, allowing for a different cov matrix
    for each particle.
    """
    def __init__(self, loc=None, cov=None):
        self.loc = loc    # (N, dim) array of means
        self.cov = cov    # (N, dim, dim) array of covariance matrices
        self.dim = self.cov.shape[-1]

    def rvs(self, size=1):
        x = np.empty((size, self.dim))
        for n in range(size):  # loop over particles: potentially slow
            x[n, :] = random.multivariate_normal(self.loc[n, :],
                                                 self.cov[n, :, :])
        return x

    def logpdf(self, x):
        N = x.shape[0]
        l = np.empty(N)
        for n in range(N):
            l[n] = stats.multivariate_normal.logpdf(x[n], mean=self.loc[n, :],
                                                    cov=self.cov[n, :, :])
        return l
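
As a usage sketch for the SIR step above (the helper `sir_loc_cov` is made up; `xp` is the structured array of previous particles):

import numpy as np

def sir_loc_cov(xp, beta, gamma, dt, K=100.0):
    # Per-particle mean and covariance of one Euler step, i.e.
    # loc = xp + mu(xp)*dt and cov = Sigma(xp)*dt from the original post.
    n = xp.shape[0]
    loc = np.empty((n, 2))
    loc[:, 0] = xp['S'] - beta * xp['S'] * xp['I'] * dt
    loc[:, 1] = xp['I'] + (beta * xp['S'] * xp['I'] - gamma * xp['I']) * dt
    a = beta * xp['S'] * xp['I'] * dt / K  # infection variance term
    b = gamma * xp['I'] * dt / K           # recovery variance term
    cov = np.empty((n, 2, 2))
    cov[:, 0, 0] = a
    cov[:, 0, 1] = -a
    cov[:, 1, 0] = -a
    cov[:, 1, 1] = a + b
    return loc, cov

# loc, cov = sir_loc_cov(xp, beta=0.08, gamma=0.02, dt=0.1)
# dist = Generalized_MV_Normal(loc=loc, cov=cov)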

Since this is the second time someone has asked for something like this, I am going to open an issue and try to think of ways to make the above code more efficient (numba?).

nchopin (Owner) commented Jun 11, 2022

I'm going to close this issue, but feel free to re-open it or open another one in case you have other questions. As for the multivariate Gaussian issue, see #55.

nchopin closed this as completed Jun 11, 2022