0. Meeting
Christian Kaus edited this page Feb 10, 2016 · 67 revisions
- Discuss algorithms
- Evaluate their suitability for our project
- Select an algorithm
- Plan to write pseudocode
- Wiki
- We choose to work with the least squares method
- Two parts of the algorithm:
- SIR modelling with given values for the parameters beta and gamma
- Optimisation
- Quantify the quality of the model fit
- Compare with the target
- Suggest new values for beta and gamma using heuristics
- Iterate until the target is reached
- References: "Going multi-viral: synthedemic modelling of internet-based spreading phenomena", "Parameter Estimation and Uncertainty Quantification for an Epidemic Model" and "On-the-fly Modelling and Prediction of Epidemic Phenomena"
- Questions:
- Errors? Can we assume that there are no errors/anomalies in the data? Yes
- Do we divide the data into parts and fit?
- Discuss pseudo-code
- Everyone should be able to understand one or both algorithms; see the references from the meeting on 15 November 2015
- Write final pseudo-code together
- Discuss technical details
- Libraries (think about what we need)
- Programming languages
- Should we estimate beta, gamma AND S_0? S_0 is often unknown. (This is implemented in several papers, including Danila's master's thesis.)
- scipy.optimize: http://dan.iel.fm/emcee/current/user/line/
- Yena's pseudocode:
    SIR <- function(time, parameters) where
        dS <- -beta*S*I
        dI <- beta*S*I - gamma*I
        dR <- gamma*I
        solve ode
        return (S, I, R, time)
    SSE <- function(parameters, data) where
        sse <- sum((x - p_i.I)^2)    (x: model value of I at the time of data point p_i)
        return sse
    MAIN:
        parameters <- (beta0, gamma0)
        time <- total time in dataset
        for next n points p_1, .. , p_n in dataset do
            sumOfSq <- SSE(parameters, data until p_n)
            if sumOfSq > sumTarget do
                parameters <- optimize(SIR)
        return parameters
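The pseudocode above could be turned into Python roughly as follows. This is a minimal sketch, not the project's implementation: it assumes `scipy.integrate.odeint` for the "solve ode" step and `scipy.optimize.leastsq` for the "optimize" step, fits once on the whole data set instead of looping over batches of points, and uses synthetic data. The helper names (`sir_rhs`, `simulate_infected`, `fit_sir`), the initial conditions and the solver tolerances are illustrative assumptions.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import leastsq


def sir_rhs(y, t, beta, gamma):
    """Right-hand side of the SIR equations as written in the pseudocode."""
    S, I, R = y
    dS = -beta * S * I
    dI = beta * S * I - gamma * I
    dR = gamma * I
    return [dS, dI, dR]


def simulate_infected(params, t, y0):
    """Integrate the SIR model and return only the infected curve I(t)."""
    beta, gamma = params
    # Tight tolerances (an assumption) keep finite-difference Jacobians smooth.
    solution = odeint(sir_rhs, y0, t, args=(beta, gamma), rtol=1e-10, atol=1e-10)
    return solution[:, 1]


def residuals(params, t, observed, y0):
    """Model minus data; leastsq minimises the sum of squares of this vector."""
    return simulate_infected(params, t, y0) - observed


def sse(params, t, observed, y0):
    """Sum of squared errors between modelled and observed infected values."""
    return float(np.sum(residuals(params, t, observed, y0) ** 2))


def fit_sir(t, observed, y0, initial_guess):
    """Estimate (beta, gamma) by least squares, starting from initial_guess."""
    fitted, _ = leastsq(residuals, initial_guess, args=(t, observed, y0))
    return fitted


# Synthetic example: generate data from known parameters, then recover them.
t = np.linspace(0.0, 30.0, 61)
y0 = [0.99, 0.01, 0.0]            # assumed initial fractions (S, I, R)
true_params = (0.5, 0.1)          # hypothetical beta and gamma
observed = simulate_infected(true_params, t, y0)

fitted = fit_sir(t, observed, y0, initial_guess=(0.4, 0.15))
print("fitted beta, gamma:", fitted, "SSE:", sse(fitted, t, observed, y0))
```

With real data, `observed` would be the infected counts from the data set and `t` the matching time points, starting at the time for which `y0` is given.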
- We start our work on the implementation based on the pseudocode above.
- We have decided on Python because it has several libraries that we can use:
- numpy: data types
- scipy.optimize: package for optimising functions, e.g. leastsq, minimize
- pyqtgraph: library for plotting graphs (alternative: matplotlib)
- scipy.integrate.odeint: ODE solver in Python
- All team members implement the least squares method by Tuesday, 24 Nov.
- When done, consider other optimisation methods such as MCMC, ML, etc.
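The question raised above, whether to estimate S_0 together with beta and gamma, can be explored by treating S_0 as a third free parameter. The sketch below is one possible way to do this, assuming `scipy.optimize.least_squares` with simple box bounds, a known initial infected fraction, R(0) = 0, and noise-free synthetic data. All numbers and the names `infected_curve` and `residuals` are hypothetical.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import least_squares


def sir_rhs(y, t, beta, gamma):
    """SIR right-hand side in the same form as the pseudocode."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]


def infected_curve(theta, t, i0):
    """I(t) for theta = (beta, gamma, S_0); I(0) = i0 is assumed known."""
    beta, gamma, s0 = theta
    y0 = [s0, i0, 0.0]
    solution = odeint(sir_rhs, y0, t, args=(beta, gamma), rtol=1e-10, atol=1e-10)
    return solution[:, 1]


def residuals(theta, t, observed, i0):
    """Difference between modelled and observed infected values."""
    return infected_curve(theta, t, i0) - observed


# Synthetic data from hypothetical "true" values of beta, gamma and S_0.
t = np.linspace(0.0, 40.0, 81)
i0 = 0.01
true_theta = np.array([0.6, 0.1, 0.9])
observed = infected_curve(true_theta, t, i0)

# S_0 is fitted like any other parameter; the bounds keep all three plausible.
result = least_squares(
    residuals,
    x0=[0.5, 0.15, 0.8],
    bounds=([0.0, 0.0, 0.0], [5.0, 5.0, 1.0]),
    args=(t, observed, i0),
)
beta_hat, gamma_hat, s0_hat = result.x
print("beta, gamma, S_0:", beta_hat, gamma_hat, s0_hat)
```

On noisy or short time series the three parameters can be strongly correlated, so the recovered S_0 should be read with care.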
- Discuss possible algorithms for our SIR model
- plan intermediate presentation
- discuss different code snippets / example implementations
- we decide to implement the following algorithms
- Least squares method (by Nathan Lemoine)
- MCMC method
- intermediate presentation
- select model and read example data
- compute and fit model depending on input data
- plot fitted model + data
- Prepare presentation slides
- Merge to the master
- Need more data sets?
- Australia: Department of Health, National Notifiable Diseases Surveillance System (selected single disease looks good)
- Singapore: Ministry of Health, Weekly Infectious Disease Bulletin (in pdf format?!)
- Pacific Islands: Routine Surveillance (pdf again..)
- WHO Ebola data (many formats including csv)
- Flu data in Europe: Flu News Europe (data can be exported)
- WHO Influenza situation update: list
- Japan: Infectious Diseases (downloadable csv's)
- Presentation slides are on Google Slides; please add your content by Tuesday (8th Dec).
Simulation:
The simulation should answer the following questions: 0. Write an abstract simulator (Christian)
- Which initial values for beta and gamma are suitable? (Yena)
- What influence does the SSE have on the fit? (Christian)
- Which method (Nelder-Mead, Powell, ...) is suitable? (Albert)
- Assumption: the more data sets, the better the fit? (Christian)
- What difference does the weight of each data point make to the fit? (Yena)
- Do we need bounds for the parameters? Which bounds? (later)
- Can we introduce cross-validation? (Albert)
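One way to approach the question about the optimisation method is to run several `scipy.optimize.minimize` methods on the same SSE objective and compare the results. The following is only a sketch under stated assumptions: noise-free synthetic data, a hypothetical starting point, and a crude penalty for negative rates in place of real parameter bounds.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import minimize


def sir_rhs(y, t, beta, gamma):
    """SIR right-hand side in the same form as the pseudocode."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]


def infected_curve(params, t, y0):
    """Integrate the model and return the infected curve I(t)."""
    beta, gamma = params
    return odeint(sir_rhs, y0, t, args=(beta, gamma))[:, 1]


def sse(params, t, observed, y0):
    """Sum of squared errors; negative rates get a large constant penalty."""
    if min(params) < 0.0:
        return 1e6
    return float(np.sum((infected_curve(params, t, y0) - observed) ** 2))


# Synthetic data from hypothetical parameters beta = 0.5, gamma = 0.1.
t = np.linspace(0.0, 30.0, 31)
y0 = [0.99, 0.01, 0.0]
observed = infected_curve((0.5, 0.1), t, y0)

start = np.array([0.4, 0.15])     # hypothetical initial guess
start_sse = sse(start, t, observed, y0)

results = {}
for method in ("Nelder-Mead", "Powell"):
    res = minimize(sse, start, args=(t, observed, y0), method=method)
    results[method] = {"params": res.x, "sse": float(res.fun), "evaluations": res.nfev}

for method, r in results.items():
    print(method, r["params"], r["sse"], r["evaluations"])
```

The number of function evaluations gives a rough cost comparison. Repeating the loop for several starting points would also address the question about suitable initial values for beta and gamma.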
- Each team member presents their results
- Discuss how we can integrate the current improvements in the existing code
- We got an answer on Stack Overflow about our problem with fitting SIR models: the causes were incorrect SIR model equations (refer to the answer) and incorrect use of the parameter 'k'
- Our solution now uses scipy.optimize.curve_fit instead of scipy.optimize.minimize (note that curve_fit is built on SciPy's least-squares routines, leastsq or least_squares, rather than on minimize)
- Do some tasks, commit your changes and post the progress in the issues
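For illustration, a fit with `scipy.optimize.curve_fit` might look like the sketch below. It is not the project's code: the model function uses the SIR equations in the form of the pseudocode, the initial state `Y0`, the noise level and the starting values are assumptions, and the data are synthetic.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

Y0 = [0.99, 0.01, 0.0]  # assumed initial fractions (S, I, R)


def sir_rhs(y, t, beta, gamma):
    """SIR right-hand side in the same form as the pseudocode."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]


def infected_curve(t, beta, gamma):
    """Model function in the form curve_fit expects: f(xdata, *params)."""
    solution = odeint(sir_rhs, Y0, t, args=(beta, gamma), rtol=1e-10, atol=1e-10)
    return solution[:, 1]


# Synthetic observations: model output plus a little Gaussian noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 30.0, 61)
observed = infected_curve(t, 0.5, 0.1) + rng.normal(0.0, 0.005, size=t.size)

# p0 is the initial guess for (beta, gamma).
popt, pcov = curve_fit(infected_curve, t, observed, p0=(0.4, 0.15))
beta_hat, gamma_hat = popt
stderr = np.sqrt(np.diag(pcov))   # rough one-sigma uncertainties
print("beta = %.3f +/- %.3f" % (beta_hat, stderr[0]))
print("gamma = %.3f +/- %.3f" % (gamma_hat, stderr[1]))
```

Compared with calling `minimize` on a hand-written SSE, `curve_fit` takes the model function directly and also returns a covariance estimate for the fitted parameters.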
- Each team member presents their results
- Yena: Data sets often contain no information on S and I. Is it possible to fit using only the data on new cases? (The paper "Mathematical Modelling, Simulation, and Optimal Control of the 2014 Ebola Outbreak in West Africa" by A. Rachah and D. F. M. Torres may be useful.)
- -> Just use new cases as Infected. Cite the paper to justify this (see Section 3).
- We can now install/remove EpiPy on Unix-like and Windows operating systems
- The normalize function is wrong, but we actually do not care
- user should select start point of epidemic on GUI (needed for multi epidemic)
- We have decided on the standard date format of data sets: YYYY-MM-DD e.g. 2016-01-10
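Dates in the agreed YYYY-MM-DD format can be parsed with the standard library and converted into the numeric time axis that the fitting code needs. This is a small sketch; the function names `parse_date` and `days_since_start` are hypothetical and not taken from EpiPy.

```python
from datetime import date, datetime

DATE_FORMAT = "%Y-%m-%d"  # YYYY-MM-DD, e.g. 2016-01-10


def parse_date(text):
    """Parse one YYYY-MM-DD string; raises ValueError for other formats."""
    return datetime.strptime(text, DATE_FORMAT).date()


def days_since_start(date_strings):
    """Convert date strings into day offsets relative to the first entry."""
    dates = [parse_date(s) for s in date_strings]
    start = dates[0]
    return [(d - start).days for d in dates]


print(parse_date("2016-01-10"))
print(days_since_start(["2016-01-10", "2016-01-17", "2016-02-01"]))
```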
- Discuss algorithm for fitting of multiple sub-epidemics of one
- Two approaches:
- Sum of graphs -> applicable only in the case of independent epidemics
- Each trough is a new start -> have to consider recovered, susceptible (no. of recovered is not zero)
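The first approach, summing the curves of independent sub-epidemics, could be sketched as follows. Each sub-epidemic is simulated with its own parameters from its own start time, and the infected curves are added. The parameter values, start times and the names `infected_from` and `summed_infected` are assumptions for illustration only. The second approach is not covered, since it would need the recovered and susceptible counts carried over from one sub-epidemic to the next.

```python
import numpy as np
from scipy.integrate import odeint


def sir_rhs(y, t, beta, gamma):
    """SIR right-hand side in the same form as the pseudocode."""
    S, I, R = y
    return [-beta * S * I, beta * S * I - gamma * I, gamma * I]


def infected_from(start, beta, gamma, y0, t):
    """I(t) of one sub-epidemic that begins at time `start` (zero before)."""
    infected = np.zeros_like(t)
    active = t >= start
    if active.any():
        # odeint needs the initial time first, so prepend 0 and drop it again.
        local_t = np.concatenate(([0.0], t[active] - start))
        solution = odeint(sir_rhs, y0, local_t, args=(beta, gamma))
        infected[active] = solution[1:, 1]
    return infected


def summed_infected(sub_epidemics, t):
    """Total infected curve as the sum of independent sub-epidemics."""
    return sum(infected_from(t=t, **sub) for sub in sub_epidemics)


t = np.linspace(0.0, 60.0, 121)
sub_epidemics = [
    {"start": 0.0, "beta": 0.5, "gamma": 0.1, "y0": [0.99, 0.01, 0.0]},
    {"start": 25.0, "beta": 0.4, "gamma": 0.15, "y0": [0.995, 0.005, 0.0]},
]
total = summed_infected(sub_epidemics, t)
first = infected_from(t=t, **sub_epidemics[0])
print("peak of summed curve:", total.max())
```

For fitting, the start times could come from the start points the user selects in the GUI, with beta and gamma of each sub-epidemic as free parameters.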
- code formatting
- find unused functions and variables
- extend GUI (reading CSV file)
- start code documentation (class diagram with dia, writing comments)
- improve event handling
- define LaTeX / LibreOffice template for final report
- prepare final presentation
- prepare project for going public
- prepare and generate code documentation with Sphinx
- code review
- finish wiki
- close all tasks