Meeting December 2019 (Programming)
-
Lots of notes now on the wiki!
-
Follow the process outlined in 'implementing a disease module'
- Have a basic test file to run your code
- Develop the model incrementally, adding complexity gradually
- Catch problems early
- Easier for us to help!
-
Master should be merged into your branches soon after PRs are merged
- Notification in the Slack #programming channel
- Prevents complicated conflicts
- Keeps your branch up-to-date
-
Open draft PRs on GitHub
- Can use the collaboration tools but indicates work-in-progress
-
In PyCharm, set the working directory to always be the root 'TLOmodel' directory
- Paths can be relative to this
-
tlo.util.transition_states
- Takes a single DataFrame column of states and a transition probability matrix (also a DataFrame)
- Returns a new column with transitioned states
- Example on the wiki
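A minimal sketch of the idea, not the tlo.util implementation. The function name `transition_states_sketch` and the matrix layout (columns are current states, rows are next states, each column sums to 1) are assumptions made for illustration. The example matrix is deterministic so the result is easy to check by hand.

```python
import numpy as np
import pandas as pd


def transition_states_sketch(states, prob_matrix, rng):
    """Return a new Series of states after one transition step.

    Assumed layout (hypothetical, not taken from tlo.util): each column of
    prob_matrix is a current state, each row is a possible next state, and
    every column sums to 1.
    """
    new_states = states.copy()
    for current_state in prob_matrix.columns:
        mask = states == current_state
        n_in_state = int(mask.sum())
        if n_in_state == 0:
            continue
        # Draw one next state for every individual currently in this state
        new_states.loc[mask] = rng.choice(
            prob_matrix.index.to_numpy(),
            size=n_in_state,
            p=prob_matrix[current_state].to_numpy(),
        )
    return new_states


states = pd.Series(["a", "a", "b", "c"])
# Deterministic example: a -> b, b -> c, c stays c
prob_matrix = pd.DataFrame(
    {"a": [0.0, 1.0, 0.0], "b": [0.0, 0.0, 1.0], "c": [0.0, 0.0, 1.0]},
    index=["a", "b", "c"],
)
rng = np.random.RandomState(0)
new_states = transition_states_sketch(states, prob_matrix, rng)
```

The input column is left untouched and a new column is returned, matching the description above.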
-
tlo.util.nested_to_record
- A flattened dictionary representation of a Dataframe
- e.g.
        First Name  Region   User Name
    1   Nathaniel   Midwest  nzburke
    2   Elisabeth   South    ewfoster
    3   Briana      Midwest  bclancaster
    4   Estella     West     elpotter
    5   Lamont      South    llwoods
  becomes
    {'First Name_1': 'Nathaniel', 'First Name_2': 'Elisabeth',
     'First Name_3': 'Briana', 'First Name_4': 'Estella',
     'First Name_5': 'Lamont',
     'Region_1': 'Midwest', 'Region_2': 'South', 'Region_3': 'Midwest',
     'Region_4': 'West', 'Region_5': 'South',
     'User Name_1': 'nzburke', 'User Name_2': 'ewfoster',
     'User Name_3': 'bclancaster', 'User Name_4': 'elpotter',
     'User Name_5': 'llwoods'}
- Can be used for logging
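The flattening can be sketched in a few lines of pandas. This is an illustration of the `<column>_<index>` key pattern shown above, not the tlo.util code; `flatten_dataframe_sketch` is a hypothetical name.

```python
import pandas as pd


def flatten_dataframe_sketch(df):
    """Flatten a DataFrame into a {'<column>_<index>': value} dictionary."""
    return {
        f"{column}_{index}": value
        for column in df.columns
        for index, value in df[column].items()
    }


df = pd.DataFrame(
    {
        "First Name": ["Nathaniel", "Elisabeth"],
        "Region": ["Midwest", "South"],
        "User Name": ["nzburke", "ewfoster"],
    },
    index=[1, 2],
)
flat = flatten_dataframe_sketch(df)
```

A flat dictionary like this is convenient for logging because every value gets a single, predictable key.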
-
In disease modules (class Xyz(Module)), self.load_parameters_from_dataframe loads parameters from the resource dataframe, updating the class PARAMETERS
- Installation and setup guide
- Still need a Windows version!
- Phase 4 & 5 from the checklist for developing a disease module
- Pre-PR checklist
- We're figuring out the tooling on Windows!
-
Improve logging using a TLO-specific logging module
- Handles setting up the logging of TLO
- One-liners to configure output
- e.g. turn off, save to file etc.
- Deal with strange output that causes problems downstream e.g. nan
- Enforce documentation of log lines
    LOGGING = {
        'population_by_sex': LogLine('Population alive by sex'),
        'cause_of_death': LogLine('Deaths in the last month grouped by cause')
    }
- (TBH: A flag to indicate whether or not this output should be subject to scaling to match whole population size)
- Includes improving the parsing of logs
- Filtering the log lines when parsing
- Using a faster implementation of building dataframes from log lines
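One way the ideas above could fit together, as a rough sketch only. The logging module is still to be designed, so `LogLine`, `format_log_line`, `parse_log_lines` and the JSON line format are all hypothetical, and the logged numbers are made up for the example.

```python
import json
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LogLine:
    """Hypothetical container for the documentation of one log key."""
    description: str


LOGGING = {
    "population_by_sex": LogLine("Population alive by sex"),
    "cause_of_death": LogLine("Deaths in the last month grouped by cause"),
}


def format_log_line(key, data, declared=LOGGING):
    """Return one JSON log line, rejecting keys that were never documented."""
    if key not in declared:
        raise KeyError(f"Log key '{key}' is not declared in LOGGING")
    # Replace NaN with None so the output stays valid JSON for downstream tools
    cleaned = {
        name: None if isinstance(value, float) and math.isnan(value) else value
        for name, value in data.items()
    }
    return json.dumps({"key": key, "data": cleaned})


def parse_log_lines(lines, keys):
    """Parse only the log lines whose key is in `keys`."""
    wanted = set(keys)
    records = (json.loads(line) for line in lines)
    return [record for record in records if record["key"] in wanted]


# Illustrative values only
lines = [
    format_log_line("population_by_sex", {"female": 510, "male": float("nan")}),
    format_log_line("cause_of_death", {"other": 3}),
]
deaths = parse_log_lines(lines, keys=["cause_of_death"])
```

Requiring every key to appear in LOGGING is what enforces documentation here: an undeclared key fails loudly when the line is written rather than surfacing later during parsing.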
-
Performance
- Health System is the bottleneck
- Continuing to profile and refactor
- Over-allocating rows in the population dataframe
- Essential that models only work on is_alive individuals!
- The more frequent an event, the more to worry about the "work" in each call
- We want to add a set of tools to easily profile blocks of code (e.g. using a decorator)
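A profiling decorator of the kind mentioned above might look like the following. It is a sketch using only the standard library; `profile`, `PROFILE_STATS` and `apply_monthly_event` are hypothetical names, and the population is a plain list standing in for the population dataframe.

```python
import functools
import time
from collections import defaultdict

# Accumulated wall-clock time and call count per function name
PROFILE_STATS = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


def profile(func):
    """Record how often `func` is called and how long it takes in total."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            stats = PROFILE_STATS[func.__name__]
            stats["calls"] += 1
            stats["seconds"] += time.perf_counter() - start
    return wrapper


@profile
def apply_monthly_event(population):
    """Stand-in for a frequent event; only touches individuals that are alive."""
    return [person for person in population if person["is_alive"]]


population = [{"is_alive": True}, {"is_alive": False}, {"is_alive": True}]
for _ in range(12):
    alive = apply_monthly_event(population)
```

Keeping both the call count and the total time makes it easy to see when a cheap-looking function dominates simply because its event fires so often.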
-
More robust testing
- CI to run tests on small and large population sizes
- checks on the use of is_alive
-
To ease configuring and running simulations, and processing of output
-
Prepare to run on compute clusters
-
A command-line tool to manage this: tlo
-
Have a collection of templates (or one that can be configured) that describe a "scenario"
- e.g. tlo create-scenario my_test --template basic_scenario --some --other --options
- would create a directory and write a scenario file therein
    # -------------------------------------------------------------
    # Name: my_test
    # Created: 10/12/2019 12:45
    # Template: basic_scenario
    # -------------------------------------------------------------
    import time

    import tlo.logging
    from tlo import Date, Simulation
    from tlo.methods.demography import Demography
    from tlo.methods.contraception import Contraception

    # -------------------------------------------------------------
    # Basic configuration
    # -------------------------------------------------------------
    start_date = Date(2010, 1, 1)
    end_date = Date(2051, 1, 1)
    initial_population_size = 100000
    resourcefilepath = './resources/'

    tlo.logging.configure()
    simulation = Simulation(start_date=start_date)

    # -------------------------------------------------------------
    # Register modules
    # -------------------------------------------------------------
    simulation.register(Demography(resourcefilepath=resourcefilepath))
    simulation.register(Contraception(resourcefilepath=resourcefilepath))

    # Uncomment both import and register lines below
    # from tlo.methods.enhanced_lifestyle import Lifestyle
    # simulation.register(Lifestyle(resourcefilepath=resourcefilepath))

    from tlo.methods.depression import Depression
    simulation.register(Depression(resourcefilepath=resourcefilepath))

    # from tlo.methods.epilepsy import Epilepsy
    # simulation.register(Epilepsy(resourcefilepath=resourcefilepath))

    # -------------------------------------------------------------
    # Override parameters
    # -------------------------------------------------------------
    simulation.override_parameters(
        {
            Demography: {
                'fraction_of_births_male': 0.2
            },
            Depression: {
                'init_rp_ever_depr_per_year_older_f': 0.125,
                'prob_3m_selfharm_depr': lambda rng: rng.rand(),
                'rr_depr_on_antidepr': lambda rng: rng.exponential(0.1)
            },
        }
    )

    # -------------------------------------------------------------
    # Run simulation
    # -------------------------------------------------------------
    simulation.seed_rngs(int(time.time()))
    simulation.make_initial_population(n=initial_population_size)
    simulation.simulate(end_date=end_date)
-
We then create a sample from our scenario
- e.g. tlo create-sample my_test --some --other --options
- Takes the above scenario and samples values where necessary (placed in a sub-directory)
    simulation.override_parameters(
        {
            Demography: {
                'fraction_of_births_male': 0.2
            },
            Depression: {
                'init_rp_ever_depr_per_year_older_f': 0.125,
                'prob_3m_selfharm_depr': 0.5187848579652606,
                'rr_depr_on_antidepr': 0.05841701302920538
            },
        }
    )
- Can create several samples
- e.g. tlo create-sample my_test --count 100 would create 100 samples of the scenario file
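The sampling step could work roughly as follows: any override written as a function of a random number generator is called once, and fixed values pass through unchanged. This is a sketch under assumptions; `sample_parameters` is a hypothetical helper, module names are plain strings here instead of module classes so the example is self-contained, and seeding each sample with its own number is one possible design.

```python
import numpy as np


def sample_parameters(overrides, rng):
    """Replace every callable value with the result of calling it with `rng`."""
    return {
        module: {
            name: value(rng) if callable(value) else value
            for name, value in parameters.items()
        }
        for module, parameters in overrides.items()
    }


scenario_overrides = {
    "Demography": {"fraction_of_births_male": 0.2},
    "Depression": {
        "init_rp_ever_depr_per_year_older_f": 0.125,
        "prob_3m_selfharm_depr": lambda rng: rng.rand(),
        "rr_depr_on_antidepr": lambda rng: rng.exponential(0.1),
    },
}

# One seed per sample, so each sample can be regenerated exactly
samples = [
    sample_parameters(scenario_overrides, np.random.RandomState(seed))
    for seed in range(3)
]
```

Writing the sampled numbers into the sample file, as in the example above, means each run of a sample uses identical parameter values.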
-
Finally we run the sample as many times as we would like
- tlo run-sample my_test --all: runs all the samples
- tlo run-sample my_test --sample 15: runs a specific sample
- tlo run-sample my_test --all --runs 1000: runs all the samples, each 1000 times
-
The resulting set of files might look something like this:
    scenarios
    ├── fixed_antidepr
    ├── my_test
    │   ├── scenario.py
    │   ├── sample_001
    │   │   ├── sample.py
    │   │   ├── run_0001
    │   │   │   ├── output.csv
    │   │   │   ├── output.log
    │   │   │   └── output.pickle
    │   │   ├── run_0002
    │   │   ├── ...
    │   │   └── run_1000
    │   ├── sample_002
    │   ├── ...
    │   └── sample_100
    ├── random_selfharm
    └── some_scenario
-
Could also generate the script required to submit jobs on a compute cluster
- e.g. tlo run-sample my_test --sample 1 --runs 1000 --job-array
- creates a shell script to submit a job array to the cluster
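The command structure proposed in these notes could be prototyped with argparse. The tool does not exist yet, so this is only a sketch: the default values for --count and --runs, the rule that --all and --sample are mutually exclusive, and the helper `run_directories` are assumptions. The directory names follow the layout shown above.

```python
import argparse


def build_parser():
    """Build a parser for the proposed `tlo` sub-commands."""
    parser = argparse.ArgumentParser(prog="tlo")
    subcommands = parser.add_subparsers(dest="command", required=True)

    create_scenario = subcommands.add_parser("create-scenario")
    create_scenario.add_argument("name")
    create_scenario.add_argument("--template")

    create_sample = subcommands.add_parser("create-sample")
    create_sample.add_argument("name")
    create_sample.add_argument("--count", type=int, default=1)

    run_sample = subcommands.add_parser("run-sample")
    run_sample.add_argument("name")
    # Either run every sample or one specific sample, never both
    selection = run_sample.add_mutually_exclusive_group(required=True)
    selection.add_argument("--all", action="store_true")
    selection.add_argument("--sample", type=int)
    run_sample.add_argument("--runs", type=int, default=1)
    run_sample.add_argument("--job-array", action="store_true")
    return parser


def run_directories(sample, runs):
    """List the run directories for one sample, e.g. sample_015/run_0001."""
    return [f"sample_{sample:03d}/run_{run:04d}" for run in range(1, runs + 1)]


args = build_parser().parse_args(
    ["run-sample", "my_test", "--sample", "15", "--runs", "2"]
)
paths = run_directories(args.sample, args.runs)
```

With sub-commands each step (create-scenario, create-sample, run-sample) keeps its own options, and --job-array stays a simple flag that a later implementation could use to switch from running locally to writing a submission script.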