Improving performance: profiling `scale_run` simulations #286

matt-graham · 2021-05-21T16:39:39Z

The current version of src/scripts/profiling/scale_run.py is running very slowly. After reducing the simulation time to 1 month it still took around 2.5 hours to complete a run on my laptop, so if this simulation rate remained the same running for the full 20 years here would take around ~25 days!

From some trial-and-error the main causes of the slowdown compared to the previous script (which for comparison took around 30 minutes to do a 2 year simulation with the same population size, but different set of modules / configuration) seems to be both of setting spurious_symptoms=True in the initializer for the SymptomManager module

TLOmodel/src/scripts/profiling/scale_run.py

Line 66 in 63f58f0

    
           symptommanager.SymptomManager(resourcefilepath=resourcefilepath, spurious_symptoms=True),

and setting mode_appt_constraints=2 and capabilities_coefficient=0.01 in the initializer for the HealthSystem module,

TLOmodel/src/scripts/profiling/scale_run.py

Lines 71 to 74 in 63f58f0

    
           healthsystem.HealthSystem(resourcefilepath=resourcefilepath, 
        
                                     mode_appt_constraints=2, 
        
                                     capabilities_coefficient=0.01 
        
                                     ),

with each individually causing a large slow down. The additional disease modules seem to be only adding a small overhead in comparison.

A SnakeViz plot of running for 1 month simulation time with the original configuration is below

Just under 80% of the run time is being spent in get_appt_footprint_as_time_request which is called 1351075 times.

In comparison running for 1 month simulation time with spurious_symptoms=False in SymptomManager and mode_appt_constraints=0, capabilities_coefficient=1 in HealthSystem gives the following SnakeViz plot

In this case the percentage of time spent in get_appt_footprint_as_time_request is much reduced though still significant (27%) and the number of calls much lower (24344, ~2% of 1351075). The overall runtime is also around 6% of previously.

The text was updated successfully, but these errors were encountered:

matt-graham · 2021-06-08T07:58:17Z

After the changes made in #287 a corresponding profiling run as described above gives the following SnakeViz output

which is significantly improved but there are still some bottlenecks so I will keep this issue open for now to keep track of progress.

matt-graham · 2021-06-09T11:59:01Z

After the changes made in #303 a profiling run as described above gives the following SnakeViz output

Next 🎯: get_squeeze_factors

matt-graham · 2021-06-11T14:07:38Z

After the changes made in #308, a profiling run as described above gives the following SnakeViz output

As running for longer simulation times is now tractable without being overly onerous, going forwards I'll use a 6 month simulation time run of scale_run.py as the baseline to give a better reflection of the end goal of doing a full 20 year run. I'll also switch to using pyinstrument rather than cProfile / SnakeViz as it seems to give less biased figures due to lower overhead, is more directly interpretable as it directly estimates the time spent at each level of the call tree and also is easy to include the text output from in PRs vs screenshotting SnakeViz!

After the changes made in #308, a pyinstrument profiling output for a 6 month run of scale_run.py is as below

pyinstrument profiling output for 6 month run using 5fcd391

4061.675 <module>  ../<string>:1
   [4 frames hidden]  .., runpy
      4061.675 _run_code  runpy.py:64
      └─ 4061.675 <module>  scale_run_updated.py:1
         └─ 4047.957 simulate  tlo/simulation.py:194
            ├─ 4000.051 fire_single_event  tlo/simulation.py:262
            │  └─ 4000.046 run  tlo/events.py:32
            │     ├─ 2501.234 apply  tlo/methods/healthsystem.py:1329
            │     │  ├─ 1200.651 <listcomp>  tlo/methods/healthsystem.py:1407
            │     │  │  └─ 1173.925 get_appt_footprint_as_time_request  tlo/methods/healthsystem.py:738
            │     │  │     ├─ 414.627 <listcomp>  tlo/methods/healthsystem.py:771
            │     │  │     ├─ 264.343 get_facility_info  tlo/methods/healthsystem.py:727
            │     │  │     │  ├─ 208.279 __getitem__  pandas/core/indexing.py:2148
            │     │  │     │  │     [34 frames hidden]  pandas, ..
            │     │  │     │  └─ 43.083 [self]
            │     │  │     ├─ 217.513 [self]
            │     │  │     ├─ 124.905 __init__  collections/__init__.py:540
            │     │  │     │     [4 frames hidden]  collections
            │     │  │     └─ 111.355 __getitem__  pandas/core/frame.py:2987
            │     │  │           [22 frames hidden]  pandas, ..
            │     │  ├─ 392.668 __iadd__  collections/__init__.py:807
            │     │  │     [12 frames hidden]  collections, ..
            │     │  ├─ 227.622 heappop  ../<built-in>:0
            │     │  │     [2 frames hidden]  ..
            │     │  ├─ 187.204 __getitem__  pandas/core/indexing.py:2148
            │     │  │     [34 frames hidden]  pandas, ..
            │     │  ├─ 178.306 log_hsi_event  tlo/methods/healthsystem.py:1013
            │     │  │  ├─ 78.200 <dictcomp>  tlo/methods/healthsystem.py:1039
            │     │  │  ├─ 51.226 [self]
            │     │  │  └─ 41.073 info  tlo/logging/core.py:208
            │     │  ├─ 170.586 [self]
            │     │  ├─ 51.558 get_squeeze_factors  tlo/methods/healthsystem.py:791
            │     │  │  └─ 49.889 <listcomp>  tlo/methods/healthsystem.py:830
            │     │  └─ 41.507 did_not_run  tlo/methods/hsi_generic_first_appts.py:335
            │     ├─ 1166.280 apply  tlo/methods/healthseekingbehaviour.py:193
            │     │  ├─ 836.514 schedule_hsi_event  tlo/methods/healthsystem.py:385
            │     │  │  ├─ 494.516 <listcomp>  tlo/methods/healthsystem.py:500
            │     │  │  │  ├─ 200.845 all  pandas/core/generic.py:10870
            │     │  │  │  │     [1796 frames hidden]  pandas, .., typing, numpy, abc, inspect
            │     │  │  │  ├─ 198.571 __getitem__  pandas/core/indexing.py:882
            │     │  │  │  │     [1030 frames hidden]  pandas, numpy, .., abc, typing, conte...
            │     │  │  │  └─ 49.394 new_method  pandas/core/ops/common.py:50
            │     │  │  │        [458 frames hidden]  pandas, .., numpy, abc
            │     │  │  ├─ 154.988 __getitem__  pandas/core/indexing.py:882
            │     │  │  │     [1606 frames hidden]  pandas, .., numpy, typing, abc, conte...
            │     │  │  └─ 56.658 new_method  pandas/core/ops/common.py:50
            │     │  │        [546 frames hidden]  pandas, .., numpy, abc, warnings
            │     │  ├─ 144.549 have_what  tlo/methods/symptommanager.py:435
            │     │  │  └─ 142.867 apply  pandas/core/frame.py:7622
            │     │  │        [956 frames hidden]  pandas, .., numpy, abc
            │     │  │           142.453 apply_series_generator  pandas/core/apply.py:281
            │     │  │           ├─ 82.435 <lambda>  tlo/methods/symptommanager.py:440
            │     │  │           │  └─ 81.897 <listcomp>  tlo/methods/symptommanager.py:440
            │     │  │           │     └─ 71.790 __getitem__  pandas/core/series.py:837
            │     │  │           │           [38 frames hidden]  pandas, ..
            │     │  └─ 76.369 iterrows  pandas/core/frame.py:1026
            │     │        [375 frames hidden]  pandas, .., abc, numpy
            │     └─ 97.779 apply  tlo/methods/malaria.py:557
            │        └─ 94.663 schedule_hsi_event  tlo/methods/healthsystem.py:385
            │           └─ 56.425 <listcomp>  tlo/methods/healthsystem.py:500
            └─ 46.343 initialise_simulation  tlo/methods/hiv.py:557
               └─ 46.136 get_time_from_infection_to_aids  tlo/methods/hiv.py:862

matt-graham · 2021-07-28T13:07:54Z

Summary of recent progress in reducing scale_run.py run times:

#313 (changing appointment footprint to sparse structure)

pyinstrument profiling output for 6 month run using 5fcd391 (pre-PR), 4060s total

4061.675 <module>  ../<string>:1
   [4 frames hidden]  .., runpy
      4061.675 _run_code  runpy.py:64
      └─ 4061.675 <module>  scale_run_updated.py:1
         └─ 4047.957 simulate  tlo/simulation.py:194
            ├─ 4000.051 fire_single_event  tlo/simulation.py:262
            │  └─ 4000.046 run  tlo/events.py:32
            │     ├─ 2501.234 apply  tlo/methods/healthsystem.py:1329
            │     │  ├─ 1200.651 <listcomp>  tlo/methods/healthsystem.py:1407
            │     │  │  └─ 1173.925 get_appt_footprint_as_time_request  tlo/methods/healthsystem.py:738
            │     │  │     ├─ 414.627 <listcomp>  tlo/methods/healthsystem.py:771
            │     │  │     ├─ 264.343 get_facility_info  tlo/methods/healthsystem.py:727
            │     │  │     │  ├─ 208.279 __getitem__  pandas/core/indexing.py:2148
            │     │  │     │  │     [34 frames hidden]  pandas, ..
            │     │  │     │  └─ 43.083 [self]
            │     │  │     ├─ 217.513 [self]
            │     │  │     ├─ 124.905 __init__  collections/__init__.py:540
            │     │  │     │     [4 frames hidden]  collections
            │     │  │     └─ 111.355 __getitem__  pandas/core/frame.py:2987
            │     │  │           [22 frames hidden]  pandas, ..
            │     │  ├─ 392.668 __iadd__  collections/__init__.py:807
            │     │  │     [12 frames hidden]  collections, ..
            │     │  ├─ 227.622 heappop  ../<built-in>:0
            │     │  │     [2 frames hidden]  ..
            │     │  ├─ 187.204 __getitem__  pandas/core/indexing.py:2148
            │     │  │     [34 frames hidden]  pandas, ..
            │     │  ├─ 178.306 log_hsi_event  tlo/methods/healthsystem.py:1013
            │     │  │  ├─ 78.200 <dictcomp>  tlo/methods/healthsystem.py:1039
            │     │  │  ├─ 51.226 [self]
            │     │  │  └─ 41.073 info  tlo/logging/core.py:208
            │     │  ├─ 170.586 [self]
            │     │  ├─ 51.558 get_squeeze_factors  tlo/methods/healthsystem.py:791
            │     │  │  └─ 49.889 <listcomp>  tlo/methods/healthsystem.py:830
            │     │  └─ 41.507 did_not_run  tlo/methods/hsi_generic_first_appts.py:335
            │     ├─ 1166.280 apply  tlo/methods/healthseekingbehaviour.py:193
            │     │  ├─ 836.514 schedule_hsi_event  tlo/methods/healthsystem.py:385
            │     │  │  ├─ 494.516 <listcomp>  tlo/methods/healthsystem.py:500
            │     │  │  │  ├─ 200.845 all  pandas/core/generic.py:10870
            │     │  │  │  │     [1796 frames hidden]  pandas, .., typing, numpy, abc, inspect
            │     │  │  │  ├─ 198.571 __getitem__  pandas/core/indexing.py:882
            │     │  │  │  │     [1030 frames hidden]  pandas, numpy, .., abc, typing, conte...
            │     │  │  │  └─ 49.394 new_method  pandas/core/ops/common.py:50
            │     │  │  │        [458 frames hidden]  pandas, .., numpy, abc
            │     │  │  ├─ 154.988 __getitem__  pandas/core/indexing.py:882
            │     │  │  │     [1606 frames hidden]  pandas, .., numpy, typing, abc, conte...
            │     │  │  └─ 56.658 new_method  pandas/core/ops/common.py:50
            │     │  │        [546 frames hidden]  pandas, .., numpy, abc, warnings
            │     │  ├─ 144.549 have_what  tlo/methods/symptommanager.py:435
            │     │  │  └─ 142.867 apply  pandas/core/frame.py:7622
            │     │  │        [956 frames hidden]  pandas, .., numpy, abc
            │     │  │           142.453 apply_series_generator  pandas/core/apply.py:281
            │     │  │           ├─ 82.435 <lambda>  tlo/methods/symptommanager.py:440
            │     │  │           │  └─ 81.897 <listcomp>  tlo/methods/symptommanager.py:440
            │     │  │           │     └─ 71.790 __getitem__  pandas/core/series.py:837
            │     │  │           │           [38 frames hidden]  pandas, ..
            │     │  └─ 76.369 iterrows  pandas/core/frame.py:1026
            │     │        [375 frames hidden]  pandas, .., abc, numpy
            │     └─ 97.779 apply  tlo/methods/malaria.py:557
            │        └─ 94.663 schedule_hsi_event  tlo/methods/healthsystem.py:385
            │           └─ 56.425 <listcomp>  tlo/methods/healthsystem.py:500
            └─ 46.343 initialise_simulation  tlo/methods/hiv.py:557
               └─ 46.136 get_time_from_infection_to_aids  tlo/methods/hiv.py:862

pyinstrument profiling output for 6 month run using b290f04 (post-PR), 3120s total

3124.388 <module>  ../<string>:1
   [4 frames hidden]  .., runpy
      3124.388 _run_code  runpy.py:64
      └─ 3124.388 <module>  scale_run_updated.py:1
         └─ 3111.459 simulate  tlo/simulation.py:194
            ├─ 3066.945 fire_single_event  tlo/simulation.py:262
            │  └─ 3066.938 run  tlo/events.py:32
            │     ├─ 1694.334 apply  tlo/methods/healthsystem.py:1319
            │     │  ├─ 555.128 <listcomp>  tlo/methods/healthsystem.py:1397
            │     │  │  └─ 535.696 get_appt_footprint_as_time_request  tlo/methods/healthsystem.py:734
            │     │  │     ├─ 232.383 get_facility_info  tlo/methods/healthsystem.py:723
            │     │  │     │  ├─ 186.981 __getitem__  pandas/core/indexing.py:2148
            │     │  │     │  │     [34 frames hidden]  pandas, ..
            │     │  │     │  └─ 37.195 [self]
            │     │  │     ├─ 161.598 [self]
            │     │  │     └─ 132.843 __init__  collections/__init__.py:540
            │     │  │           [4 frames hidden]  collections
            │     │  ├─ 348.971 __iadd__  collections/__init__.py:807
            │     │  │     [12 frames hidden]  collections, ..
            │     │  ├─ 216.923 heappop  ../<built-in>:0
            │     │  │     [2 frames hidden]  ..
            │     │  ├─ 182.151 __getitem__  pandas/core/indexing.py:2148
            │     │  │     [34 frames hidden]  pandas, ..
            │     │  ├─ 155.993 [self]
            │     │  ├─ 100.272 log_hsi_event  tlo/methods/healthsystem.py:1003
            │     │  │  ├─ 52.150 [self]
            │     │  │  └─ 35.463 info  tlo/logging/core.py:208
            │     │  ├─ 47.124 get_squeeze_factors  tlo/methods/healthsystem.py:781
            │     │  │  └─ 45.622 <listcomp>  tlo/methods/healthsystem.py:820
            │     │  └─ 38.118 did_not_run  tlo/methods/hsi_generic_first_appts.py:335
            │     ├─ 1069.676 apply  tlo/methods/healthseekingbehaviour.py:193
            │     │  ├─ 760.387 schedule_hsi_event  tlo/methods/healthsystem.py:385
            │     │  │  ├─ 463.801 <listcomp>  tlo/methods/healthsystem.py:500
            │     │  │  │  ├─ 193.890 all  pandas/core/generic.py:10870
            │     │  │  │  │     [1796 frames hidden]  pandas, .., numpy, typing, abc, inspect
            │     │  │  │  ├─ 180.673 __getitem__  pandas/core/indexing.py:882
            │     │  │  │  │     [1030 frames hidden]  pandas, .., numpy, abc, typing, conte...
            │     │  │  │  ├─ 46.204 new_method  pandas/core/ops/common.py:50
            │     │  │  │  │     [458 frames hidden]  pandas, .., numpy, abc
            │     │  │  │  └─ 36.275 wrapper  pandas/core/strings/accessor.py:93
            │     │  │  │        [236 frames hidden]  pandas, .., numpy, re
            │     │  │  ├─ 144.639 __getitem__  pandas/core/indexing.py:882
            │     │  │  │     [1606 frames hidden]  pandas, .., numpy, typing, abc, conte...
            │     │  │  └─ 52.143 new_method  pandas/core/ops/common.py:50
            │     │  │        [546 frames hidden]  pandas, .., numpy, abc, warnings
            │     │  ├─ 139.978 have_what  tlo/methods/symptommanager.py:435
            │     │  │  └─ 138.370 apply  pandas/core/frame.py:7622
            │     │  │        [966 frames hidden]  pandas, .., numpy, abc
            │     │  │           137.956 apply_series_generator  pandas/core/apply.py:281
            │     │  │           ├─ 78.682 <lambda>  tlo/methods/symptommanager.py:440
            │     │  │           │  └─ 78.295 <listcomp>  tlo/methods/symptommanager.py:440
            │     │  │           │     └─ 69.741 __getitem__  pandas/core/series.py:837
            │     │  │           │           [38 frames hidden]  pandas, ..
            │     │  └─ 70.356 iterrows  pandas/core/frame.py:1026
            │     │        [376 frames hidden]  pandas, .., abc, numpy
            │     └─ 85.117 apply  tlo/methods/malaria.py:557
            │        └─ 83.166 schedule_hsi_event  tlo/methods/healthsystem.py:385
            │           └─ 49.937 <listcomp>  tlo/methods/healthsystem.py:500
            └─ 43.024 initialise_simulation  tlo/methods/hiv.py:557
               └─ 42.813 get_time_from_infection_to_aids  tlo/methods/hiv.py:862
                  └─ 37.114 predict  tlo/lm.py:339
                     └─ 36.700 eval  pandas/core/computation/eval.py:160
                           [23771 frames hidden]  pandas, .., collections, ast, abc, nu...

#319 (appointment types per facility level as list of sets)

pyinstrument profiling output for 6 month run using ad0e751 (pre-PR), 4990s total

4993.373 <module>  ../<string>:1
   [4 frames hidden]  .., runpy
      4993.373 _run_code  runpy.py:64
      └─ 4993.373 <module>  scale_run_arg.py:1
         └─ 4979.298 simulate  tlo/simulation.py:194
            └─ 4932.258 fire_single_event  tlo/simulation.py:260
               └─ 4932.247 run  tlo/events.py:32
                  ├─ 3127.327 apply  tlo/methods/healthsystem.py:1326
                  │  ├─ 1726.654 <listcomp>  tlo/methods/healthsystem.py:1404
                  │  │  └─ 1691.103 get_appt_footprint_as_time_request  tlo/methods/healthsystem.py:735
                  │  │     ├─ 657.910 get_facility_info  tlo/methods/healthsystem.py:724
                  │  │     │  ├─ 583.035 __getitem__  pandas/core/indexing.py:2148
                  │  │     │  │     [60 frames hidden]  pandas, ..
                  │  │     │  └─ 62.161 [self]
                  │  │     ├─ 446.810 <listcomp>  tlo/methods/healthsystem.py:768
                  │  │     ├─ 262.114 [self]
                  │  │     ├─ 139.141 __getitem__  pandas/core/frame.py:2987
                  │  │     │     [22 frames hidden]  pandas, ..
                  │  │     └─ 138.252 __init__  collections/__init__.py:540
                  │  │           [4 frames hidden]  collections
                  │  ├─ 400.981 __iadd__  collections/__init__.py:807
                  │  │     [12 frames hidden]  collections, ..
                  │  ├─ 259.284 heappop  ../<built-in>:0
                  │  │     [2 frames hidden]  ..
                  │  ├─ 211.322 __getitem__  pandas/core/indexing.py:2148
                  │  │     [34 frames hidden]  pandas, ..
                  │  ├─ 191.852 log_hsi_event  tlo/methods/healthsystem.py:1010
                  │  │  ├─ 83.061 <dictcomp>  tlo/methods/healthsystem.py:1036
                  │  │  └─ 59.753 [self]
                  │  ├─ 181.283 [self]
                  │  └─ 52.810 get_squeeze_factors  tlo/methods/healthsystem.py:788
                  │     └─ 51.022 <listcomp>  tlo/methods/healthsystem.py:827
                  ├─ 1431.438 apply  tlo/methods/healthseekingbehaviour.py:200
                  │  ├─ 1042.926 schedule_hsi_event  tlo/methods/healthsystem.py:385
                  │  │  ├─ 604.817 <listcomp>  tlo/methods/healthsystem.py:500
                  │  │  │  ├─ 244.164 all  pandas/core/generic.py:10870
                  │  │  │  │     [1796 frames hidden]  pandas, .., typing, numpy, abc, inspect
                  │  │  │  ├─ 239.054 __getitem__  pandas/core/indexing.py:882
                  │  │  │  │     [1030 frames hidden]  pandas, .., numpy, abc, typing, conte...
                  │  │  │  ├─ 61.244 new_method  pandas/core/ops/common.py:50
                  │  │  │  │     [458 frames hidden]  pandas, .., numpy, abc
                  │  │  │  └─ 50.281 wrapper  pandas/core/strings/accessor.py:93
                  │  │  │        [236 frames hidden]  pandas, .., numpy, re
                  │  │  ├─ 187.281 __getitem__  pandas/core/indexing.py:882
                  │  │  │     [1606 frames hidden]  pandas, .., typing, numpy, abc, conte...
                  │  │  └─ 72.168 new_method  pandas/core/ops/common.py:50
                  │  │        [546 frames hidden]  pandas, .., numpy, abc, warnings
                  │  ├─ 154.549 have_what  tlo/methods/symptommanager.py:455
                  │  │  └─ 152.679 apply  pandas/core/frame.py:7622
                  │  │        [1020 frames hidden]  pandas, .., numpy, abc
                  │  │           152.218 apply_series_generator  pandas/core/apply.py:281
                  │  │           ├─ 84.863 <lambda>  tlo/methods/symptommanager.py:460
                  │  │           │  └─ 84.412 <listcomp>  tlo/methods/symptommanager.py:460
                  │  │           │     └─ 75.163 __getitem__  pandas/core/series.py:837
                  │  │           │           [38 frames hidden]  pandas, ..
                  │  └─ 89.763 iterrows  pandas/core/frame.py:1026
                  │        [373 frames hidden]  pandas, .., abc, numpy
                  └─ 118.387 apply  tlo/methods/malaria.py:580
                     └─ 114.423 schedule_hsi_event  tlo/methods/healthsystem.py:385
                        └─ 65.844 <listcomp>  tlo/methods/healthsystem.py:500

pyinstrument profiling output for 6 month run using 30d6c1e (post-PR), 3910s total

3908.824 <module>  ../<string>:1
   [4 frames hidden]  .., runpy
      3908.824 _run_code  runpy.py:64
      └─ 3908.824 <module>  scale_run_arg.py:1
         └─ 3895.191 simulate  tlo/simulation.py:194
            ├─ 3849.575 fire_single_event  tlo/simulation.py:260
            │  └─ 3849.570 run  tlo/events.py:32
            │     ├─ 2858.453 apply  tlo/methods/healthsystem.py:1330
            │     │  ├─ 1582.785 <listcomp>  tlo/methods/healthsystem.py:1408
            │     │  │  └─ 1551.660 get_appt_footprint_as_time_request  tlo/methods/healthsystem.py:739
            │     │  │     ├─ 588.230 get_facility_info  tlo/methods/healthsystem.py:728
            │     │  │     │  ├─ 524.436 __getitem__  pandas/core/indexing.py:2148
            │     │  │     │  │     [60 frames hidden]  pandas, ..
            │     │  │     │  └─ 51.976 [self]
            │     │  │     ├─ 417.660 <listcomp>  tlo/methods/healthsystem.py:772
            │     │  │     ├─ 241.726 [self]
            │     │  │     ├─ 129.434 __getitem__  pandas/core/frame.py:2987
            │     │  │     │     [22 frames hidden]  pandas, ..
            │     │  │     └─ 128.753 __init__  collections/__init__.py:540
            │     │  │           [4 frames hidden]  collections
            │     │  ├─ 373.016 __iadd__  collections/__init__.py:807
            │     │  │     [12 frames hidden]  collections, ..
            │     │  ├─ 233.442 heappop  ../<built-in>:0
            │     │  │     [2 frames hidden]  ..
            │     │  ├─ 185.336 __getitem__  pandas/core/indexing.py:2148
            │     │  │     [34 frames hidden]  pandas, ..
            │     │  ├─ 174.578 log_hsi_event  tlo/methods/healthsystem.py:1014
            │     │  │  ├─ 79.488 <dictcomp>  tlo/methods/healthsystem.py:1040
            │     │  │  ├─ 50.608 [self]
            │     │  │  └─ 39.991 info  tlo/logging/core.py:208
            │     │  ├─ 165.047 [self]
            │     │  ├─ 50.289 get_squeeze_factors  tlo/methods/healthsystem.py:792
            │     │  │  └─ 48.589 <listcomp>  tlo/methods/healthsystem.py:831
            │     │  └─ 41.363 did_not_run  tlo/methods/hsi_generic_first_appts.py:317
            │     ├─ 726.546 apply  tlo/methods/healthseekingbehaviour.py:200
            │     │  ├─ 391.373 schedule_hsi_event  tlo/methods/healthsystem.py:393
            │     │  │  ├─ 183.215 __getitem__  pandas/core/indexing.py:882
            │     │  │  │     [1606 frames hidden]  pandas, .., typing, numpy, abc, conte...
            │     │  │  └─ 61.678 new_method  pandas/core/ops/common.py:50
            │     │  │        [546 frames hidden]  pandas, .., numpy, abc, warnings
            │     │  ├─ 140.315 have_what  tlo/methods/symptommanager.py:455
            │     │  │  └─ 138.590 apply  pandas/core/frame.py:7622
            │     │  │        [949 frames hidden]  pandas, .., numpy, abc
            │     │  │           138.171 apply_series_generator  pandas/core/apply.py:281
            │     │  │           ├─ 78.980 <lambda>  tlo/methods/symptommanager.py:460
            │     │  │           │  └─ 75.817 <listcomp>  tlo/methods/symptommanager.py:460
            │     │  │           │     └─ 69.245 __getitem__  pandas/core/series.py:837
            │     │  │           │           [38 frames hidden]  pandas, ..
            │     │  └─ 80.101 iterrows  pandas/core/frame.py:1026
            │     │        [400 frames hidden]  pandas, .., abc, numpy
            │     └─ 43.426 apply  tlo/methods/malaria.py:580
            │        └─ 40.493 schedule_hsi_event  tlo/methods/healthsystem.py:393
            └─ 44.027 initialise_simulation  tlo/methods/hiv.py:570
               └─ 43.783 get_time_from_infection_to_aids  tlo/methods/hiv.py:892

#322 (caching appointment time requests)

SnakeViz profiling output for 12 month run (pre-PR), 12000s total

SnakeViz profiling output for 12 month run (post-PR), 8170s total

#325 (using Counter update method to sum footprints)

SnakeViz profiling output for 12 month run (pre-PR), 12000s total

SnakeViz profiling output for 12 month run (post-PR), 11300s total

#337 (multiply daily capabilities by capabilities_coefficient once only)

SnakeViz profiling output for 5 year run (pre-PR), 15100s total

SnakeViz profiling output for 5 year run (post-PR), 12600s total

matt-graham · 2021-08-25T08:43:07Z

Summary of recent PRs reducing scale_run.py run times:

#343 (use set instead of dataframe in check that HSI event footprint is possible)

Snakeviz profiling output for 5 year run (pre-PR), 12600s total

Snakeviz profiling output for 5 year run (post-PR), 9370s total

#346 (vectorising health seeking behaviour poll event)

Snakeviz profiling output for 5 year run (pre-PR), 9370s total

Snakeviz profiling output for 5 year run (post-PR), 6450s total

matt-graham · 2021-08-25T09:48:05Z

A full 20 year / 20k population run of scale_run.py with profiling enabled now takes a relatively manageable five and half hours (and probably a bit less without the profiling overhead). We can probably therefore consider increasing the computational demands of the profiling runs. @tbhallett are there any changes you think it would be worth making to the profiling run configurations, e.g. increasing the simulation population or adding in further disease modules?

The SnakeViz breakdown for a 20 year / 20k population run on current master (a691b72) is as follows

The ten event methods with the highest proportions of the overall run time are as follows:

healthsystem.HealthSystemScheduler.apply 27.7%
healthseekingbehaviour.HealthSeekingBehaviourPoll.apply 14.7%
pregnancy_supervisor.PregnancySupervisorEvent.apply 8.56%
cardio_metabolic_disorders.CardioMetabolicDisorders_MainPollingEvent.apply 7.16%
symptommanager.SymptomManager_SpuriousSymptomResolve.apply 5.20%
labour.BirthEvent.apply 4.92%
symptommanager.SymptomManager_SpuriousSymptomOnset.apply 4.74%
postnatal_supervisor.PostnatalSupervisorEvent.apply 4.46%
postnatal_supervisor.PostnatalWeekOneEvent.apply 2.82%
malaria.MalariaCureEvent.apply 2.55%

The HealthSystemScheduler.apply and HealthSeekingBehaviourPoll.apply are still the two largest time consumers, taking together around 42.4% of the overall run time, though this is significantly improved compared to prior to some of the recent optimisation when these two methods together constituted the vast majority of the run time.

Within HealthSystemScheduler.apply, HealthSystem.impose_beddays_footprint is quite costly constituting 6.36% of overall run time and 23.0% of the time in HealthSystemScheduler.apply, and hopefully the changes #258 will help reduce this and/or enable making further optimizations. HealthSystem.get_squeeze_factors
is also relatively costly (3.84% of overall run time, 13.8% of time in HealthSystemScheduler.apply) and there may be scope for some further optimizations there though given there have already been several rounds of optimization of get_squeeze_factors I suspect there is not too much room for further improvement.

Within HealthSeekingBehaviourPoll.apply, HealthSystem.schedule_hsi_event is the largest contributor to run time, with 9.19% of the overall run time and around 62.4% of the time in HealthSeekingBehaviourPoll.apply. As commented in #346, as many similar events are scheduled simultaneously in HealthSeekingBehaviourPoll.apply there is potentially room for improvement here by adding a 'batched' schedule_hsi_event equivalent which avoids redundantly repeating checks etc. when many of the attributes of the HSI events being scheduled are shared.

Within pregnancy_supervisor.PregnancySupervisorEvent.apply there is no obvious bottleneck with the time split realtively evenly over many different methods so there does not seem to obvious avenues for significant improvement.

Within cardio_metabolic_disorders.CardioMetabolicDisorders_MainPollingEvent.apply nearly all the time is being spent in evaluating the predict method of linear models. This suggests there may be some value in either trying to further optimize the linear model predict function (possibly exploring some of the ideas mentioned in #267) in general, with the total time spent in LinearModel.predict 13.8% of the total run time, or specifically in the context of cardio_metabolic_disorders.CardioMetabolicDisorders_MainPollingEvent.apply it may be worth exploring whether optimizing the linear model implementations there e.g. by creating custom predict functions to use in LinearModel.custom, may be worthwhile.

Together the apply methods of the two spurious symptoms related events SymptomManager_SpuriousSymptomResolve and SymptomManager_SpuriousSymptomOnset constitute around 10% of the overall run time and so there is potentially scope for looking at possible improvements in the handling of spurious symptoms.

matt-graham · 2021-08-26T20:35:54Z

A full 20 year / 20k population run of scale_run.py with profiling enabled now takes a relatively manageable five and half hours (and probably a bit less without the profiling overhead). We can probably therefore consider increasing the computational demands of the profiling runs. @tbhallett are there any changes you think it would be worth making to the profiling run configurations, e.g. increasing the simulation population or adding in further disease modules?

To provide one additional datapoint, a 5 year run of scale_run.py with profiling enabled and an initial population of 50k using the code on 4c51802 (current tip of branch in #354) took 3 hours to complete, so extrapolating, a full 20 year run would be roughly 12 hours which seems reasonable. This was with the settings in scale_run.py otherwise unchanged (in particular no changes were made to the capabilities_coefficient value which I think may need to be adjusted to reflect the change in scaling?).

tbhallett · 2021-08-31T12:28:29Z

Thanks for all this @matt-graham; c. 5h for 20k/20y simulation with all the bells and whistles feels really manageable to me!!

I think going up to 50k would make sense as would be provide a respectable c1800 persons per district.

Yes, there are now new completed disease modules that we should fold into the profiling scrips (notably Alri, BladderCancer and Measles). Should we add these in now? I can do a quick PR. (Shortly three more will be added, which will be the complete set: "Rti", "Tb" and "Malnutrition").

I'll also do a PR that updates the capabilities_coefficient.

Looking through the various things, I see that we rely on simplified_births in many cases. I wonder if we should stop doing that, and let all the profiling rely on the full set of modules for contraceptions/births/labour etc.

But, apart from these thoughts, I don't think we need to change the spec of these runs. Let me know what you and @tamuri think and if you'd need me to do a PR suggesting these changes.

matt-graham · 2021-09-17T14:45:27Z

Summary of recent PRs reducing scale_run.py run times:

#354 (add method to schedule batch of HSI events)

Snakeviz profiling output for 5 year 20k population run (pre-PR), 6260s total

Snakeviz profiling output for 5 year 20k population run (post-PR), 5580s total

#357 (add flags to declare properties of LinearModel predictor conditions)

Snakeviz profiling output for 5 year 20k population run (pre-PR), 6260s total

Snakeviz profiling output for 5 year 20k population run (post-PR), 5510s total

#366 (avoid repeated dataframe accesses in HealthSystemScheduler)

Snakeviz profiling output for 5 year 50k population run (pre-PR), 10100s total

Snakeviz profiling output for 5 year 50k population run (post-PR), 9320s total

matt-graham · 2021-09-17T15:27:33Z

After the updates to scale_run.py in #363, a 5 year profiled run with the configuration otherwise left as the new defaults (i.e. 50k initial population and capabilities_coefficient set based on ratio of initial population to actual estimated 2010 population), the SnakeViz output for the overall breakdown of the toal run time (9980s) is

The ten event methods with the highest proportions of the overall run time are as follows:

healthsystem.HealthSystemScheduler.apply 22.5% (2250s)
healthseekingbehaviour.HealthSeekingBehaviourPoll.apply 18.3% (1820s)
labour.BirthEvent.apply 7.04% (703s)
symptommanager.SymptomManager_SpuriousSymptomOnset.apply 5.41% (540s)
malaria.MalariaCureEvent.apply 5.01% (500s)
pregnancy_supervisor.PregnancySupervisorEvent.apply 5.00% (499s)
symptommanager.SymptomManager_SpuriousSymptomResolve.apply 4.07% (407s)
postnatal_supervisor.PostnatalWeekOneEvent.apply 3.93% (392s)
postnatal_supervisor.PostnatalSupervisorEvent.apply 2.87% (287s)
labour.LabourAtHomeEvent.apply 2.84% (284s)

Some initial thoughts about possible performance improvements

The total time spent in functions / methods in the symptommanager module is now quite high, both in the symptommanager.SymptomManager_SpuriousSymptomOnset and symptommanager.SymptomManager_SpuriousSymptomResolve events themselves, plus virtually all the time spent in malaria.MalariaCureEvent.apply is in running clear_symptoms.

The latter arises in the lines

TLOmodel/src/tlo/methods/malaria.py

Lines 1226 to 1229 in 7812c64

    
           for idx in all_cured: 
        
               self.sim.modules["SymptomManager"].clear_symptoms( 
        
                   person_id=idx, disease_module=self.module 
        
               )

which clears all the symptoms for a set of person IDs in a loop - it might be there could some performance gain therefore from extending clear_symptoms to allow passing a set / list of integers for the person_id argument.

In clear_symptoms itself, the main work is done in a loop over a set of symptoms, with change_symptom called for each symptom

TLOmodel/src/tlo/methods/symptommanager.py

Lines 499 to 505 in 7812c64

    
           for symp in symptoms_caused_by_this_disease_module: 
        
               self.change_symptom( 
        
                   person_id=person_id, 
        
                   symptom_string=symp, 
        
                   add_or_remove='-', 
        
                   disease_module=disease_module 
        
               )

Running change_symptom for each of a set of symptoms seems to be quite a common pattern

TLOmodel/src/tlo/methods/alri.py

Lines 1013 to 1019 in 7812c64

    
           for symptom in symptoms: 
        
               self.sim.modules['SymptomManager'].change_symptom( 
        
                   person_id=person_id, 
        
                   symptom_string=symptom, 
        
                   add_or_remove='+', 
        
                   disease_module=self, 
        
               )

TLOmodel/src/tlo/methods/alri.py

Lines 1531 to 1537 in 7812c64

    
           for symptom in models.symptoms_for_disease(disease_type=disease_type): 
        
               m.sim.modules['SymptomManager'].change_symptom( 
        
                   person_id=person_id, 
        
                   symptom_string=symptom, 
        
                   add_or_remove='+', 
        
                   disease_module=m, 
        
               )

TLOmodel/src/tlo/methods/diarrhoea.py

Lines 1042 to 1053 in 7812c64

    
           for symptom, prob in possible_symptoms_for_this_pathogen.items(): 
        
               if rng.random_sample() < prob: 
        
                   self.sim.modules['SymptomManager'].change_symptom( 
        
                       person_id=person_id, 
        
                       symptom_string=symptom, 
        
                       add_or_remove='+', 
        
                       disease_module=self.module 
        
                   ) 
        
                   if symptom == 'bloody_stool': 
        
                       props_new['gi_last_diarrhoea_type'] = 'bloody' 
        
                   elif symptom == 'dehydration': 
        
                       props_new['gi_last_diarrhoea_dehydration'] = 'some'

TLOmodel/src/tlo/methods/malaria.py

Lines 482 to 490 in 7812c64

    
           for symptom in symptom_list: 
        
               # this also schedules symptom resolution in 5 days 
        
               self.sim.modules["SymptomManager"].change_symptom( 
        
                   person_id=list(clinical_index), 
        
                   symptom_string=symptom, 
        
                   add_or_remove="+", 
        
                   disease_module=self, 
        
                   duration_in_days=p["dur_clin"], 
        
               )

TLOmodel/src/tlo/methods/malaria.py

Lines 524 to 531 in 7812c64

    
           for symptom in symptom_list: 
        
               self.sim.modules["SymptomManager"].change_symptom( 
        
                   person_id=list(severe_index), 
        
                   symptom_string=symptom, 
        
                   add_or_remove="+", 
        
                   disease_module=self, 
        
                   duration_in_days=None, 
        
               )

TLOmodel/src/tlo/methods/measles.py

Lines 280 to 294 in 7812c64

    
           for symptom in symptom_list: 
        
               specific_symptom_prob = symptom_prob.loc[symptom_prob.symptom == symptom, "probability"].values[0] 
        
               # random sample whether person will have symptom 
        
               if rng.random_sample(size=1) < specific_symptom_prob: 
        
                   # schedule symptom onset 
        
                   self.sim.modules["SymptomManager"].change_symptom( 
        
                       person_id=person_id, 
        
                       symptom_string=symptom, 
        
                       add_or_remove="+", 
        
                       disease_module=self.sim.modules["Measles"], 
        
                       date_of_onset=symp_onset, 
        
                       duration_in_days=14,  # same duration for all symptoms 
        
                   )

TLOmodel/src/tlo/methods/symptommanager.py

Lines 605 to 632 in 7812c64

    
           for symp in sorted(self.module.generic_symptoms): 
        
               does_not_have_symptom = self.module.who_not_have(symptom_string=symp) 
        
               for group in ['children', 'adults']: 
        
                   p = self.generic_symptoms['prob_per_day'][group][symp] 
        
                   dur = self.generic_symptoms['duration_in_days'][group][symp] 
        
                   idx_grp = idx[group] 
        
                   eligible_to_get_symptom = idx_grp[idx_grp.isin(does_not_have_symptom)] 
        
                   persons_to_onset_with_this_symptom = list( 
        
                       eligible_to_get_symptom[self.rand(len(eligible_to_get_symptom)) < p]) 
        
                   # Do onset 
        
                   self.sim.modules['SymptomManager'].change_symptom( 
        
                       symptom_string=symp, 
        
                       add_or_remove='+', 
        
                       person_id=persons_to_onset_with_this_symptom, 
        
                       duration_in_days=None,   # <- resolution for these is handled by the SpuriousSymptomsResolve Event 
        
                       disease_module=self.module, 
        
                   ) 
        
                   # Schedule resolution: 
        
                   self.module.spurious_symptom_resolve_event.schedule_symptom_resolve( 
        
                       person_id=persons_to_onset_with_this_symptom, 
        
                       symptom_string=symp, 
        
                       date_of_resolution=(self.sim.date + pd.DateOffset(days=dur)).date() 
        
                   )

TLOmodel/src/tlo/methods/symptommanager.py

Lines 658 to 667 in 7812c64

    
           for symp in self.to_resolve.keys(): 
        
               if date_today in self.to_resolve[symp]: 
        
                   person_ids = self.to_resolve[symp].pop(date_today) 
        
                   person_ids_alive = list(df.index[df.is_alive & (df.index.isin(person_ids))]) 
        
                   self.module.change_symptom( 
        
                       person_id=person_ids_alive, 
        
                       add_or_remove='-', 
        
                       symptom_string=symp, 
        
                       disease_module=self.module 
        
                   )

which suggests there might also be some gain to either generalising change_symptom to allow passing a list / set of symptoms to try to avoid the overhead of repeated calls.

A considerable amount of time is being spent in schedule_hsi_event - 1090s which is 11.0% of the overall run time. This is partly due to the high number of individual calls (10342802) to this method, corresponding to an average rate of about 5700 HSI events being scheduled per simulated day; for a population of 50k this seems reasonable (?). Although we have already done several rounds of optimization to schedule_hsi_event, given it is getting called so often it is probably worth doing a further pass to see if there are any even minor optimizations that can be made. From the breakdown of the time spent

a considerable amount of time is being spent in operations in the function itself as well as in calling get_appt_footprint_as_time_request.

Another possible target for optimization is the __init__ method of HSI_GenericFirstApptAtFacilityLevel1 which is called 9262660 times (suggesting HSI_GenericFirstApptAtFacilityLevel1 constitute the vast majority of the HSI events being scheduled), with total time spent in this method 482s (4.83% of overall time).

tbhallett · 2021-09-17T15:41:12Z

Thanks for this Matt

my first quick reactions—-

i totally agree on the making symptom manager accept a set of symptoms to onset for one person or more than one person, and otherwise making that work in batches. I have often thought about doing this but hadn’t clocked how big of a deal it may be for performance.
I am a little worried that some of the profiling results is due to the rate of spurious symptoms between very high. We don’t know the real rate but the values in there are probably too high when all the disease modules are in. I haven’t got a good answer for that. But just to be aware that the numbers of hits of spurious symptoms is probably too high. In the diarrhoea PR I turn it down a bit (done today!). Perhaps I should pull that into it’s own PR so we can see the effect.
re the init of GenericHSI: I had wondered if we could make this into a batch thing, like we did with SpuriousSymptomBatchOnset. ie, one event per day, which does the “work” for each person that is going to have a GenericHSI that day. Every GenericHSI is scheduled by HealthSeekingBehavioir and for the “tomorrow”.. so that module could curate a record of the people instead (???).

matt-graham · 2021-09-17T16:00:53Z

Thanks @tbhallett

i totally agree on the making symptom manager accept a set of symptoms to onset for one person or more than one person, and otherwise making that work in batches. I have often thought about doing this but hadn’t clocked how big of a deal it may be for performance.

Okay it seems like this is probably a good first target for me to work on then!

I am a little worried that some of the profiling results is due to the rate of spurious symptoms between very high. We don’t know the real rate but the values in there are probably too high when all the disease modules are in. I haven’t got a good answer for that. But just to be aware that the numbers of hits of spurious symptoms is probably too high. In the diarrhoea PR I turn it down a bit (done today!). Perhaps I should pull that into it’s own PR so we can see the effect.

Thanks that's useful to know - I'll hold off on looking at specific optimisation the spurious symptoms related events for now then until we've checked if reducing the rate is sufficient to make these events non-performance critical.

re the init of GenericHSI: I had wondered if we could make this into a batch thing, like we did with SpuriousSymptomBatchOnset. ie, one event per day, which does the “work” for each person that is going to have a GenericHSI that day. Every GenericHSI is scheduled by HealthSeekingBehavioir and for the “tomorrow”.. so that module could curate a record of the people instead (???).

Ah that's an interesting idea - given the number of generic HSI events and also the large time spent scheduling them in HealthSeekingBehaviour this sort of approach could be quite a big performance gain, I'll have a look at how this could be done!

matt-graham · 2022-07-14T10:00:25Z

A 5 year / 50k initial population run of scale_run.py on bfb03a7 gives the following SnakeViz plot of the profiling results

with the overall simulation time 25320s.

There seem to be a few parts of some of the newer / rewritten modules that would benefit from some refactoring to improve performance:

In the rti module 4540s (18% of overall run time) is being spent in evaluating the line

TLOmodel/src/tlo/methods/rti.py

Line 3043 in 7669163

any_not_null = df.loc[df.is_alive, 'rt_date_to_remove_daly'].apply(lambda x: pd.notnull(x).any())

I think the poor performance here is due to the rt_date_to_remove_daly column being of object dtype as it contains a list. This means that operations on the column will fallback to Python loops rather than using faster NumPy array operations. As the lists are all of the same length (8) I think it would make more sense here to have 8 distinct columns of data type (matching the 8 numbered rt_injury_* columns). This would then allow the apply operation here to be replaced with something like
```
any_not_null = df.loc[df.is_alive, [f'rt_date_to_remove_daly_{i}' for i in range(8)].notnull().any(axis=1)
```

In the schisto module 2550s (10% of overall run time) is being spent in evaluating the method SchistoSpecies.update_infectious_status_and_symptoms. The main bottlenecks seem to be in evaluating the line

TLOmodel/src/tlo/methods/schisto.py

Line 578 in 7669163

    
           sm.clear_symptoms(person_id=no_longer_high_infection, disease_module=schisto_module)

and

TLOmodel/src/tlo/methods/schisto.py

Lines 550 to 553 in 7669163

    
           correct_status = df.loc[idx].apply( 
        
               lambda x: _inf_status(x['age_years'], x[prop('aggregate_worm_burden')]), 
        
               axis=1 
        
           )

The former might warrant looking at if we can do any further optimizations to SymptomManager though from a brief look I can't see anything obvious that can be improved about the clear_symptoms implementation. The _inf_status function used in the apply in the latter consists of a series of boolean logic that could potentially be translated to boolean operations directly on the column / Series objects which might speed things up as I think the apply will be applied to each row individually.

In the labour module 1600s (6% of overall run time) is being spent in evaluating the Labour.predict method. About half of this seems to be in indexing __getitem__ operations. The repeated indexing of the mni object iby person_id n

TLOmodel/src/tlo/methods/labour.py

Lines 1081 to 1095 in 7669163

    
           df = self.sim.population.props 
        
           mni = self.sim.modules['PregnancySupervisor'].mother_and_newborn_info 
        
           person = df.loc[[person_id]] 
        
           # We define specific external variables used as predictors in the equations defined below 
        
           has_rbt = mni[person_id]['received_blood_transfusion'] 
        
           mode_of_delivery = mni[person_id]['mode_of_delivery'] 
        
           received_clean_delivery = mni[person_id]['clean_birth_practices'] 
        
           received_abx_for_prom = mni[person_id]['abx_for_prom_given'] 
        
           amtsl_given = mni[person_id]['amtsl_given'] 
        
           delivery_setting = mni[person_id]['delivery_setting'] 
        
           delay_one_two = mni[person_id]['delay_one_two'] 
        
           delay_three = mni[person_id]['delay_three'] 
        
           macrosomia = mni[person_id]['birth_weight'] == 'macrosomia'

could be avoided by setting mni = self.sim.modules['PregnancySupervisor'].mother_and_newborn_info[person_id] initially though its not clear if this the bottleneck here as looking at the referenced Pandas __getitem__ operation it seems to be for a _LocationIndexer object suggesting it might be the person = df.loc[[person_id]] which is the culprit. If so it is not immediately clear how to reduce the cost.

tamuri · 2023-03-09T10:23:03Z

Can we get a recent profile viz? I know at least the RTI fix was merged (#682). Thanks.

dimitrasal · 2023-03-13T12:09:51Z

A 5 year / 40k initial population run of scale_run.py on 6b9a1d5 gives the following SnakeViz plot of the profiling results:

with the overall simulation time at ~11500s on an Apple M1 chip.

The ten methods with the highest proportions of the overall time are the following:

labour.BirthandPostnatalOutcomesEvent.apply at 15.66% (1800s )
schisto.SchistoMatureWorms.apply at 10.10% (1160s)
healthseekingbehaviour.HealthseekingBehaviourPoll.apply at 8.95% (1030s)
healthsystem.HealthSystemScheduler.apply at 7.18% (824s)
postnatal_supervisor.PostnatalWeekOneNeonatalEvent.apply at 5.43% (623s)
labour.LabourOnsetEvent.apply at 5.20% (597s)
labour.LabourAtHomeEvent.apply at 4.58% (526s)
postnatal_supervisor.PostnatalSupervisorevent.apply at 4.37% (501s)
postnatal_supervisor.PostnatalWeekOneMaternalEvent.apply at 3.89% (447s)
diarrhoea.DiarrhoeaNaturalRecoveryEvent.apply at 3.6% (413s)

Snakeviz shows larger overheads than pyinstrument with the overall time of a similar run reduced to 9092s.
The results with pyinstrument for 5 year/40 k population can be accessed here
The results with pyinstrument for 5 year/20 k population can be accessed here

Please note that to make the html files accessible in their current format I had to make them public, let me know if this is an issue.

tamuri · 2023-03-15T10:51:33Z

Thanks, Dimitra. My vote for two worth exploring are the HealthSeekingBehaviourPoll (4) and then the SchistoMatureWorms event (2). The first seems to be a fairly lengthy LinearModel which is actually not called very often. It might improve if the LinearModel is setup using a custom function. The second looks like an individual event that is called many times. Need to discuss with @tbhallett (perhaps @tdm32 ?) about whether that can be replaced by population-level event without losing important behaviour of the model.

tbhallett · 2023-03-15T15:18:15Z

On the second point about SchistoMatureWorms, I am sure we could do something using a 'Poll' and do the progression for the whole population at once.

marghe-molaro · 2023-07-26T10:38:16Z

Making a note that scale runs, which so far have focused on mode_appt_constaints = 1, in the future should also consider mode_appt_constraints = 2 (see PR #986) as this may become more widely used by analysts.

Performance issue already identified is related to the length of the HSI queue, which is dependent on assumptions around tclose (see Issue #999. However note that since this issue was first written up some changes where made in the way the the hsi queue is queried under mode_appt_constraints = 2 which may have improved matters).

matt-graham · 2023-11-27T14:26:21Z

/run profiling

matt-graham changed the title ~~Updated scale_run profiling scripts runs at very slow rate~~ Updated scale_run profiling script runs at very slow rate May 21, 2021

matt-graham mentioned this issue May 21, 2021

Improving efficiency of get_appt_footprint_as_time_request #287

Merged

This was referenced Jun 9, 2021

Optimizing get_squeeze_factors implementation #307

Closed

Optimizing get_squeeze_factors implementation #308

Merged

matt-graham self-assigned this Aug 12, 2021

matt-graham mentioned this issue Aug 27, 2021

Add flags to declare properties of LinearModel predictor conditions #357

Merged

matt-graham mentioned this issue Sep 13, 2021

Update scale_run.py default configuration and automatically scale capabilities_coefficient #363

Merged

matt-graham changed the title ~~Updated scale_run profiling script runs at very slow rate~~ Improving performance: profiling scale_run simulations Sep 17, 2021

matt-graham mentioned this issue Nov 5, 2021

Allow 'batched' changes to multiple symptoms at once #409

Merged

This was referenced Aug 9, 2022

Changing rt_date_to_remove_daly property into multiple columns #682

Merged

Avoid use of apply in computing infection status in schisto module #683

Merged

tamuri mentioned this issue Nov 22, 2022

Improve performance #63

Closed

dimitrasal mentioned this issue Apr 3, 2023

changes linear model to custom function in healthseekingbehaviour.py #919

Merged

tamuri added framework performance labels Jul 26, 2023

marghe-molaro mentioned this issue May 10, 2024

Do profiling for mode_appt_constraints = 2 too #1334

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improving performance: profiling `scale_run` simulations #286

Improving performance: profiling `scale_run` simulations #286

matt-graham commented May 21, 2021

matt-graham commented Jun 8, 2021

matt-graham commented Jun 9, 2021

matt-graham commented Jun 11, 2021

matt-graham commented Jul 28, 2021

matt-graham commented Aug 25, 2021

matt-graham commented Aug 25, 2021 •

edited

Loading

matt-graham commented Aug 26, 2021

tbhallett commented Aug 31, 2021 •

edited

Loading

matt-graham commented Sep 17, 2021

matt-graham commented Sep 17, 2021

tbhallett commented Sep 17, 2021

matt-graham commented Sep 17, 2021

matt-graham commented Jul 14, 2022

tamuri commented Mar 9, 2023

dimitrasal commented Mar 13, 2023 •

edited

Loading

tamuri commented Mar 15, 2023 •

edited

Loading

tbhallett commented Mar 15, 2023

marghe-molaro commented Jul 26, 2023 •

edited

Loading

matt-graham commented Nov 27, 2023

Improving performance: profiling scale_run simulations #286

Improving performance: profiling scale_run simulations #286

Comments

matt-graham commented May 21, 2021

matt-graham commented Jun 8, 2021

matt-graham commented Jun 9, 2021

matt-graham commented Jun 11, 2021

matt-graham commented Jul 28, 2021

matt-graham commented Aug 25, 2021

matt-graham commented Aug 25, 2021 • edited Loading

matt-graham commented Aug 26, 2021

tbhallett commented Aug 31, 2021 • edited Loading

matt-graham commented Sep 17, 2021

matt-graham commented Sep 17, 2021

tbhallett commented Sep 17, 2021

matt-graham commented Sep 17, 2021

matt-graham commented Jul 14, 2022

tamuri commented Mar 9, 2023

dimitrasal commented Mar 13, 2023 • edited Loading

tamuri commented Mar 15, 2023 • edited Loading

tbhallett commented Mar 15, 2023

marghe-molaro commented Jul 26, 2023 • edited Loading

matt-graham commented Nov 27, 2023

Improving performance: profiling `scale_run` simulations #286

Improving performance: profiling `scale_run` simulations #286

matt-graham commented Aug 25, 2021 •

edited

Loading

tbhallett commented Aug 31, 2021 •

edited

Loading

dimitrasal commented Mar 13, 2023 •

edited

Loading

tamuri commented Mar 15, 2023 •

edited

Loading

marghe-molaro commented Jul 26, 2023 •

edited

Loading