Allow 'batched' changes to multiple symptoms at once #409

matt-graham · 2021-11-05T17:26:44Z

Based on the analysis of profiling runs in #286 (comment), there appear to be several points in the code where multiple symptoms are updated in a loop, because SymptomManager.change_symptom only allows specifying a single symptom to update, and these operations consume a relatively large proportion of the overall simulation time. Similarly, the current SymptomManager.clear_symptoms only allows clearing the symptoms for a single individual at a time, and its use within loops iterating over sets of person IDs is also a bottleneck.

This PR updates the SymptomManager.change_symptom method to optionally accept a list / set of symptom strings instead of a single symptom string, with all the corresponding symptom bits then set or unset simultaneously. As the SymptomManager_AutoOnsetEvent and SymptomManager_AutoResolveEvent events also use SymptomManager.change_symptom internally to update the symptoms, these events can now also be 'batched': a single onset / resolve event can update multiple symptoms at once, so one event is scheduled where previously a separate event was scheduled for each symptom.

To enable this batch updating, the BitsetHandler class in tlo.util has been updated to allow operations to be performed on multiple integer columns corresponding to bitsets with the same set of elements. Rather than specifying a single column when constructing the bitset handler object, a list of column names corresponding to the bitsets to update / test can optionally be passed to the relevant BitsetHandler methods instead. The test suite for the BitsetHandler class in tests/test_bitset.py has been updated to also cover these multiple column operations, and the previous tests for the single column case have been split into smaller self-contained test cases.
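The idea of applying one bit mask across several bitset columns in a single vectorised assignment can be sketched as follows. This is a toy illustration only, not the actual BitsetHandler implementation: the helper functions, the element names and the column names `sy_fever` / `sy_cough` are all hypothetical.

```python
import pandas as pd

# Hypothetical elements (here, disease module names) mapped to bit values.
ELEMENTS = ["Malaria", "Measles", "Diarrhoea"]
ELEMENT_TO_BIT = {name: 1 << i for i, name in enumerate(ELEMENTS)}


def _mask(elements):
    """Combine the bits of all given elements into one integer mask."""
    mask = 0
    for element in elements:
        mask |= ELEMENT_TO_BIT[element]
    return mask


def set_bits(df, rows, columns, *elements):
    """Set the bits for `elements` in every listed column for the given rows."""
    df.loc[rows, columns] = df.loc[rows, columns] | _mask(elements)


def unset_bits(df, rows, columns, *elements):
    """Unset the bits for `elements` in every listed column for the given rows."""
    df.loc[rows, columns] = df.loc[rows, columns] & ~_mask(elements)


# One integer bitset column per symptom, one row per person.
df = pd.DataFrame({"sy_fever": [0, 0, 0, 0], "sy_cough": [0, 0, 0, 0]}, dtype="int64")

# A single call updates both symptom columns for persons 0 and 2.
set_bits(df, [0, 2], ["sy_fever", "sy_cough"], "Malaria")
set_bits(df, [2], ["sy_fever"], "Measles")
unset_bits(df, [0], ["sy_cough"], "Malaria")
print(df)
```

Because all columns share the same element-to-bit mapping, the same mask is valid for every column, which is what makes the multi-column update a single operation rather than a loop over columns.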

The SymptomManager.clear_symptoms method is also updated to allow passing multiple person IDs to clear symptoms for at once. As SymptomManager.clear_symptoms was found to be a particular bottleneck, the internal logic here has been slightly changed, though I think it will still have the same behaviour. Previously, the set of symptoms for which the specified disease module had a bit set was first constructed, and the bit corresponding to the disease module was then unset for each of these symptoms. This required several accesses to the bitset columns, first to check which symptoms have the bit set for the disease module and then to unset these bits. In the new implementation the bits for the specified disease module are unset for all registered symptoms. As unsetting a bit which is already unset has no effect, I think this should be equivalent to unsetting only the bits that were set for the specified disease module, while saving the additional check operations.
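The equivalence argued above (unsetting a bit everywhere gives the same result as checking first and unsetting only where set) can be checked on random data. This is a minimal sketch using plain NumPy arrays rather than the real symptom columns; the bit position and array sizes are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_persons, n_symptoms = 1000, 5
module_bit = np.int64(1 << 2)  # hypothetical bit assigned to one disease module

# Each entry is a bitset recording which modules cause that symptom for that person.
bitsets = rng.integers(0, 8, size=(n_persons, n_symptoms), dtype=np.int64)

# Check-then-unset: find entries with the bit set, then unset only those.
check_then_unset = bitsets.copy()
has_bit = (check_then_unset & module_bit) != 0
check_then_unset[has_bit] &= ~module_bit

# Unconditional unset: clear the bit everywhere, skipping the check.
unset_everywhere = bitsets & ~module_bit

print(np.array_equal(check_then_unset, unset_everywhere))
```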

Finally, the downstream usages within loops of SymptomManager.change_symptom / SymptomManager.clear_symptoms in disease modules which were identified in #286 (comment) as being bottlenecks are updated to take advantage of the new batched operations, with specifically Alri, Diarrhoea, Malaria and Measles being updated. As in the Malaria and Measles cases the decision of which persons get which symptoms is probabilistic, I also did a bit of refactoring of the operations to sample the appropriate persons / symptoms to allow using the batched operations.

I am currently running the scale_run.py profiling script after these changes to check the effect on run times and will update with the profiling results when finished. Due to the issue in #385, it is currently not possible to perform runs with the Diarrhoea module registered, so I have manually removed this module from the scale_run.py configuration for the runs both before and after the changes in this PR.

matt-graham · 2021-11-05T17:35:59Z

src/tlo/methods/malaria.py

+
+        for symptom in symptom_list_severe:
+            # Let u ~ Uniform(0, 1) and p ~ Uniform(prop_lower, prop_upper),
+            # then the probability of the event (u < p) is (prop_lower + prop_upper) / 2
+            # That is the probability of b == True in the following code snippet
+            #     b = rng.uniform() < rng.uniform(low=prop_lower, high=prop_upper)
+            # and this one
+            #     b = rng.uniform() < (prop_lower + prop_upper) / 2
+            # are equivalent.
+            persons_gaining_symptom = severe_index[
+                rng.uniform(size=len(severe_index))
+                < (
+                    range_symp.at[symptom, "prop_lower"]
+                    + range_symp.at[symptom, "prop_upper"]
+                ) / 2
+            ]
+            # schedule symptom onset
+            self.sim.modules["SymptomManager"].change_symptom(
+                person_id=persons_gaining_symptom,
+                symptom_string=symptom,
+                add_or_remove="+",
+                disease_module=self,
+                duration_in_days=None,
+            )


Previously, the decision of whether a person gained a symptom was made based on whether a random variable drawn from Uniform(0, 1) was less than a second independent random variable drawn from Uniform(a, b), where 0 < a < b < 1, with a the lower bound and b the upper bound of a range for a probability. The marginal probability of the symptom-gaining event under this two-stage process is (a + b) / 2, so I have switched to drawing a single set of Uniform(0, 1) variables here. Probabilistically the behaviour should be identical to before. However, as the sequence of random variates drawn from the pseudo-random number generator will differ, the exact decisions made for a given starting seed will differ too, so, for example, the final population dataframe checksums for equivalent runs will not match before and after the changes in this PR.
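The marginal probability follows from conditioning on p: given p, the event u < p has probability p, so averaging over p ~ Uniform(a, b) gives E[p] = (a + b) / 2. A quick Monte Carlo check of this, as a sketch with arbitrarily chosen values for a, b and the seed (the function names are illustrative only):

```python
import numpy as np


def two_stage(rng, a, b, size):
    """Compare Uniform(0, 1) draws against independent Uniform(a, b) draws."""
    return rng.uniform(size=size) < rng.uniform(low=a, high=b, size=size)


def one_stage(rng, a, b, size):
    """Compare Uniform(0, 1) draws against the fixed midpoint (a + b) / 2."""
    return rng.uniform(size=size) < (a + b) / 2


rng = np.random.default_rng(1234)
a, b, n = 0.2, 0.6, 1_000_000

freq_two_stage = two_stage(rng, a, b, n).mean()
freq_one_stage = one_stage(rng, a, b, n).mean()

# Both empirical frequencies should be close to (a + b) / 2.
print(freq_two_stage, freq_one_stage, (a + b) / 2)
```

The two schemes agree in the frequency of the event, but consume different numbers of random variates, which is why results for a fixed seed are not reproduced exactly.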

Tagging @tdm32 here just to confirm if this is the logic intended originally.

matt-graham · 2021-11-08T10:18:35Z

Profiling results suggest these changes have actually slowed things down 😬. A 5 year profiled run of scale_run.py before the changes in this PR (on ed25ab1) took 6280s in total, with 660s spent in SymptomManager.clear_symptoms and 789s spent in SymptomManager.change_symptom. The same run after the changes in this PR (on 4fe211a) took 6404s in total, with 83s spent in SymptomManager.clear_symptoms and 1148s spent in SymptomManager.change_symptom. Currently investigating what is causing this and trying to figure out a solution!

matt-graham · 2021-11-16T11:51:43Z

I finally managed to track down what was causing the performance regression. As well as extending SymptomManager.clear_symptoms and SymptomManager.change_symptom to allow batched updates, I also changed how the indexing operations to select the relevant subsets of individuals to apply changes to / use in tests were performed, with the (it turned out) naïve impression that my changes would be faster.

Specifically I changed

person_id = df.index[df.is_alive & (df.index.isin(person_id))]

to

persons = df.loc[person_id]
person_id = persons[persons.is_alive].index

in SymptomManager.change_symptom. I thought that first selecting only the rows corresponding to person_id by directly indexing with the list of indices, and then filtering by is_alive on this smaller subset, would be quicker than computing a boolean index on the whole dataframe as df.is_alive & (df.index.isin(person_id)). This turned out to be a bad assumption: isin is much quicker than I'd assumed, and df.loc is much slower than performing the indexing operations on df.index.
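The two selection strategies can be compared on synthetic data as below. This is only a sketch: the dataframe here has a single hypothetical is_alive column, far fewer columns than the real population dataframe, so the timings it prints are indicative at best and will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd


def select_via_isin(df, person_id):
    """Boolean mask over the whole dataframe, applied to the index."""
    return df.index[df.is_alive & df.index.isin(person_id)]


def select_via_loc(df, person_id):
    """Select rows by label first, then filter the smaller frame."""
    persons = df.loc[person_id]
    return persons[persons.is_alive].index


rng = np.random.default_rng(0)
df = pd.DataFrame({"is_alive": rng.random(10_000) < 0.9})
person_id = rng.choice(df.index, size=500, replace=False)

# Both approaches select the same persons (the ordering may differ).
same_persons = set(select_via_isin(df, person_id)) == set(select_via_loc(df, person_id))
print("same persons selected:", same_persons)

# Rough timings; no particular outcome is assumed here.
print("isin:", timeit.timeit(lambda: select_via_isin(df, person_id), number=50))
print("loc: ", timeit.timeit(lambda: select_via_loc(df, person_id), number=50))
```

Note that the isin version returns IDs in dataframe order while the df.loc version returns them in the order given in person_id, so the results match as sets rather than element by element.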

I also changed

df = self.sim.population.props
group_indices = {
    'children': df.index[df.is_alive & (df.age_years < 15)],
    'adults': df.index[df.is_alive & (df.age_years >= 15)]
}
# For each generic symptom, impose it on a random sample of persons who do not have that symptom currently:
for symp in sorted(self.module.generic_symptoms):
    do_not_have_symptom = self.module.who_not_have(symptom_string=symp)
    for group in ['children', 'adults']:
        ...
        persons_eligible_to_get_symptom = group_indices[group][
            group_indices[group].isin(do_not_have_symptom)
        ]
        ...

to

df = self.sim.population.props
group_selector = {'children': df.age_years < 15, 'adults': df.age_years >= 15}
# For each generic symptom, impose it on a random sample of persons who do not have that symptom currently:
for symp in sorted(self.module.generic_symptoms):
    do_not_have_symptom = self.module.who_not_have(symptom_string=symp)
    for group in ['children', 'adults']:
        ...
        eligible_to_get_symptom = group_selector[group] & do_not_have_symptom
        persons_eligible_to_get_symptom = df.index[eligible_to_get_symptom]
        ...

in SymptomManager.clear_symptoms. Here I thought that performing everything with boolean indices and avoiding isin would be quicker but again this turned out to be wrong.
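That the index-with-isin form and the all-boolean form pick out the same persons can be checked on synthetic data. This sketch assumes a hypothetical has_symptom column standing in for the result of who_not_have, returned as an index in the first form and as a boolean series in the second; the real method's return type is not shown in this thread.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 10_000
df = pd.DataFrame(
    {
        "is_alive": rng.random(n) < 0.9,
        "age_years": rng.integers(0, 90, size=n),
        "has_symptom": rng.random(n) < 0.1,  # hypothetical stand-in column
    }
)

# Index-based form: precomputed group index filtered with isin.
not_have_index = df.index[df.is_alive & ~df.has_symptom]
children_index = df.index[df.is_alive & (df.age_years < 15)]
eligible_via_isin = children_index[children_index.isin(not_have_index)]

# Boolean form: combine masks, then index once.
not_have_mask = df.is_alive & ~df.has_symptom
eligible_via_mask = df.index[(df.age_years < 15) & not_have_mask]

print(eligible_via_isin.tolist() == eligible_via_mask.tolist())
```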

Prior to the changes in this PR (ed25ab1) a 5 year profiled run of scale_run.py took 6280s in total with 789s spent in SymptomManager.change_symptom and ~515s spent in SymptomManager.clear_symptoms (excluding time spent in SymptomManager.change_symptom which was previously called within SymptomManager.clear_symptoms).

After the updates to revert the changes causing the slowdown detailed above (33ad2a5), a 5 year profiled run of scale_run.py took 5591s in total with 421s spent in SymptomManager.change_symptom and 83s spent in SymptomManager.clear_symptoms.

I've also added some further tests to tests/test_symptommanager.py to check using change_symptom and clear_symptoms to perform batch updates (and also to add some unit tests for methods which didn't previously have separate tests).

tbhallett

Looks great to me! Thanks @matt-graham
Please could you also update the brief documentation we have on SymptomManager to explain the new functionality?

src/tlo/methods/measles.py

src/tlo/methods/symptommanager.py

matt-graham · 2021-11-17T12:49:05Z

> Looks great to me! Thanks @matt-graham Please could you also update the brief documentation we have on SymptomManager to explain the new functionality?

Thanks @tbhallett. Yes I will update the wiki with a description of the new functionality.

matt-graham · 2021-11-17T15:36:28Z

> > Looks great to me! Thanks @matt-graham Please could you also update the brief documentation we have on SymptomManager to explain the new functionality?
>
> Thanks @tbhallett. Yes I will update the wiki with a description of the new functionality.

Now added some documentation at https://github.com/UCL/TLOmodel/wiki/Symptoms-and-the-SymptomManager (also changed the name of the page to reflect its now more general purpose)

tamuri

Looks great. Love the overhauled BitsetHandler.

matt-graham added 14 commits November 2, 2021 15:18

Simplifying bitset logic

1afb8dc

Split up bitset handler tests into smaller units

c703d49

Add additional sanity checks for bitset handler

1682485

Generalise bitset handler to allow for updating multiple columns

b848eb2

Remove unused compress method

4f149a9

Add tests for multiple column operations

7e3c512

Allow more generic sequence types

9fc5d51

Reduce number of dataframe accesses

235265f

Fix typo in comment

fa0f9ec

Use new multi-col bitset handler in symptom manager

2341714

Generalise change_symptom to allow updating multiple symptoms

7455d55

Improve efficiency of symptom onset and resolve events

b3a81c0

Change clear_symptoms to batch update symptoms and persons

922c020

Update disease modules to batch change symptoms

4fe211a

matt-graham marked this pull request as draft November 8, 2021 10:19

matt-graham added 2 commits November 16, 2021 09:42

Reverting some changes based on profiling

33ad2a5

Additional tests for symptom manager

2bbe849

matt-graham marked this pull request as ready for review November 16, 2021 11:28

Change import order to satisfy isort

5c302af

matt-graham requested review from tbhallett and tamuri November 16, 2021 13:49

tbhallett approved these changes Nov 16, 2021

Changes from review suggestions

1dfabbf

Access module more cleanly in measles event Remove redundant assertion in change_symptoms

tamuri approved these changes Nov 23, 2021

tamuri merged commit 89e9b50 into master Nov 23, 2021

tamuri deleted the mmg/batch-symptom-change branch November 23, 2021 10:47
