Caching appointment time requests #322
Conversation
Cache computed appointment time requests generated from `EXPECTED_APPT_FOOTPRINT` in the `HSI_Event` instance.
@matt-graham this is great. Really amazed at the high number of calls and the magnitude of this improvement.
Yes I was also surprised at the number of calls! I plan to do a bit of digging to figure out why so many calls are being generated in the first place and see if there are any potential optimizations to the scheduler which could help reduce this load, but this seemed a reasonable interim solution, and turned out to be surprisingly effective 😅.
Thanks, that is good to know! We could potentially semi-enforce this by making the `EXPECTED_APPT_FOOTPRINT` and `target` attributes immutable.
Thanks. Just to note that the configuration of that profiling script is designed to make the capacity of the healthsystem low, meaning that many appointments are forced to be held over (so it's a worst case, by design). It might be (I hope it's not) because many instances of a kind of appointment are scheduled that is somehow never possible, and the HSI has no `tclose` argument, meaning that it's tried, and frustrated, on every day for the rest of the simulation. Maybe the default for `tclose` should be less generous than that: e.g. one week after the `topen`? But I will be led by your findings, of course.
Cool. I'd welcome that enforcement. Thanks very much.
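One way such enforcement could look (a hypothetical sketch, not actual TLOmodel code — the class and footprint values here are illustrative) is to expose the footprint as a read-only property backed by a private attribute, so accidental reassignment fails loudly:

```python
# Hypothetical sketch (not actual TLOmodel code) of "semi-enforcing"
# immutability: expose EXPECTED_APPT_FOOTPRINT as a read-only property
# backed by a private attribute, so reassignment raises AttributeError.


class HSIEventSketch:
    def __init__(self, expected_appt_footprint):
        # frozenset makes the stored footprint itself immutable too
        self._expected_appt_footprint = frozenset(expected_appt_footprint)

    @property
    def EXPECTED_APPT_FOOTPRINT(self):
        return self._expected_appt_footprint


event = HSIEventSketch({("Over5OPD", 1)})
print(event.EXPECTED_APPT_FOOTPRINT)
try:
    event.EXPECTED_APPT_FOOTPRINT = frozenset()  # reassignment is blocked
except AttributeError as err:
    print("rejected:", err)
```

This would not stop in-place mutation of a mutable footprint object, which is why the sketch also freezes the stored value.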
From an attempt at making the `EXPECTED_APPT_FOOTPRINT` and `target` attributes immutable: there are a series of cases in the tests where these attributes are changed after creation, in

TLOmodel/tests/test_dxmanager.py Lines 253 to 259 in 7416d56

TLOmodel/tests/test_dxmanager.py Lines 263 to 271 in 7416d56

TLOmodel/tests/test_dxmanager.py Lines 310 to 324 in 7416d56
There is also a case in

TLOmodel/src/tlo/methods/postnatal_supervisor.py Lines 1164 to 1209 in 7416d56

where it appears that the `target` of an existing `HSI_Event` is changed after creation. In both sets of cases it seems an alternative solution that wouldn't require mutating the events should be possible. An alternative (or complementary) approach would be to make the caching mechanism more intelligent, by making the cache a dictionary keyed by a tuple of the relevant attributes of the event and the footprint used to generate the request (i.e. the `appt_footprint = hsi_event.EXPECTED_APPT_FOOTPRINT if actual_appt_footprint is None else actual_appt_footprint` value). This caching mechanism would probably have more overhead than the current simpler approach due to the dictionary indexing operations and construction of the key. Equally, it would potentially result in fewer cache misses (post #323 being merged) if we took the latter approach of allowing keying by the `actual_appt_footprint`.
Okay, obviously that is just me doing the testing in a clumsy/lazy way: this would be considered an abuse of the HSI system rather than its use! I suppose that what I should have done is create an HSI event specific to each person rather than "recycling" the same HSI event.
So, I'm not really clear on what is going on here, so I have to call on @joehcollins again. @joehcollins - is it intentional that the HSI target is switched between the mother and child in this way? Is it being done to overcome some constraint you hit in the DxManager, perhaps? It's hard to judge or think about the other solutions without understanding this use case. I have a feeling that this may not be necessary.
This sounds reasonable to me, but I defer to @matt-graham and @tamuri to make the judgement call (once it's been established that we actually need such a mechanism to accommodate the target being changed -- my instinct at the moment being that we don't).
@tbhallett - yes, this was because in this version of the code I was delivering interventions to newborns and mothers within the same HSI (assuming that postnatal care would be delivered to both individuals as they would attend together, and to keep the number of HSIs down), so I had to change the HSI target when calling the dx_test for it to work on the newborn (maybe that was a bit hacky). Conveniently, I've actually removed this logic in the version that I'm currently working on, and now I do have a separate PNC HSI for mothers and newborns (as coverage is actually different for mothers and newborns). I've checked my modules and there's nowhere else I'm changing the HSI target.
Thanks @joehcollins -- great that this doesn't happen anymore. We could have made the simple change discussed above to accommodate it, but it sounds like that's no longer needed. @matt-graham - what's the best way forward?
Thanks @joehcollins and @tbhallett for the explanations! My inclination would probably be to switch to the more complicated caching mechanism based on the event attributes, rather than trying to make the relevant attributes immutable. While (modulo needing to make the changes to the test cases identified above) the simpler mechanism would work for now, keying the cache on the attributes should be more robust to future changes.
Profiling results from a run with the updated cache mechanism suggested above indicate that the overhead from constructing the key and the dictionary accesses is minimal. Specifically, for a 1 year run of `scale_run.py`, and concentrating specifically on the time spent in `get_appt_footprint_as_time_request`: compared to the previous simpler caching mechanism, there appears to be a greater speedup in the overall breakdown, though as the time spent in `get_appt_footprint_as_time_request` is now small in either case, the difference is marginal.
Profiling of a 1 year run of `scale_run.py` shows that considerable time is still being spent in `HealthSystem.get_appt_footprint_as_time_request`, even with the optimizations in #287, #303 and #313. The SnakeViz output for a 1 year `scale_run.py` run shows an overall breakdown with `HealthSystem.get_appt_footprint_as_time_request` corresponding to the dark orange bar beneath the light blue `healthsystem.py:1394(<listcomp>)` bar on the far left, with the breakdown within `HealthSystem.get_appt_footprint_as_time_request` as follows. Most of the time seems to be spent in dataframe access operations in `HealthSystem.get_facility_info`.

The high number of calls to `get_appt_footprint_as_time_request` (149496460 in this 1 year run) suggests, I think, that it is being called multiple times for each `HSI_Event`, due to the event being put in the hold-over list for running on subsequent days by the health system scheduler. This suggests we can cache the appointment time request generated on the first call to `get_appt_footprint_as_time_request` for an HSI event within the event, and return it on any subsequent calls on the same event, this idea being the basis of this PR.

In general the generated time request can depend on both an HSI event and an appointment footprint which may differ from the `EXPECTED_APPT_FOOTPRINT` of the event. For the sake of simplicity, here I only cache time requests generated from the `EXPECTED_APPT_FOOTPRINT` footprint rather than any updated footprint passed via the `actual_appt_footprint` argument to `get_appt_footprint_as_time_request`, with a non-`None` value for this argument causing the time request to always be recomputed.

The caching mechanism here assumes that the `HSI_Event` is essentially static after creation - in particular, it assumes that neither the `EXPECTED_APPT_FOOTPRINT` nor `target` attributes are changed after initial creation, so that a previously computed time request remains valid. There is also an assumption that the district of residence of the target cannot change (as this would change the facility information returned by `get_facility_info`). As far as I can tell these assumptions are currently valid, backed up by the final population dataframe after a 1 year run being the same before and after the changes in this PR. These assumptions could however be invalidated by future changes to the code. While we could potentially make the `EXPECTED_APPT_FOOTPRINT` and `target` attributes of a `HSI_Event` immutable to guard against this, that would still not protect against the possibility of the target's district being updated in the dataframe (and explicitly checking that this value is unchanged to decide if the cached time request is valid would remove much of the performance gain, given the proportion of time currently spent in `get_facility_info`).

The SnakeViz output for a 1 year `scale_run.py` run after the changes made in this PR [1] gives the following overall breakdown, and looking specifically at `get_appt_footprint_as_time_request`, we can see that the time spent in `get_appt_footprint_as_time_request` has been drastically reduced (from 4330s to 120s).

1. The profiling run was in fact for a slightly different version of the code in this PR which didn't create a `_cached_time_request` attribute with value `None` in the `__init__` method of `HSI_Event`, but instead simply created this attribute after the first call to `get_appt_footprint_as_time_request` and checked for the presence of this attribute with `hasattr` when deciding whether to return a cached value, hence the presence of the `hasattr` block in the SnakeViz output and the slight mismatch with the line numbers.
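The per-event caching described in this PR can be sketched roughly as follows (illustrative class and a stub computation; the real logic lives in `HealthSystem.get_appt_footprint_as_time_request`, and the footprint values here are made up):

```python
# Rough sketch of the per-event cache: the first call with the default
# footprint computes and stores the time request on the event; later calls
# return the stored value. A non-None actual_appt_footprint bypasses the
# cache entirely and is always recomputed, as described in the PR.


def _compute_time_request(target, appt_footprint):
    # Stand-in for the expensive dataframe lookups in get_facility_info.
    return {appt_type: float(n) for appt_type, n in appt_footprint.items()}


class HSIEventSketch:
    def __init__(self, target, expected_appt_footprint):
        self.target = target
        self.EXPECTED_APPT_FOOTPRINT = expected_appt_footprint
        self._cached_time_request = None  # populated on first call


def get_appt_footprint_as_time_request(hsi_event, actual_appt_footprint=None):
    if actual_appt_footprint is not None:
        # Updated footprints are never cached.
        return _compute_time_request(hsi_event.target, actual_appt_footprint)
    if hsi_event._cached_time_request is None:
        hsi_event._cached_time_request = _compute_time_request(
            hsi_event.target, hsi_event.EXPECTED_APPT_FOOTPRINT
        )
    return hsi_event._cached_time_request


event = HSIEventSketch(target=7, expected_appt_footprint={"ANCSubsequent": 2})
first = get_appt_footprint_as_time_request(event)
second = get_appt_footprint_as_time_request(event)  # returned from the cache
```

As the PR notes, this sketch is only correct while `target`, `EXPECTED_APPT_FOOTPRINT`, and the target's district stay fixed after the event is created.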