equipment pkgs #1378

EvaJanouskova · 2024-05-28T16:13:53Z

No description provided.

tbhallett

As these call (to lookup_item_codes_from_pkg_name()) are relatively "expensive" (string comparisons for every row in the pandas data frame), I think we want to avoid repeating them. However, as written currently, every time the HSI_Event is run, we have to do the look-up.

So my proposal would be to do this:

In Equipment.__init__(), add something like:

self._pkg_lookup = self._create_pkg_lookup()

In self._create_pkg_lookup(), use the ResourceFile to return a dict of the form: {PackageName: Set(item_codes_in_package)}.
Re-write lookup_item_codes_from_pkg_name() so that it is looking inside that dict (self._pkg_lookup) rather than searching through the ResourceFile. e.g.

def lookup_item_codes_from_pkg_name(self, pkg_name: str) -> Set[int]:
   if pkg_name in self._pkg_lookup:
        return self._pkg_lookup.get(pkg_name)
   else:
        raise ValueError(f"Package Name not recognised: {pkg_name}")

Lastly, we could rename lookup_item_codes_from_pkg_name to from_pkg_name to make it slicker.

What do you think?

EvaJanouskova · 2024-05-28T23:30:43Z

@tbhallett, I mostly agree. Only to call the pkgs, it is still too long:
self.add_equipment(self.healthcare_system.equipment.from_pkg_name('Major Surgery'))

I think, we could make it more user friendly, if we have either

two fncs: add_equipment() and add_equipment_from_pkg(), or
an input parametr for add_equipment() fnc, allowing to use pkg_names instead of item_names, smt like
def add_equipment(self, item_codes: Union[int, str, Iterable[int], Iterable[str]], from_pkg=False) -> None:

Also, shouldn't we call the input item_codes differently since it can be code(s) or name(s) of the item(s)? Maybe just items or equip_items?

tbhallett · 2024-05-29T07:48:10Z

@tbhallett, I mostly agree. Only to call the pkgs, it is still too long:

self.add_equipment(self.healthcare_system.equipment.from_pkg_name('Major Surgery'))

I think, we could make it more user friendly, if we have either

two fncs: add_equipment() and add_equipment_from_pkg(), or

an input parametr for add_equipment() fnc, allowing to use pkg_names instead of item_names, smt like

def add_equipment(self, item_codes: Union[int, str, Iterable[int], Iterable[str]], from_pkg=False) -> None:

I know what you mean, but I think it's ok. That statement may be a bit long to write but is the most transparent and avoids us making big changes to the equipment module at this point.

(Originally the two steps were intended to be separate (like consumables) where the module looks up the item codes and save them, and the HSI EVENT just refers to the save information on the module. But now we have the cached look-up this is not necessary.)

I think the ultimate thing would be just to pass in a package name string as any other string and for it to be automatically detected as being a package.

Please could you raise this as an issue? (Could click "create issue from comment" on this box!)

Also, shouldn't we call the input item_codes differently since it can be code(s) or name(s) of the item(s)? Maybe just items or equip_items?

Yes, good point. We could call the argument where we accept int
or str just 'items'. This can be a separate PR to tidy that up if you'd like to raise it. Thanks.

EvaJanouskova · 2024-05-30T23:13:16Z

I did rebase this on hallett/equipment_changes_and_structure and updated the test_core_functionality_of_equipment_class in test_equipment along with that.

It seems my commits broke the tests. I'm looking into it but so far no success.

EvaJanouskova · 2024-05-30T23:32:44Z

I did rebase this on hallett/equipment_changes_and_structure and updated the test_core_functionality_of_equipment_class in test_equipment along with that.

It seems my commits broke the tests. I'm looking into it but so far no success.

I found one thing: I added two items to the RF_EquipmentCatalogue, but they also need to be added in the data_availability resource file. Feel free to proceed with that if you find time before I do. (I'll continue tomorrow in the late morning/early afternoon.)

@sakshi, @tbhallett, I added the new items (last 2 lines: Endoscope, item code 402; Electrocardiogram, item code 403) of equipment in the ResourceFile_EquipmentCatalogue.csv which were added in the original PR of equipment implementation but were lost in the process of creating the new PR.

Could you please add these items along with estimated availability into the script generating the ResourceFile_Equipment_Availability_Estimates.csv and regenerate this RF?

joehcollins · 2024-05-31T10:54:03Z

@EvaJanouskova - i've added ANC packages but the tests are failing due to an assert error at line 126 of equipment.py. Are test passing on your machine?

EvaJanouskova · 2024-05-31T11:04:11Z

@EvaJanouskova - i've added ANC packages but the tests are failing due to an assert error at line 126 of equipment.py. Are test passing on your machine?

Thank you Joe.

No, the tests are failing because of what I state in the comment above.

If you like to test it on your machine, you can remove last 2 lines in the ResourceFile_EquipmentCatalogue.csv and then all the tests in test_equipment should pass.

…within multiple pkgs

…t its functionality diff --git src/tlo/methods/equipment.py src/tlo/methods/equipment.py index 1759a994f..099226748 100644 --- src/tlo/methods/equipment.py +++ src/tlo/methods/equipment.py @@ -253,8 +253,17 @@ class Equipment: It is expected that this is used by the disease module once and then the resulting equipment item_codes are saved on the module.""" df = self.catalogue + item_codes = set() - if pkg_name not in df['Pkg_Name'].unique().split(", "): + item_codes.update(df.loc[df['Pkg_Name'] == pkg_name, 'Item_Code'].values) + + all_pkg_names = set(df['Pkg_Name'].unique()[~pd.isnull(df['Pkg_Name'].unique())]) + all_multiple_pkg_names = [name for name in all_pkg_names if ", " in name] + for multiple_pkg_name in all_multiple_pkg_names: + if pkg_name in multiple_pkg_name.split(", "): + item_codes.update(df.loc[df['Pkg_Name'] == multiple_pkg_name, 'Item_Code'].values) + + if item_codes: + return item_codes + else: raise ValueError(f'That Pkg_Name is not in the catalogue: {pkg_name=}') - - return set(df.loc[df['Pkg_Name'] == pkg_name, 'Item_Code'].values) diff --git tests/test_equipment.py tests/test_equipment.py index a02ea282f..a39124454 100644 --- tests/test_equipment.py +++ tests/test_equipment.py @@ -22,14 +22,18 @@ def test_core_functionality_of_equipment_class(seed): # Create toy data catalogue = pd.DataFrame( + # PkgWith0+1 stands alone or as multiple pkgs for one item; PkgWith1 is only as multiple pkgs + # for one item; PkgWith3 only stands alone [ {"Item_Description": "ItemZero", "Item_Code": 0, "Pkg_Name": 'PkgWith0+1'}, - {"Item_Description": "ItemOne", "Item_Code": 1, "Pkg_Name": 'PkgWith0+1'}, + {"Item_Description": "ItemOne", "Item_Code": 1, "Pkg_Name": 'PkgWith0+1, PkgWith1'}, {"Item_Description": "ItemTwo", "Item_Code": 2, "Pkg_Name": float('nan')}, + {"Item_Description": "ItemThree", "Item_Code": 3, "Pkg_Name": float('PkgWith3')}, ] ) data_availability = pd.DataFrame( - # item 0 is not available anywhere; item 1 is available everywhere; item 2 is available only at facility_id=1 + # item 0 is not available anywhere; item 1 is available everywhere; item 2 is available only at facility_id=1; + # availability not defined for item 3 [ {"Item_Code": 0, "Facility_ID": 0, "Pr_Available": 0.0}, {"Item_Code": 0, "Facility_ID": 1, "Pr_Available": 0.0}, @@ -134,17 +138,22 @@ def test_core_functionality_of_equipment_class(seed): # Lookup the item_codes that belong in a particular package. # - When package is recognised - assert {0, 1} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith0+1') # these items are in the same - # package + # if items are in the same package (once standing alone, once within multiple pkgs defined for item) + assert {0, 1} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith0+1') + # if the pkg within multiple pkgs defined for item + assert {1} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith1') + # if the pkg only stands alone + assert {3} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith3') + # - Error thrown when package is not recognised with pytest.raises(ValueError): eq_default.lookup_item_codes_from_pkg_name(pkg_name='') - equipment_item_code_that_is_available = [0, 1, ] equipment_item_code_that_is_not_available = [2, 3,] + def run_simulation_and_return_log( seed, tmpdir, equipment_in_init, equipment_in_apply ) -> Dict:

…pensive" version, lookup fnc renamed to from_pkg_names() diff --git src/tlo/methods/equipment.py src/tlo/methods/equipment.py index 099226748..3a4fb24ba 100644 --- src/tlo/methods/equipment.py +++ src/tlo/methods/equipment.py @@ -1,6 +1,6 @@ import warnings from collections import defaultdict -from typing import Counter, Iterable, Literal, Set, Union +from typing import Counter, Dict, Iterable, Literal, Set, Union import numpy as np import pandas as pd @@ -77,6 +77,7 @@ class Equipment: # - Data structures for quick look-ups for items and descriptors self._item_code_lookup = self.catalogue.set_index('Item_Description')['Item_Code'].to_dict() + self._pkg_lookup = self._create_pkg_lookup() self._all_item_descriptors = set(self._item_code_lookup.keys()) self._all_item_codes = set(self._item_code_lookup.values()) self._all_fac_ids = self.master_facilities_list['Facility_ID'].unique() @@ -248,22 +249,37 @@ class Equipment: data=row.to_dict(), ) - def lookup_item_codes_from_pkg_name(self, pkg_name: str) -> Set[int]: - """Convenience function to find the set of item_codes that are grouped under a package name in the catalogue. - It is expected that this is used by the disease module once and then the resulting equipment item_codes are - saved on the module.""" - df = self.catalogue - item_codes = set() - - item_codes.update(df.loc[df['Pkg_Name'] == pkg_name, 'Item_Code'].values) - - all_pkg_names = set(df['Pkg_Name'].unique()[~pd.isnull(df['Pkg_Name'].unique())]) - all_multiple_pkg_names = [name for name in all_pkg_names if ", " in name] - for multiple_pkg_name in all_multiple_pkg_names: - if pkg_name in multiple_pkg_name.split(", "): - item_codes.update(df.loc[df['Pkg_Name'] == multiple_pkg_name, 'Item_Code'].values) - - if item_codes: - return item_codes + def from_pkg_names(self, pkg_names: Union[str, Iterable[str]]) -> Set[int]: + """Convenience function to find the set of item_codes that are grouped under requested package name(s) in the + catalogue.""" + # Make into a set if it is not one already + if isinstance(pkg_names, (str, int)): + pkg_names = set([pkg_names]) else: - raise ValueError(f'That Pkg_Name is not in the catalogue: {pkg_name=}') + pkg_names = set(pkg_names) + + item_codes = set() + for pkg_name in pkg_names: + if pkg_name in self._pkg_lookup.keys(): + item_codes.update(self._pkg_lookup[pkg_name]) + else: + raise ValueError(f'That Pkg_Name is not in the catalogue: {pkg_name=}') + + return item_codes + + def _create_pkg_lookup(self) -> Dict[str, Set[int]]: + df = self.catalogue + pkg_lookup = dict() + + pkg_names_raw = set(df['Pkg_Name'].unique()[~pd.isnull(df['Pkg_Name'].unique())]) + all_multiple_pkg_names = set(name for name in pkg_names_raw if ", " in name) + all_pkg_names = pkg_names_raw - all_multiple_pkg_names + for pkg_name in all_pkg_names: + pkg_lookup[pkg_name] = set(df.loc[df['Pkg_Name'] == pkg_name, 'Item_Code'].values) + for multiple_pkg_name in all_multiple_pkg_names: + for pkg_name in multiple_pkg_name.split(", "): + if pkg_name not in all_pkg_names: + pkg_lookup[pkg_name] = set() + all_pkg_names.update({pkg_name}) + pkg_lookup[pkg_name].update(set(df.loc[df['Pkg_Name'] == multiple_pkg_name, 'Item_Code'].values)) + return pkg_lookup diff --git tests/test_equipment.py tests/test_equipment.py index a39124454..0a462f08f 100644 --- tests/test_equipment.py +++ tests/test_equipment.py @@ -139,15 +139,15 @@ def test_core_functionality_of_equipment_class(seed): # Lookup the item_codes that belong in a particular package. # - When package is recognised # if items are in the same package (once standing alone, once within multiple pkgs defined for item) - assert {0, 1} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith0+1') + assert {0, 1} == eq_default.from_pkg_names(pkg_name='PkgWith0+1') # if the pkg within multiple pkgs defined for item - assert {1} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith1') + assert {1} == eq_default.from_pkg_names(pkg_name='PkgWith1') # if the pkg only stands alone - assert {3} == eq_default.lookup_item_codes_from_pkg_name(pkg_name='PkgWith3') + assert {3} == eq_default.from_pkg_names(pkg_name='PkgWith3') # - Error thrown when package is not recognised with pytest.raises(ValueError): - eq_default.lookup_item_codes_from_pkg_name(pkg_name='') + eq_default.from_pkg_names(pkg_names='') equipment_item_code_that_is_available = [0, 1, ]

…cified pkgs

EvaJanouskova · 2024-05-31T11:14:39Z

rebased on hallett/equipment_changes_and_structure branch

tbhallett · 2024-05-31T11:34:51Z

I did rebase this on hallett/equipment_changes_and_structure and updated the test_core_functionality_of_equipment_class in test_equipment along with that.
It seems my commits broke the tests. I'm looking into it but so far no success.

I found one thing: I added two items to the RF_EquipmentCatalogue, but they also need to be added in the data_availability resource file. Feel free to proceed with that if you find time before I do. (I'll continue tomorrow in the late morning/early afternoon.)

@sakshi, @tbhallett, I added the new items (last 2 lines: Endoscope, item code 402; Electrocardiogram, item code 403) of equipment in the ResourceFile_EquipmentCatalogue.csv which were added in the original PR of equipment implementation but were lost in the process of creating the new PR.

Could you please add these items along with estimated availability into the script generating the ResourceFile_Equipment_Availability_Estimates.csv and regenerate this RF?

The file is this one: src/scripts/data_file_processing/healthsystem/equipment/equipment_availability_estimation.py. It should run for you. It will automatically incorproate item_codes you have added to the catalogue.

EvaJanouskova · 2024-05-31T12:19:29Z

I did rebase this on hallett/equipment_changes_and_structure and updated the test_core_functionality_of_equipment_class in test_equipment along with that.
It seems my commits broke the tests. I'm looking into it but so far no success.

I found one thing: I added two items to the RF_EquipmentCatalogue, but they also need to be added in the data_availability resource file. Feel free to proceed with that if you find time before I do. (I'll continue tomorrow in the late morning/early afternoon.)

@sakshi, @tbhallett, I added the new items (last 2 lines: Endoscope, item code 402; Electrocardiogram, item code 403) of equipment in the ResourceFile_EquipmentCatalogue.csv which were added in the original PR of equipment implementation but were lost in the process of creating the new PR.
Could you please add these items along with estimated availability into the script generating the ResourceFile_Equipment_Availability_Estimates.csv and regenerate this RF?

The file is this one: src/scripts/data_file_processing/healthsystem/equipment/equipment_availability_estimation.py. It should run for you. It will automatically incorproate item_codes you have added to the catalogue.

Doesn't it require to define availability for those items somewhere? If so, where and to what values?

It gives me this error message:

Traceback (most recent call last):
  File "/home/eva/PycharmProjects/TLOmodel/src/scripts/data_file_processing/healthsystem/equipment/equipment_availability_estimation.py", line 337, in <module>
    assert final_equipment_availability_export_full.notnull().all()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError

…cript

tbhallett reviewed May 28, 2024

View reviewed changes

EvaJanouskova force-pushed the hallett/equipment_changes_and_structure_UpdateEJ branch from 7745176 to 6687db9 Compare May 30, 2024 22:22

EvaJanouskova mentioned this pull request May 31, 2024

Equipment: Integration into modules #1341

Merged

EvaJanouskova added 10 commits May 31, 2024 12:12

equip: update lookup pkg names to consider option of having one item …

23251d3

…within multiple pkgs

RF_EquipCatalogue: pkgs defined

5ee5535

modules: usages of equip pkgs included

8ad31d6

RF_EquipCatalogue: Endoscope and ECG added

cacba8b

equip: warning messages clarified

3aae961

equip lookup fnc renamed to from_pkg_names()

297deaf

test_equip: fixes from rebase

f592e2f

[no ci] test_equip: test correct item codes returned for multiple spe…

7e5940f

…cified pkgs

EvaJanouskova force-pushed the hallett/equipment_changes_and_structure_UpdateEJ branch from b34fb01 to 7e5940f Compare May 31, 2024 11:13

joehcollins and others added 3 commits May 31, 2024 13:48

add ANC packages

696db7a

edit equipment_availability_estimation.py to manually add cost category

c762d6a

upate ResourceFile_Equipment_Availability_Estimates.csv running new s…

ba478ce

…cript

tbhallett merged commit 135fc2e into hallett/equipment_changes_and_structure May 31, 2024
10 of 11 checks passed

tbhallett deleted the hallett/equipment_changes_and_structure_UpdateEJ branch May 31, 2024 15:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

equipment pkgs #1378

equipment pkgs #1378

EvaJanouskova commented May 28, 2024

tbhallett left a comment •

edited

Loading

EvaJanouskova commented May 28, 2024 •

edited

Loading

tbhallett commented May 29, 2024

EvaJanouskova commented May 30, 2024

EvaJanouskova commented May 30, 2024 •

edited

Loading

joehcollins commented May 31, 2024

EvaJanouskova commented May 31, 2024 •

edited

Loading

EvaJanouskova commented May 31, 2024

tbhallett commented May 31, 2024

EvaJanouskova commented May 31, 2024

equipment pkgs #1378

equipment pkgs #1378

Conversation

EvaJanouskova commented May 28, 2024

tbhallett left a comment • edited Loading

Choose a reason for hiding this comment

EvaJanouskova commented May 28, 2024 • edited Loading

tbhallett commented May 29, 2024

EvaJanouskova commented May 30, 2024

EvaJanouskova commented May 30, 2024 • edited Loading

joehcollins commented May 31, 2024

EvaJanouskova commented May 31, 2024 • edited Loading

EvaJanouskova commented May 31, 2024

tbhallett commented May 31, 2024

EvaJanouskova commented May 31, 2024

tbhallett left a comment •

edited

Loading

EvaJanouskova commented May 28, 2024 •

edited

Loading

EvaJanouskova commented May 30, 2024 •

edited

Loading

EvaJanouskova commented May 31, 2024 •

edited

Loading