Convert excel files to csv files epi_hiv_malaria_measles_schisto_tb #1484

Merged
merged 13 commits into from
Dec 17, 2024


@mnjowe (Collaborator) commented Oct 9, 2024

This PR converts the Excel resource files for epi, hiv, malaria, measles, schisto and tb to CSV files, and reads the newly created CSV files using the new read_csv_files method.

Person Responsible - Tara Mangal

@mnjowe mnjowe requested review from tdm32 and tbhallett October 9, 2024 12:30
@mnjowe mnjowe requested review from matt-graham and tamuri October 10, 2024 08:40
@mnjowe (Collaborator, Author) commented Oct 10, 2024

Hi @tdm32, @tbhallett, @matt-graham and @tamuri. I've started converting the Excel files to CSV and reading them with the read_csv_files method, beginning with Tara's modules in this PR. As the checks above show, one test, tests/test_htm_scaleup.py, is failing for the reason @matt-graham and I discussed before: it reads parameter values directly from the resource file and compares them with the parsed parameter values in the simulation.

Because we have converted the files to CSV, all values are now read as strings, so assert simulation_param_value == resourcefile_param_value will be False: one side is always a string while the other has the type assigned in the read parameters section. Should I convert both values to strings, i.e. assert str(simulation_param_value) == str(resourcefile_param_value)?
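
A minimal sketch of the comparison problem described above, using plain pandas rather than the project's reader. The parameter names and values are made up for illustration, and `dtype=str` is an assumption standing in for a reader that returns every cell as text. It shows that comparing string forms can still fail (for example `"0.1"` against `"0.10"`), whereas casting the raw value to the type of the parsed value compares like with like.

```python
import io

import pandas as pd

# Hypothetical parameter sheet; names and values are illustrative only.
csv_text = "parameter_name,value\nprob_example,0.10\nn_example,5\n"

# dtype=str mimics a reader that returns every cell as text (an assumption).
raw = pd.read_csv(io.StringIO(csv_text), dtype=str).set_index("parameter_name")["value"]

# Values as they might look after a module has parsed them into typed parameters.
parsed = {"prob_example": 0.1, "n_example": 5}


def matches(parsed_value, raw_value):
    """Cast the raw string to the parsed value's type, then compare."""
    return type(parsed_value)(raw_value) == parsed_value


# Direct comparison: float against str.
naive_equal = parsed["prob_example"] == raw["prob_example"]
# String comparison: "0.1" against "0.10".
string_equal = str(parsed["prob_example"]) == raw["prob_example"]
# Typed comparison: float("0.10") against 0.1, int("5") against 5.
typed_equal = all(matches(parsed[name], raw[name]) for name in parsed)
```

The cast in `matches` only suits simple numeric and string types; booleans, lists and dates would need explicit parsing, since for instance `bool("False")` is `True`.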

@mnjowe (Collaborator, Author) commented Oct 10, 2024

@matt-graham, the run checks report Pylint errors in test_hiv. I can't figure out why Pylint is recognising the output of this line as a dictionary instead of a DataFrame. All tests in test_hiv run fine.

tests/test_hiv.py:1216:27: E1101: Instance of 'dict' has no 'loc' member (no-member)
tests/test_hiv.py:1217:9: E1101: Instance of 'dict' has no 'Year' member (no-member)
tests/test_hiv.py:1218:28: E1101: Instance of 'dict' has no 'loc' member (no-member)
tests/test_hiv.py:1219:9: E1101: Instance of 'dict' has no 'Year' member (no-member)
tests/test_hiv.py:1220:26: E1101: Instance of 'dict' has no 'loc' member (no-member)
tests/test_hiv.py:1221:9: E1101: Instance of 'dict' has no 'Year' member (no-member)
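
One possible explanation, offered as an assumption rather than a diagnosis: if a reader can return either a single DataFrame or a dict of DataFrames depending on its arguments, a static checker may pick the dict branch even when the call returns a DataFrame at runtime. The sketch below uses a hypothetical `load_tables` function (not the project's read_csv_files) to show that pattern, with an `isinstance` assertion that makes the expected type explicit before `.loc` is used. Whether this alone satisfies Pylint has not been verified here.

```python
from __future__ import annotations

import pandas as pd


def load_tables(tables: dict[str, pd.DataFrame], name: str | None = None):
    """Hypothetical reader: one DataFrame if a name is given, else a dict of all tables."""
    if name is None:
        return dict(tables)
    return tables[name]


# Illustrative data only.
tables = {"art_coverage": pd.DataFrame({"Year": [2010, 2011], "value": [0.2, 0.3]})}

result = load_tables(tables, name="art_coverage")
# State and enforce the expected return type before using DataFrame-only attributes.
assert isinstance(result, pd.DataFrame)
value_2011 = result.loc[result.Year == 2011, "value"].iloc[0]
```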

@mnjowe mnjowe self-assigned this Oct 10, 2024
@mnjowe mnjowe added the epi label Oct 10, 2024
@tdm32 (Collaborator) commented Dec 3, 2024

Hi @mnjowe, thanks for converting all of these files. All sheets are present. Just one query: ResourceFile_schisto.xlsx is soon to be replaced with a newer version containing many more sheets. When I do the PR for those changes, should I follow your logic here and separate each sheet into a CSV file?

@mnjowe (Collaborator, Author) commented Dec 3, 2024

Hi @tdm32. Yes, you can follow the same logic and use read_csv_files to read the newly converted CSV files. If you only need to read a single CSV file, using pd.read_csv directly is preferable.

A quick way to convert Excel files to CSV files (put it in a test file and run it as a test):

from pathlib import Path

from tlo.util import convert_excel_files_to_csv  # import path assumed; adjust if the helper lives elsewhere


def test_convert_files():
    # path to the folder containing the resource file(s)
    resourcefilepath = Path(__file__).parent / '../resources'
    # name(s) of the resource file(s) you want to convert
    excel_files = ['excel_file_name.xlsx']
    # convert to CSV; set the optional delete_excel_files argument to True to delete the Excel file afterwards
    convert_excel_files_to_csv(resourcefilepath, files=excel_files, delete_excel_files=True)
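
The distinction drawn in this reply, between reading a folder of per-sheet CSV files and reading a single file, can be sketched with plain pandas. `read_folder_of_csvs` below is a hypothetical stand-in written for this example, not the project's read_csv_files, and the folder and sheet names are made up.

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_folder_of_csvs(folder: Path) -> dict:
    """Hypothetical stand-in: one DataFrame per CSV file, keyed by file stem."""
    return {p.stem: pd.read_csv(p) for p in sorted(folder.glob("*.csv"))}


with tempfile.TemporaryDirectory() as tmp:
    # One folder per former workbook, one CSV per former sheet (illustrative layout).
    folder = Path(tmp) / "ResourceFile_example"
    folder.mkdir()
    pd.DataFrame({"Year": [2010, 2011], "value": [1, 2]}).to_csv(folder / "sheet_a.csv", index=False)
    pd.DataFrame({"parameter_name": ["p"], "value": [0.5]}).to_csv(folder / "sheet_b.csv", index=False)

    # Many sheets: load the whole folder into a dict of DataFrames.
    all_sheets = read_folder_of_csvs(folder)
    # One sheet: calling pd.read_csv on the file is simpler.
    single_sheet = pd.read_csv(folder / "sheet_a.csv")
```

Keying the dict by file stem keeps the lookup close to the sheet-name lookup used with Excel workbooks.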

@tamuri tamuri merged commit 2e6cb9c into master Dec 17, 2024
62 checks passed
@tamuri tamuri deleted the mnjowe/convert_excel_files_to_csv_tara_modules branch December 17, 2024 11:30
3 participants