Better Handling of 'None' in ARG Drug Table #194

Merged
merged 11 commits into from
Aug 16, 2023
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,5 +1,6 @@
 # Unreleased

+* Fixed an issue where the string "None" in the drug table would be parsed differently by different versions of pandas (#175).
 * Upgraded to pandas version 2.
 * Added CGE-predicted phenotypes to Pointfinder output.
 * The resfinder.tsv and pointfinder.tsv outputs now contain a Notes column.
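
The changelog entry about "None" can be reproduced with a small, self-contained sketch. The two-row table and its column names (`gene`, `drug`) are hypothetical stand-ins, not the project's actual drug table. The default parse is printed without a claimed result because it depends on the installed pandas version; the explicit `na_values="None"` parse should mark the cell as missing either way.

```python
import io

import pandas as pd

# Hypothetical stand-in for the drug table; column names are assumptions.
tsv = "gene\tdrug\nparC\tNone\ngyrA\tciprofloxacin,nalidixic acid\n"

# Default parsing: whether "None" becomes NaN depends on the pandas version.
default = pd.read_csv(io.StringIO(tsv), sep="\t", dtype=str)

# Explicit parsing: "None" is added to the set of NA strings,
# so the cell is read as missing regardless of the pandas version.
explicit = pd.read_csv(io.StringIO(tsv), sep="\t", dtype=str, na_values="None")

print(pd.__version__, default["drug"].tolist())
print(explicit["drug"].isna().tolist())
```

Passing `na_values` only adds to the default NA strings (as long as `keep_default_na` is left at its default), so on newer pandas versions the argument is redundant but harmless.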
12 changes: 10 additions & 2 deletions staramr/databases/resistance/ARGDrugTable.py
@@ -24,7 +24,9 @@ def __init__(self, file=None, info_file=DEFAULT_INFO_FILE):
         self._file = file

         if file is not None:
-            self._data = pd.read_csv(file, sep='\t', dtype=self.DTYPES)
+            # "None" is recognized as a NA/NaN string in pandas 2.
+            # However, in pandas < 2, "None" is not a default NA value, so we must be explicit.
+            self._data = pd.read_csv(file, sep='\t', dtype=self.DTYPES, na_values="None")

     def get_resistance_table_info(self):
         """
@@ -41,4 +43,10 @@ def _drug_string_to_correct_separators(self, drug):
         :param drug: The drug string.
         :return: The drug string with correct separators/spacing.
         """
-        return ', '.join(drug.split(','))
+
+        if type(drug) is str:
+            result = ', '.join(drug.split(','))
+        else:
+            result = drug
+
+        return result
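
To illustrate why the type guard matters once "None" is read as a missing value, here is a minimal standalone sketch. The free function `drug_string_to_correct_separators` is a hypothetical copy of the method body for illustration, not the class method itself, and `math.nan` stands in for the float NaN that pandas places in a missing cell.

```python
import math


def drug_string_to_correct_separators(drug):
    """Standalone copy of the guarded helper, for illustration only."""
    if type(drug) is str:
        return ', '.join(drug.split(','))
    # Non-string values (such as NaN for a missing drug) pass through unchanged.
    return drug


missing = math.nan  # stand-in for a missing cell in the drug table

# The unguarded version calls .split() on a float, which raises AttributeError.
try:
    ', '.join(missing.split(','))
    unguarded_error = None
except AttributeError as error:
    unguarded_error = error

formatted = drug_string_to_correct_separators("ciprofloxacin,nalidixic acid")
passed_through = drug_string_to_correct_separators(missing)

print(repr(unguarded_error))
print(formatted)
print(passed_through)
```

The guard returns the missing value as-is, which leaves it to the caller to check for it (for example with `pandas.isna`).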
20 changes: 20 additions & 0 deletions staramr/tests/unit/results/test_ARGDrugTablePointfinder.py
@@ -0,0 +1,20 @@
+import logging
+import unittest
+import pandas
+
+from staramr.databases.resistance.pointfinder.ARGDrugTablePointfinder import ARGDrugTablePointfinder
+
+logger = logging.getLogger('ARGDrugTablePointfinderTest')
+
+
+class ARGDrugTablePointfinderTest(unittest.TestCase):
+
+    def setUp(self):
+        self.arg_drug_table = ARGDrugTablePointfinder()
+
+    def testNoneEntry(self):
+        # Tests when the entry for the drug is "None"
+        drug = self.arg_drug_table.get_drug("escherichia_coli", "parC", 57)
+
+        # Specifically, we're interested in it not crashing.
+        self.assertTrue(pandas.isna(drug))