feat(struc): database modularization and code improvement #137
Could you please add some docstrings to the code?
pysus/ftp/__init__.py
@@ -26,8 +26,16 @@ class File:
    def __init__(
        self, path: str, name: str, size: int, date: datetime.datetime
    ) -> None:
        try:
            name, extension = name.split(".")
Maybe it is more robust to use os.path.splitext(name) here.
Nice, that's exactly what I needed. Thanks!
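A minimal sketch of the difference discussed in this thread. The helper name `split_name` and the second and third file names are hypothetical, chosen only to show where the two approaches diverge.

```python
import os.path


def split_name(filename: str) -> tuple:
    """Split a file name into (stem, extension) with os.path.splitext.

    The extension keeps its leading dot, and a name without a dot
    yields an empty extension instead of raising.
    """
    stem, extension = os.path.splitext(filename)
    return stem, extension


print(split_name("ACBIBR06.dbc"))   # ('ACBIBR06', '.dbc')
print(split_name("README"))         # ('README', '')
print(split_name("backup.tar.gz"))  # ('backup.tar', '.gz')

# The tuple-unpacking form from the diff raises on the last two names,
# because split(".") returns one or three parts instead of two.
for bad in ("README", "backup.tar.gz"):
    try:
        name, extension = bad.split(".")
    except ValueError as exc:
        print(f"{bad!r}: {exc}")
```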
pysus/ftp/__init__.py
name, extension = name.split(".")
self.name = name
self.extension = extension
self.basename = (".").join([name, extension])
You can use os.path.join here.
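One caveat worth checking before applying this suggestion: `os.path.join` joins path components with the platform separator, not with a dot. The sketch below, using a file name from this PR, shows both behaviours. It is an illustration, not the code that was merged.

```python
import os

name, extension = os.path.splitext("ACBIBR06.dbc")

# splitext keeps the leading dot, so plain concatenation restores the name.
basename = name + extension
print(basename)  # ACBIBR06.dbc

# os.path.join treats its arguments as path components and places the
# platform separator between them.
joined = os.path.join("ACBIBR06", "dbc")
print(joined == "ACBIBR06" + os.sep + "dbc")  # True
```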
if file.name.startswith("SRC"):
    dis_code = file.name[:3]
elif file.name == "LEIBR22":
    dis_code = "LEIV"  # MISPELLED FILE NAME
This is not really a misspelling: the V corresponds to Visceral Leishmaniasis. The name still does not follow the pattern of the other file names, so it is OK to add this special treatment here.
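A sketch of the special-casing discussed here, written as a standalone function. Only the two branches shown in the diff come from the PR. The function name `disease_code`, the four-character default, and the example name `SRCBR07` are assumptions for illustration.

```python
def disease_code(name: str) -> str:
    """Return the disease code for a SINAN file name without its extension.

    The "SRC" and "LEIBR22" branches mirror the diff; the four-character
    default is an assumption based on names such as "ZIKABR17".
    """
    if name.startswith("SRC"):
        return name[:3]  # three-letter code
    if name == "LEIBR22":
        return "LEIV"  # this name omits the V of the LEIV code
    return name[:4]  # assumed default: four-letter code


print(disease_code("LEIBR22"))   # LEIV
print(disease_code("ZIKABR17"))  # ZIKA
print(disease_code("SRCBR07"))   # SRC
```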
ER="AIH Rejeitada com erro",
SP="Serviços Profissionais",
CH="Cadastro Hospitalar",
CM="",  # TODO
@fccoelho do you know, by any chance, what CM means here? I couldn't find any reference to it. My guess is Cadastro Municipal, but I'm not sure.
No idea. I have checked the documentation on the FTP server, but there is no mention of it. Leave it blank for now.
Force-pushed from efe1695 to a63de45
@fccoelho I've added a 5-second limit to every test because they were taking too long and stopping the CI before the end of the run. In the future they should be mocked or split into smaller tests. I've also started splitting the dependencies using extras, so now the preprocessing methods would have to install
Force-pushed from 2cba0b9 to 92f0ad2
Force-pushed from 490b14a to b4756e4
Force-pushed from b4756e4 to e5d2df2
Force-pushed from c2ceea0 to 73a6239
In [1]: db = SINAN()
In [2]: len(db.files)
Out[2]: 757
In [3]: !ls -la ~/pysus
total 8
drwxrwxr-x 2 bida bida 4096 set 2 13:39 .
drwxr-x--- 58 bida bida 4096 set 2 13:39 ..
In [4]: file = db.files[0]
In [5]: db.describe(file)
Out[5]:
{'name': 'ACBIBR06.dbc',
'disease': 'Acidente de trabalho com material biológico',
'year': 2006,
'size': '28.3 kB',
'last_update': '01-16-2023 02:15PM'}
In [6]: file.download()
Out[6]: '/home/bida/pysus/ACBIBR06.dbc'
In [7]: file = db.files[1]
In [8]: await file.async_download()
In [9]: !ls -la ~/pysus
total 696
drwxrwxr-x 2 bida bida 4096 set 2 13:41 .
drwxr-x--- 58 bida bida 4096 set 2 13:39 ..
-rw-rw-r-- 1 bida bida 28326 set 2 13:40 ACBIBR06.dbc
-rw-rw-r-- 1 bida bida 673314 set 2 13:41 ACBIBR07.dbc
In [10]: db.get_files(dis_codes=["ZIKA", "DENG"])
Out[10]:
[DENGBR00.dbc,
DENGBR01.dbc,
DENGBR02.dbc,
DENGBR03.dbc,
DENGBR04.dbc,
DENGBR05.dbc,
DENGBR06.dbc,
DENGBR07.dbc,
DENGBR08.dbc,
DENGBR09.dbc,
DENGBR10.dbc,
DENGBR11.dbc,
DENGBR12.dbc,
DENGBR13.dbc,
DENGBR14.dbc,
DENGBR15.dbc,
DENGBR16.dbc,
DENGBR17.dbc,
DENGBR18.dbc,
DENGBR19.dbc,
DENGBR20.dbc,
DENGBR21.dbc,
ZIKABR16.dbc,
ZIKABR17.dbc,
ZIKABR18.dbc,
ZIKABR19.dbc,
ZIKABR20.dbc,
ZIKABR21.dbc,
ZIKABR22.dbc,
DENGBR22.dbc,
ZIKABR23.dbc]
In [11]: db.get_files(dis_codes=["ZIKA", "DENG"], years=2017)
Out[11]: [DENGBR17.dbc, ZIKABR17.dbc]
In [12]: file = db.get_files(dis_codes=["ZIKA", "DENG"], years=2017)[1]
In [13]: file
Out[13]: ZIKABR17.dbc
In [14]: file.download()
Out[14]: '/home/bida/pysus/ZIKABR17.dbc'
In [15]: file.info
Out[15]:
{'size': '646938',
'type': 'file',
'modify': datetime.datetime(2021, 11, 23, 18, 3)}
In [16]: db.format(file)
Out[16]: ('ZIKA', '17')
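The session above filters files by disease code and year. A rough sketch of how such a filter could work on plain file names follows. The helpers `parse_name` and `select` are hypothetical and are not the PR's `format` or `get_files` implementations. They assume the layout CODE + "BR" + two-digit year seen in the listing.

```python
from typing import Iterable, List, Tuple


def parse_name(filename: str) -> Tuple[str, str]:
    """Split a name such as 'ZIKABR17.dbc' into ('ZIKA', '17').

    Assumes the layout CODE + 'BR' + two-digit year.
    """
    stem = filename.rsplit(".", 1)[0]
    return stem[:-4], stem[-2:]


def select(filenames: Iterable[str], dis_codes: Iterable[str], year: int) -> List[str]:
    """Keep the names whose code is in dis_codes and whose year matches."""
    codes = set(dis_codes)
    yy = f"{year % 100:02d}"  # 2017 -> '17'
    selected = []
    for name in filenames:
        code, file_yy = parse_name(name)
        if code in codes and file_yy == yy:
            selected.append(name)
    return sorted(selected)


names = ["DENGBR16.dbc", "ZIKABR17.dbc", "DENGBR17.dbc", "ACBIBR06.dbc"]
print(parse_name("ZIKABR17.dbc"))             # ('ZIKA', '17')
print(select(names, ["ZIKA", "DENG"], 2017))  # ['DENGBR17.dbc', 'ZIKABR17.dbc']
```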
Force-pushed from 15b1d8b to 436dc2b
Force-pushed from 436dc2b to 54949a7
❯ ipython -i pysus/online_data/SINAN.py
In [1]: download("ZIKA", "2023")
Out[1]: ['/home/bida/pysus/ZIKABR23.dbc']
In [2]: download("ZIKA", ["2023", "22"])
Out[2]: ['/home/bida/pysus/ZIKABR22.dbc', '/home/bida/pysus/ZIKABR23.dbc']
In [3]: download("ZIKA", ["2023", "22", 21])
Out[3]:
['/home/bida/pysus/ZIKABR21.dbc',
'/home/bida/pysus/ZIKABR22.dbc',
'/home/bida/pysus/ZIKABR23.dbc']
In [4]: download(["ZIKA", "CHIK"], ["2023", "22", 21])
Out[4]:
['/home/bida/pysus/CHIKBR21.dbc',
'/home/bida/pysus/CHIKBR22.dbc',
'/home/bida/pysus/ZIKABR21.dbc',
'/home/bida/pysus/ZIKABR22.dbc',
'/home/bida/pysus/CHIKBR23.dbc',
'/home/bida/pysus/ZIKABR23.dbc']
In [5]: get_available_years("DENG")
Out[5]:
[DENGBR00.dbc,
DENGBR01.dbc,
DENGBR02.dbc,
DENGBR03.dbc,
DENGBR04.dbc,
DENGBR05.dbc,
DENGBR06.dbc,
DENGBR07.dbc,
DENGBR08.dbc,
DENGBR09.dbc,
DENGBR10.dbc,
DENGBR11.dbc,
DENGBR12.dbc,
DENGBR13.dbc,
DENGBR14.dbc,
DENGBR15.dbc,
DENGBR16.dbc,
DENGBR17.dbc,
DENGBR18.dbc,
DENGBR19.dbc,
DENGBR20.dbc,
DENGBR21.dbc,
DENGBR22.dbc]
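The `download` calls in this session accept years as strings or integers, with two or four digits, alone or in a list. A minimal sketch of one way to normalize such input to the two-digit form used in SINAN file names is shown below. The helper `two_digit_years` is hypothetical and is not taken from the PR.

```python
from typing import Iterable, List, Union

Year = Union[int, str]


def two_digit_years(years: Union[Year, Iterable[Year]]) -> List[str]:
    """Normalize a year or a list of years to sorted, unique 'YY' strings."""
    if isinstance(years, (int, str)):
        years = [years]  # accept a single value as well as a list
    # Keep the last two characters and left-pad single digits with a zero.
    return sorted({str(year)[-2:].zfill(2) for year in years})


print(two_digit_years(["2023", "22", 21]))  # ['21', '22', '23']
print(two_digit_years("2023"))              # ['23']
print(two_digit_years(5))                   # ['05']
```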
❯ ipython -i pysus/online_data/SIM.py
In [1]: download("SP", 2020)
Out[1]: ['/home/bida/pysus/DOSP2020.dbc']
In [2]: download("MS", 2015)
Out[2]: ['/home/bida/pysus/DOMS2015.dbc']
In [3]: download(["SC", "PE"], [2015, "04"])
Out[3]:
['/home/bida/pysus/DOPE2004.dbc',
'/home/bida/pysus/DOPE2015.dbc',
'/home/bida/pysus/DOSC2004.dbc',
'/home/bida/pysus/DOSC2015.dbc']
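The SIM file names above use a four-digit year, while the calls also accept two-digit input such as "04". A sketch of one possible expansion follows. The helpers `four_digit_year` and `sim_names` and the pivot of 50 for choosing the century are assumptions, not the PR's implementation.

```python
from typing import Iterable, List, Union

Year = Union[int, str]


def four_digit_year(year: Year, pivot: int = 50) -> int:
    """Expand a two-digit year to four digits; pass four-digit years through.

    Values below the pivot map to 20YY, the rest to 19YY. The pivot of 50
    is an arbitrary assumption.
    """
    value = int(year)
    if value >= 100:
        return value
    return 2000 + value if value < pivot else 1900 + value


def sim_names(states: Iterable[str], years: Iterable[Year]) -> List[str]:
    """Build names following the 'DO' + UF + YYYY pattern from the session."""
    year_list = list(years)  # materialize so it can be reused per state
    return sorted(
        f"DO{uf}{four_digit_year(year)}.dbc" for uf in states for year in year_list
    )


print(sim_names(["SC", "PE"], [2015, "04"]))
# ['DOPE2004.dbc', 'DOPE2015.dbc', 'DOSC2004.dbc', 'DOSC2015.dbc']
```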
Force-pushed from 94266b5 to ff28b85
In [1]: get_available_years("AC")
Out[1]:
[DNRAC1994.dbc,
DNRAC94.DBC,
DNRAC95.DBC,
DNRAC1995.dbc,
DNAC1996.DBC,
DNAC1997.DBC,
DNAC1998.DBC,
DNAC1999.DBC,
DNAC2000.DBC,
DNAC2001.DBC,
DNAC2002.DBC,
DNAC2003.DBC,
DNAC2004.DBC,
DNAC2005.DBC,
DNAC2006.DBC,
DNAC2007.DBC,
DNAC2008.DBC,
DNAC2009.DBC,
DNAC2010.DBC,
DNAC2011.DBC,
DNAC2012.DBC,
DNAC2013.dbc,
DNAC2014.dbc,
DNAC2015.dbc,
DNAC2016.dbc,
DNAC2017.dbc,
DNAC2018.dbc,
DNAC2019.dbc,
DNAC2020.dbc,
DNAC2021.dbc]
❯ ipython -i pysus/online_data/SIH.py
In [1]: sih.groups
Out[1]:
{'RD': 'AIH Reduzida',
'RJ': 'AIH Rejeitada',
'ER': 'AIH Rejeitada com erro',
'SP': 'Serviços Profissionais',
'CH': 'Cadastro Hospitalar',
'CM': ''}
In [2]: download(["SC", "PE"], [2015, "04"], [1,2,3], ["RD", "RJ"])
Out[2]:
['/home/bida/pysus/RDPE0104.dbc',
'/home/bida/pysus/RDPE0204.dbc',
'/home/bida/pysus/RDPE0304.dbc',
'/home/bida/pysus/RDSC0104.dbc',
'/home/bida/pysus/RDSC0204.dbc',
'/home/bida/pysus/RDSC0304.dbc']
In [1]: sia.groups
Out[1]:
{'AB': 'Laudo de Acompanhamento a Cirurgia Bariátrica',
'ABO': 'Acompanhamento Pós Cirurgia Bariátrica',
'ACF': 'Confeção de Fístula Arteriovenosa',
'AD': 'Laudos Diversos',
'AM': 'Laudo de Medicamentos',
'AMP': '',
'AN': 'Laudo de Nefrologia',
'AQ': 'Laudo de Quimioterapia',
'AR': 'Laudo de Radioterapia',
'ATD': 'Tratamento Dialítico',
'BI': 'Boletim Individual',
'IMPBO': '',
'PA': 'Produção Ambulatorial',
'PAM': '',
'PAR': '',
'PAS': '',
'PS': 'Psicossocial',
'SAD': 'Atenção Domiciliar'}
In [2]: download(["AM", "RJ"], [2001, "04"], [1,2,3], ["aq", "PA", "BI"])
Out[2]: ['/home/bida/pysus/PAAM0304.dbc', '/home/bida/pysus/PARJ0304.dbc']
In [3]: download(["AM", "RJ"], [2001, "04", 2010], [1,2,3], ["aq", "PA", "BI"])
Out[3]:
['/home/bida/pysus/PAAM0110.dbc',
'/home/bida/pysus/PAAM0210.dbc',
'/home/bida/pysus/PAAM0304.dbc',
'/home/bida/pysus/PARJ0110.dbc',
'/home/bida/pysus/PARJ0210.dbc',
'/home/bida/pysus/PARJ0304.dbc',
'/home/bida/pysus/PARJ0310.dbc']
Force-pushed from 9241650 to 6f9f999
🎉 This PR is included in version 0.10.0 🎉 The release is available on:
Your semantic-release bot 📦🚀
TODOs:
Base modules
Databases
Functions:
Docs:
Tests:
Listing files & downloading
Parsing