CHC Signal #320

Merged
merged 7 commits into from
Oct 21, 2020
73 changes: 73 additions & 0 deletions changehc/README.md
@@ -0,0 +1,73 @@
# Change Healthcare Indicator

COVID-19 indicator using outpatient visits from Change Healthcare claims data.
Reads claims data into pandas dataframe.
Makes appropriate date shifts, adjusts for backfilling, and smooths estimates.
Writes results to csvs.


## Running the Indicator

The indicator is run by directly executing the Python module contained in this
directory. The safest way to do this is to create a virtual environment,
install the common DELPHI tools, and then install the module and its
dependencies. To do this, run the following code from this directory:

```
python -m venv env
source env/bin/activate
pip install ../_delphi_utils_python/.
pip install .
```

*Note*: you may need to install BLAS. On Ubuntu, run:
```
sudo apt-get install libatlas-base-dev gfortran
```

All of the user-changeable parameters are stored in `params.json`. To execute
the module and produce the output datasets (by default, in `receiving`), run
the following:

```
env/bin/python -m delphi_changehc
```
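
Before running the module, it can help to inspect what `params.json` currently contains. The following is a minimal sketch using only the standard library; the helper names `load_params` and `describe_params` are hypothetical and are not part of this module, and the keys in any given `params.json` depend on the template checked into the repository.

```python
import json
from pathlib import Path


def load_params(path="params.json"):
    """Read a JSON parameter file and return its contents as a dict.

    Hypothetical helper for illustration; not part of delphi_changehc.
    """
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def describe_params(params):
    """Return one 'key = value' line per top-level parameter, sorted by key."""
    return [f"{key} = {params[key]!r}" for key in sorted(params)]
```

Calling `describe_params(load_params())` from this directory would list the top-level settings, which is a quick way to confirm where output will be written before a run.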

Once you are finished with the code, you can deactivate the virtual environment
and (optionally) remove the environment itself.

```
deactivate
rm -r env
```

## Testing the code

To do a static test of the code style, it is recommended to run **pylint** on
the module. To do this, run the following from the main module directory:

```
env/bin/pylint delphi_changehc
```

The most aggressive checks are turned off; only relatively important issues
should be raised and they should be manually checked (or better, fixed).

Unit tests are also included in the module. To execute these, run the following
command from this directory:

```
(cd tests && ../env/bin/pytest --cov=delphi_changehc --cov-report=term-missing)
```

The output will show the number of unit tests that passed and failed, along
with the percentage of code covered by the tests. None of the tests should
fail, and the number of code lines not covered by unit tests should be small
and should not include critical sub-routines.

## Code tour

- update_sensor.py: ChangeHCSensorUpdator: reads the data, makes transformations, and writes results to CSVs
- sensor.py: ChangeHCSensor: methods for transforming data, including backfill and smoothing
- load_data.py: methods for loading claims and EHR data
- geo_maps.py: geo reindexing
39 changes: 39 additions & 0 deletions changehc/REVIEW.md
@@ -0,0 +1,39 @@
## Code Review (Python)

A code review of this module should include a careful look at the code and the
output. To assist in the process, but certainly not in place of it, please
check the following items.

**Documentation**

- [ ] the README.md file template is filled out and currently accurate; it is
possible to load and test the code using only the instructions given
- [ ] minimal docstrings (one line describing what the function does) are
included for all functions; full docstrings describing the inputs and expected
outputs should be given for non-trivial functions

**Structure**

- [ ] code should use 4 spaces for indentation; other style decisions are
flexible, but be consistent within a module
- [ ] any required metadata files are checked into the repository and placed
within the directory `static`
- [ ] any intermediate files that are created and stored by the module should
be placed in the directory `cache`
- [ ] final expected output files to be uploaded to the API are placed in the
`receiving` directory; output files should not be committed to the repository
- [ ] all options and API keys are passed through the file `params.json`
- [ ] template parameter file (`params.json.template`) is checked into the
code; no personal (i.e., usernames) or private (i.e., API keys) information is
included in this template file

**Testing**

- [ ] module can be installed in a new virtual environment
- [ ] pylint with the default `.pylint` settings run over the module produces
minimal warnings; warnings that do exist have been confirmed as false positives
- [ ] reasonably high level of unit test coverage covering all of the main logic
of the code (e.g., missing coverage for raised errors that do not currently seem
possible to reach is okay; missing coverage for options that will be needed is
not)
- [ ] all unit tests run without errors
Empty file added changehc/cache/.gitignore
Empty file.
20 changes: 20 additions & 0 deletions changehc/delphi_changehc/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# -*- coding: utf-8 -*-
"""Module to pull and clean indicators from the CHC source.

This file defines the functions that are made public by the module. As the
module is intended to be executed through the main method, these are primarily
for testing.
"""

from __future__ import absolute_import

from . import config
from . import geo_maps
from . import load_data
from . import run
from . import sensor
from . import smooth
from . import update_sensor
from . import weekday

__version__ = "0.0.0"
11 changes: 11 additions & 0 deletions changehc/delphi_changehc/__main__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
# -*- coding: utf-8 -*-
"""Call the function run_module when executed.

This file indicates that calling the module (`python -m MODULE_NAME`) will
call the function `run_module` found within the run.py file. There should be
no need to change this template.
"""

from .run import run_module # pragma: no cover

run_module() # pragma: no cover
57 changes: 57 additions & 0 deletions changehc/delphi_changehc/config.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
"""
This file contains configuration variables used to generate the CHC signal.

Author: Aaron Rumack
Created: 2020-10-14
"""

from datetime import datetime, timedelta
import numpy as np


class Config:
    """Static configuration variables.
    """

    ## dates
    FIRST_DATA_DATE = datetime(2020, 1, 1)

    # number of days training needs to produce estimate
    # (one day needed for smoother to produce values)
    BURN_IN_PERIOD = timedelta(days=1)

    # shift dates forward for labeling purposes
    DAY_SHIFT = timedelta(days=1)

    ## data columns
    COVID_COL = "COVID"
    DENOM_COL = "Denominator"
    COUNT_COLS = ["COVID"] + ["Denominator"]
    DATE_COL = "date"
    GEO_COL = "fips"
    ID_COLS = [DATE_COL] + [GEO_COL]
    FILT_COLS = ID_COLS + COUNT_COLS
    DENOM_COLS = [GEO_COL, DATE_COL, DENOM_COL]
    COVID_COLS = [GEO_COL, DATE_COL, COVID_COL]
    DENOM_DTYPES = {"date": str, "Denominator": str, "fips": str}
    COVID_DTYPES = {"date": str, "COVID": str, "fips": str}

    SMOOTHER_BANDWIDTH = 100  # bandwidth for the linear left Gaussian filter
    MIN_DEN = 100  # number of total visits needed to produce a sensor
    MAX_BACKFILL_WINDOW = (
        7  # maximum number of days used to average a backfill correction
    )
    MIN_CUM_VISITS = 500  # need to observe at least 500 counts before averaging
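
The comments on `MAX_BACKFILL_WINDOW` and `MIN_CUM_VISITS` suggest that sparse days are averaged with preceding days until enough visits have been observed. The sketch below is one possible reading of those two parameters, not the module's actual backfill code (which lives in `sensor.py` and is not shown in this diff). The function name `pool_recent_days` and its exact stopping rule are assumptions made for illustration.

```python
# Values copied from Config; everything else here is a hypothetical sketch.
MAX_BACKFILL_WINDOW = 7
MIN_CUM_VISITS = 500


def pool_recent_days(num, den, k=MAX_BACKFILL_WINDOW, min_visits=MIN_CUM_VISITS):
    """Average each day's counts with preceding days until enough visits are seen.

    For day i, earlier days are added one at a time (most recent first) until
    the pooled denominator reaches `min_visits`, `k` days have been used, or
    the start of the series is reached. Returns a list of
    (mean numerator, mean denominator) pairs, one per day.
    """
    if len(num) != len(den):
        raise ValueError("num and den must have the same length")
    pooled = []
    for i in range(len(den)):
        n_sum, d_sum, used = num[i], den[i], 1
        j = i - 1
        while d_sum < min_visits and used < k and j >= 0:
            n_sum += num[j]
            d_sum += den[j]
            used += 1
            j -= 1
        pooled.append((n_sum / used, d_sum / used))
    return pooled
```

Under this reading, a day that already has at least `MIN_CUM_VISITS` visits is left unchanged, while a sparse day borrows from at most `MAX_BACKFILL_WINDOW - 1` earlier days.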


class Constants:
    # number of counties in usa, including megacounties
Contributor commented:

can we add a docstring here on what these constants are for, especially since there's also a constants.py file?

chinandrew (Contributor) commented on Oct 20, 2020:

this + linter and should be good to go. EDIT: there's also a few more instances of single vs double quoting, should be a relatively quick search and replace

    NUM_COUNTIES = 3141 + 52
    NUM_HRRS = 308
    NUM_MSAS = 392 + 52  # MSA + States
    NUM_STATES = 52  # including DC and PR

    MAX_GEO = {"county": NUM_COUNTIES,
               "hrr": NUM_HRRS,
               "msa": NUM_MSAS,
               "state": NUM_STATES}
7 changes: 7 additions & 0 deletions changehc/delphi_changehc/constants.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
"""Registry for signal names and geo types"""
SMOOTHED = "smoothed_chc"
SMOOTHED_ADJ = "smoothed_adj_chc"
SIGNALS = [SMOOTHED, SMOOTHED_ADJ]
NA = "NA"
HRR = "hrr"
FIPS = "fips"