A framework for working with the PhysioNet 2019 sepsis dataset and making predictions to be scored against the pre-defined utility function.
My suggestion is to set up a virtual environment folder at /env/ in the root directory of this project. Install the requirements with:
pip install -r requirements.txt
Create a data/ folder in the project directory (make it a symlink if you wish to store the data somewhere else), then create a data/raw folder inside it.
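If you would rather script the folder setup, something like the following should work. This is only a sketch: the helper name ensure_data_dirs and its external_store argument are made up for illustration and are not part of this project.

```python
from pathlib import Path
from typing import Optional


def ensure_data_dirs(project_root: Path, external_store: Optional[Path] = None) -> Path:
    """Create data/raw under project_root and return its path.

    If external_store is given and data/ does not exist yet, data/ is created
    as a symlink pointing at external_store (hypothetical convenience option).
    """
    data_dir = project_root / "data"
    if external_store is not None and not data_dir.exists() and not data_dir.is_symlink():
        # Make sure the symlink target exists before linking to it.
        external_store.mkdir(parents=True, exist_ok=True)
        data_dir.symlink_to(external_store.resolve(), target_is_directory=True)
    raw_dir = data_dir / "raw"
    # Creates data/ as a plain folder too if it was not symlinked above.
    raw_dir.mkdir(parents=True, exist_ok=True)
    return raw_dir
```

Calling ensure_data_dirs(Path(".")) from the project root creates a plain data/raw folder. Passing a second path makes data/ a symlink to that location first.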
- Run src/data/download.py to download the raw data into data/raw.
- Run src/data/make_frame.py to convert the data into a dataframe format useful for visualisation in notebooks.
- Run src/data/preprocess.py to perform various preprocessing steps (which you may want to change if running your own models), including simple feature derivation and converting the ragged data into a NaN-filled tensor.
- Run src/models/predict_model.py to run a simple model and get cross-validated scores.
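To give an idea of what the ragged-to-tensor conversion in the preprocessing step means, here is a minimal NumPy sketch. It is not the code in src/data/preprocess.py. The function name pad_ragged and the assumption that each patient is an array of shape (time steps, features) are mine.

```python
import numpy as np


def pad_ragged(sequences):
    """Stack variable-length (T_i, F) arrays into one (N, T_max, F) array.

    Time steps beyond the end of a shorter sequence are filled with NaN.
    Assumes every sequence has the same number of features F.
    """
    n_features = sequences[0].shape[1]
    max_len = max(len(seq) for seq in sequences)
    out = np.full((len(sequences), max_len, n_features), np.nan, dtype=float)
    for i, seq in enumerate(sequences):
        # Copy the observed time steps; the remainder stays NaN.
        out[i, : len(seq)] = seq
    return out


# Two toy "patients" with 3 and 1 hourly records of 2 features each.
patients = [np.arange(6, dtype=float).reshape(3, 2), np.ones((1, 2))]
batch = pad_ragged(patients)
print(batch.shape)  # (2, 3, 2)
```

Padding with NaN rather than zero keeps the padded positions distinguishable from real measurements, so a model or a masking step can ignore them later.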