This repo contains the experimental code for the paper "Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models"
In brief the method is:
- Convert a set of guidelines from the Royal College of Physcians London (RCP) into a json schema
- Convert an example also provided by RCP into a 1 shot prompt
- De-duplicate a set of physician notes (MIMIC-III used for experiments)
- Use the above as a prompt to an LLM (GPT-4-turbo)
Requirements
poetry install
wget -r -N -c -np --user simonellershawucl --ask-password -P ./mimic_experiments/inputs/ https://physionet.org/files/mimiciii/1.4/NOTEEVENTS.csv.gz
gzip -d mimic_experiments/inputs/physionet.org/files/mimiciii/1.4/NOTEEVENTS.csv.gz
- Deployment steps are set out here
- To recreate results must be "gpt-4" version "1106-Preview"
- IMPORTANT: Turn off content filter to follow MIMIC's terms of use
Note gpt-4-turbo is only available in certain regions (see docs)
Changing out the values for your personal credentials
echo "AZURE_OPENAI_KEY_1 = <YOUR_AZURE_OPENAI_KEY>" >> .env
echo "AZURE_OPENAI_ENDPOINT_1 = <YOUR_AZURE_OPENAI_ENDPOINT>" >> .env
As all experiments used MIMIC-III we cannot distribute the produced summaries and evaluation (conducted by a team of clinicians).
But the notebooks (from 1-4) in mimic_experiments/
allows for recreation of all the discharge summaries evaluated in the paper including as an excel for human annotations.
Also the code used to generate the metrics (notebook 5) is given for transparency however cannot be run without access to the clinical annotation. If this is of interest and you are a credentialed MIMIC user please reach out.
- Simple Streamlit demo
- Some code decisions were suboptimal but are entrenched for reproducbility (e.g. dealing with empty json values when creating annotator excels rather than when saving to json)
If using any of the code or ideas in this repo please cite us!
@inproceedings{ellershaw2024automated,
title={Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models},
author={Ellershaw, Simon and Tomlinson, Christopher and Burton, Oliver E and Frost, Thomas and Hanrahan, John Gerrard and Khan, Danyal Zaman and Horsfall, Hugo Layard and Little, Mollie and Malgapo, Evaleen and Starup-Hansen, Joachim and others},
booktitle={AAAI 2024 Spring Symposium on Clinical Foundation Models},
year={2024}
}
Please contact simon.ellershaw.20@ucl.ac.uk with any questions