Skip to content

Code for the paper "Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models"

License

Notifications You must be signed in to change notification settings

simonEllershaw/llm-discharge-summaries

Repository files navigation

LLM Discharge Summaries

This repo contains the experimental code for the paper "Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models"

Method Diagrame

In brief the method is:

  1. Convert a set of guidelines from the Royal College of Physcians London (RCP) into a json schema
  2. Convert an example also provided by RCP into a 1 shot prompt
  3. De-duplicate a set of physician notes (MIMIC-III used for experiments)
  4. Use the above as a prompt to an LLM (GPT-4-turbo)

Setup

Requirements

  1. Installed poetry and to
  2. Approval to access MIMIC-III
  3. Ability to deploy Azure OpenAI models

Install required packages

poetry install

Download and unzip MIMIC-III notes

wget -r -N -c -np --user simonellershawucl --ask-password -P ./mimic_experiments/inputs/ https://physionet.org/files/mimiciii/1.4/NOTEEVENTS.csv.gz
gzip -d mimic_experiments/inputs/physionet.org/files/mimiciii/1.4/NOTEEVENTS.csv.gz

Deploy GPT-4-turbo through Azure

  1. Deployment steps are set out here
  2. To recreate results must be "gpt-4" version "1106-Preview"
  3. IMPORTANT: Turn off content filter to follow MIMIC's terms of use

Note gpt-4-turbo is only available in certain regions (see docs)

Setup Azure OpenAI Credentials

Changing out the values for your personal credentials

echo "AZURE_OPENAI_KEY_1 = <YOUR_AZURE_OPENAI_KEY>" >> .env

echo "AZURE_OPENAI_ENDPOINT_1 = <YOUR_AZURE_OPENAI_ENDPOINT>" >> .env

Running

As all experiments used MIMIC-III we cannot distribute the produced summaries and evaluation (conducted by a team of clinicians).

But the notebooks (from 1-4) in mimic_experiments/ allows for recreation of all the discharge summaries evaluated in the paper including as an excel for human annotations.

Also the code used to generate the metrics (notebook 5) is given for transparency however cannot be run without access to the clinical annotation. If this is of interest and you are a credentialed MIMIC user please reach out.

Future Work

  • Simple Streamlit demo
  • Some code decisions were suboptimal but are entrenched for reproducbility (e.g. dealing with empty json values when creating annotator excels rather than when saving to json)

Citing

If using any of the code or ideas in this repo please cite us!

@inproceedings{ellershaw2024automated,
  title={Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models},
  author={Ellershaw, Simon and Tomlinson, Christopher and Burton, Oliver E and Frost, Thomas and Hanrahan, John Gerrard and Khan, Danyal Zaman and Horsfall, Hugo Layard and Little, Mollie and Malgapo, Evaleen and Starup-Hansen, Joachim and others},
  booktitle={AAAI 2024 Spring Symposium on Clinical Foundation Models},
  year={2024}
}

Contact

Please contact simon.ellershaw.20@ucl.ac.uk with any questions

About

Code for the paper "Automated Generation of Hospital Discharge Summaries Using Clinical Guidelines and Large Language Models"

Resources

License

Stars

Watchers

Forks