Identifying medical conditions in clinical notes can add business value to health care organizations. Pregnancy is one example of a condition that can be identified.
This project used the HuggingFace `transformers` library to fine-tune a large language model to identify pregnancy in clinical notes, and used Gradio to publish an app interface.
Key steps in this project:
- Used the `kaggle` API to download a publicly available dataset of clinical notes
- Augmented the dataset with synthetic data, and used upsampling to address class imbalance
- Used the HuggingFace `transformers` module to fine-tune an existing large language model for the classification task
- Used `sklearn` metrics to evaluate model performance on a test set
- Used the `gradio` module to develop an app to enable testing of use cases
- Published the fine-tuned model and Gradio app on HuggingFace
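
The upsampling step can be illustrated with a minimal sketch. It assumes random upsampling with replacement on a binary label; the project does not state which upsampling method or library it used, so the function name `upsample_minority`, the sample notes, and the labels below are all hypothetical and use only the Python standard library.

```python
import random
from collections import Counter


def upsample_minority(texts, labels, seed=0):
    """Randomly duplicate rows of smaller classes until all classes
    have as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group the notes by their label.
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_label.values())

    out_texts, out_labels = [], []
    for label, rows in by_label.items():
        # Sample with replacement to fill the gap (k is 0 for the largest class).
        extra = rng.choices(rows, k=target - len(rows))
        for text in rows + extra:
            out_texts.append(text)
            out_labels.append(label)

    # Shuffle so that the classes are interleaved.
    order = list(range(len(out_texts)))
    rng.shuffle(order)
    return [out_texts[i] for i in order], [out_labels[i] for i in order]


# Made-up example notes: 1 = pregnancy mentioned, 0 = not mentioned.
notes = [
    "Patient is 24 weeks pregnant, routine prenatal visit.",
    "Follow-up for hypertension, no acute complaints.",
    "Knee pain after fall, x-ray negative.",
    "Annual physical, labs within normal limits.",
    "Seasonal allergies, prescribed antihistamine.",
]
labels = [1, 0, 0, 0, 0]

balanced_notes, balanced_labels = upsample_minority(notes, labels)
class_counts = Counter(balanced_labels)
print(class_counts)
```

Upsampling is generally applied only to the training split, after the test set has been held out, so that duplicated notes do not appear in both training and evaluation data.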