ML Model and Live Website to detect incomplete cancer pathology reports in real-time.
The aim of this project was to build a machine learning model and deploy it through a web application, so that pathologists can submit a bladder cancer report and receive a real-time email alert if the submitted report is classified as incomplete.
The key features required were:
- An NLP machine learning model to parse and classify a pathology report. This was achieved using a custom vectorizer and a multi-model architecture to make the most of the medical data provided.
- A live web application to deploy the model as a proof of concept for a viable and scalable solution that gives pathologists access to feedback. This was achieved by developing a web application using Flask and deploying it on Heroku.
- During EDA and model development, a set of significant keywords emerged as key identifiers in the reports, which aligned well with the CRGC's mission to motivate better-structured reports. Therefore, the keywords used are also displayed for each submitted pathology report.
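The keyword feature above can be illustrated with a minimal TF-IDF sketch. This is not the project's custom vectorizer, whose details are not described here; it is a plain standard-library illustration of how TF-IDF scores could surface report-specific terms. The function names (`tokenize`, `fit_idf`, `top_keywords`), the smoothed IDF formula, and the three sample reports are all assumptions made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def fit_idf(documents):
    """Compute a smoothed inverse document frequency for every term."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        # Count each term at most once per document.
        doc_freq.update(set(tokenize(doc)))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1 (an assumed, common variant).
    return {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }


def top_keywords(report, idf, k=3):
    """Return the k terms with the highest TF-IDF score in one report."""
    counts = Counter(tokenize(report))
    total = sum(counts.values())
    scores = {
        term: (count / total) * idf[term]
        for term, count in counts.items()
        if term in idf  # ignore terms never seen during fitting
    }
    # Sort by descending score; break ties alphabetically for stable output.
    return sorted(scores, key=lambda term: (-scores[term], term))[:k]


# Hypothetical, made-up report snippets used only for illustration.
reports = [
    "urothelial carcinoma high grade muscularis propria invasion present",
    "urothelial carcinoma low grade no invasion identified",
    "urothelial carcinoma specimen received grade not stated",
]

idf = fit_idf(reports)
keywords = top_keywords(reports[0], idf, k=3)
print(keywords)
```

Terms that occur in every report, such as "urothelial", receive the lowest IDF and drop out of the keyword list, while terms specific to one report rise to the top. A production vectorizer for medical text would likely also need domain-specific handling of abbreviations, negation, and multi-word terms.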
The multi-layer ML model achieved 96% accuracy in classifying bladder cancer pathology reports as incomplete. The deployment of the model on the website serves as a viable proof of concept for reducing pathology report error correction time by 5-6 days and saving an estimated 200-250 lives per year.
Live Website: https://crgc-mvp.herokuapp.com/ (Under development)
Final Deliverable Demo: https://www.youtube.com/embed/gZLGlP98EsA
Final Deliverable Presentation: https://drive.google.com/open?id=1srR26ON6Vu-ygoowqm9AqW7AJcM6NW0Y03QRXd-njM4
The following students worked on this project:
- Peru Dayani
  - Developed the model architecture and models using Python and scikit-learn.
  - Developed a custom TF-IDF text vectorizer for medical data.
  - Developed the web app using Flask, Python, HTML, Bootstrap and JS.
  - Deployed the web app on Heroku and maintains it.
  - Led team meetings with CRGC representatives.
- Carlos Calderon
  - Compared ML model architectures.
- Saumya Choudhary
  - Conducted EDA on the data.
  - Developed the data cleaning pipeline.