Skip to content

This application uses machine learning to predict the likelihood of Parkinson's Disease based on voice measurements.

License

Notifications You must be signed in to change notification settings

Mr-Imperium/Parkinson-Disease-Pred

Repository files navigation

Parkinson's Disease Prediction Model

Parkinson's Disease Python Streamlit scikit-learn

A machine learning web application that predicts the likelihood of Parkinson's Disease based on biomedical voice measurements.

📋 Overview

This project implements a Random Forest Classifier to predict Parkinson's Disease using various voice measurement features. The model is deployed as a web application using Streamlit, making it accessible for medical professionals and researchers.

🔍 Features

  • Trained on the UC Irvine ML Repository Parkinson's Dataset
  • Uses Random Forest Classifier with optimized hyperparameters
  • Interactive web interface for real-time predictions
  • Visualization of feature importance and predictions
  • Cross-validated model with detailed performance metrics

🛠️ Technical Architecture

  • Model: Random Forest Classifier with GridSearchCV optimization
  • Web Framework: Streamlit
  • Data Processing: Pandas, NumPy
  • Visualization: Plotly Express
  • Model Persistence: Pickle, Joblib

📊 Model Performance

The model is trained using GridSearchCV with the following hyperparameters:

  • Number of estimators: [100, 200]
  • Max depth: [10, 15, 20]
  • Min samples split: [2, 5]
  • Min samples leaf: [1, 2]
  • Max features: ['sqrt', 'log2']

Performance metrics on the test set are available in the training output.

🚀 Getting Started

Prerequisites

  • Python 3.11 or higher
  • pip package manager

Installation

  1. Clone the repository:
git clone https://github.com/yourusername/parkinson-disease-pred.git
cd parkinson-disease-pred
  1. Install required packages:
pip install -r requirements.txt
  1. Train the model (optional, pre-trained model included):
python train_model.py
  1. Run the web application:
streamlit run app.py

Dataset

The dataset used for training includes various biomedical voice measurements from subjects with and without Parkinson's Disease. Key features include:

  • MDVP measurements (Fo, Fhi, Flo, etc.)
  • Jitter and Shimmer measurements
  • NHR and HNR ratios
  • Status (health status of the subject)

📖 Usage

  1. Launch the web application
  2. Input the required voice measurements
  3. Click "Predict" to get the model's prediction
  4. View the visualization of feature importance and prediction results

🔧 Model Training Process

The train_model.py script performs the following steps:

  1. Loads and preprocesses the dataset
  2. Splits data into training and test sets
  3. Performs GridSearchCV for hyperparameter optimization
  4. Trains the Random Forest model
  5. Evaluates model performance
  6. Saves the trained model

📈 Feature Importance

The model identifies the most significant voice measurements for prediction. The top features and their importance scores are displayed in the training output.

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the project
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • UCI Machine Learning Repository for the Parkinson's Disease dataset
  • Streamlit team for the excellent web framework
  • scikit-learn developers for the machine learning tools

📞 Contact

For questions and feedback, please open an issue in the GitHub repository.

⚠️ Disclaimer

This tool is for research and educational purposes only. It should not be used as a substitute for professional medical diagnosis. Always consult with healthcare professionals for medical advice and diagnosis.

About

This application uses machine learning to predict the likelihood of Parkinson's Disease based on voice measurements.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages