Asthma disease #724

Merged: 4 commits, Jul 28, 2024
2,393 changes: 2,393 additions & 0 deletions Asthma Disease Detection/Dataset/asthma_disease_data.csv

Large diffs are not rendered by default.

59 changes: 59 additions & 0 deletions Asthma Disease Detection/Models/README.md
@@ -0,0 +1,59 @@
# Asthma Disease Detection - Models

## Models Implemented
- Logistic Regression
- Random Forest
- Gradient Boosting
- Support Vector Machine
- XGBoost
- K-Nearest Neighbors
- AdaBoost
- Extra Trees
- Bagging
- CatBoost
- LightGBM
- Naive Bayes
- Decision Tree
- Stacking Classifier

## Performance of the Models based on Accuracy Scores
- Logistic Regression: 95.20%
- Random Forest: 95.20%
- Gradient Boosting: 94.99%
- Support Vector Machine: 95.20%
- XGBoost: 95.20%
- K-Nearest Neighbors: 95.20%
- AdaBoost: 95.20%
- Extra Trees: 95.20%
- Bagging: 94.78%
- CatBoost: 95.20%
- LightGBM: 95.20%
- Naive Bayes: 95.20%
- Decision Tree: 87.47%
- Stacking Classifier: 95.20%

![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_2.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_4.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_6.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_8.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_10.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_12.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_14.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_16.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_18.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_20.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_22.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_24.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_26.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___23_28.png?raw=true)
![EDA](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___24_0.png?raw=true)

## Conclusion
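Eleven of the fourteen models (Logistic Regression, Random Forest, Support Vector Machine, XGBoost, K-Nearest Neighbors, AdaBoost, Extra Trees, CatBoost, LightGBM, Naive Bayes, and the Stacking Classifier) tied for the highest accuracy, 95.20%. The Decision Tree performed worst at 87.47%. Accuracy alone does not separate the models, though: the ROC AUC scores in `Results/model_results.csv` lie between 0.38 and 0.54, close to the 0.5 expected from a random ranking, and the dataset is imbalanced. The tie is therefore consistent with models that mostly predict the majority class, and these results do not show that ensemble or gradient boosting methods capture patterns in this data better than simpler models.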
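
The identical 95.20% scores are worth checking against a trivial baseline. The sketch below is an illustration, not part of the project code. The accuracy stored in `model_results.csv` (0.9519832985386222) equals 456/479, so the example assumes a hypothetical test split of 479 samples with 456 negatives and 23 positives; the real class counts are not stated in this README. It shows what a model that predicts the negative class for every sample would score.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical test split (assumed counts): 456 negatives, 23 positives.
y_true = [0] * 456 + [1] * 23
# Baseline that ignores the features and always predicts the negative class.
y_pred = [0] * len(y_true)

print(f"accuracy:          {accuracy(y_true, y_pred):.2%}")           # 95.20%
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.2%}")  # 50.00%
```

Under these assumed counts the baseline matches the reported top accuracy while detecting no positive case, which is why a metric such as balanced accuracy or recall is a useful companion to plain accuracy here.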

## Signature
**Name:** Aditya D
**GitHub:** [https://www.github.com/adi271001](https://www.github.com/adi271001)
**LinkedIn:** [https://www.linkedin.com/in/aditya-d-23453a179/](https://www.linkedin.com/in/aditya-d-23453a179/)
**Topmate:** [https://topmate.io/aditya_d/](https://topmate.io/aditya_d/)
**Twitter:** [https://x.com/ADITYAD29257528](https://x.com/ADITYAD29257528)


91 changes: 91 additions & 0 deletions Asthma Disease Detection/README.md
@@ -0,0 +1,91 @@
# Asthma Disease Detection

## Goal
The goal of this project is to build a machine learning model to accurately detect asthma disease using various classification algorithms.

## Dataset
The dataset comes from the [Asthma Disease Dataset on Kaggle](https://www.kaggle.com/datasets/rabieelkharoua/asthma-disease-dataset). A copy is included in this project at `Dataset/asthma_disease_data.csv`.

## Description
This project involves training and evaluating multiple machine learning models to detect asthma disease. The models' performance is compared using metrics such as accuracy and ROC AUC score. Confusion matrices are also plotted for each model to visualize the performance.
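
To make the three evaluation tools concrete, here is a minimal from-scratch sketch of a binary confusion matrix, accuracy, and ROC AUC. It uses toy labels and scores that are not taken from the asthma dataset, and the function names are illustrative only; the project itself most likely relies on the scikit-learn equivalents.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels coded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 1 and p == 0:
            fn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for illustration only.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8, 0.3, 0.9]
y_pred = [int(s >= 0.5) for s in scores]  # threshold the scores at 0.5

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
acc = (tn + tp) / len(y_true)
auc = roc_auc(y_true, scores)

print(f"tn={tn} fp={fp} fn={fn} tp={tp}")  # tn=3 fp=1 fn=1 tp=1
print(f"accuracy={acc:.3f}")               # accuracy=0.667
print(f"roc_auc={auc:.3f}")                # roc_auc=0.625
```

Accuracy depends on the chosen threshold, while ROC AUC depends only on how the scores rank positives against negatives, so the two can disagree.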

## What I Have Done
1. Loaded the asthma disease dataset from Kaggle.
2. Split the dataset into training and testing sets.
3. Implemented multiple machine learning models.
4. Trained and evaluated each model.
5. Plotted confusion matrices for each model.
6. Saved the results (accuracy and ROC AUC) to a CSV file.
7. Generated a comparison plot of the models' performance.
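
The steps above can be sketched roughly as follows. This is a hedged outline rather than the notebook's actual code: it substitutes synthetic data from `make_classification` for the Kaggle CSV (assuming about 5% positives), trains only two of the listed models, prints confusion matrices instead of plotting them, and writes the results to an in-memory buffer instead of a file.

```python
import csv
import io

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the dataset (assumption: roughly 5% positives).
X, y = make_classification(
    n_samples=2000, n_features=20, weights=[0.95, 0.05], random_state=0
)

# Stratify so the train and test splits keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    # ROC AUC is computed from positive-class scores, not hard labels.
    y_score = model.predict_proba(X_test)[:, 1]
    rows.append({
        "Model": name,
        "Accuracy": accuracy_score(y_test, y_pred),
        "ROC AUC": roc_auc_score(y_test, y_score),
    })
    print(name, confusion_matrix(y_test, y_pred).tolist())

# Same column layout as Results/model_results.csv.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Model", "Accuracy", "ROC AUC"])
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Passing predicted probabilities to `roc_auc_score` matters: feeding it hard 0/1 predictions collapses the ranking and can understate a model's AUC.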

## Models Implemented
- Logistic Regression
- Random Forest
- Gradient Boosting
- Support Vector Machine
- XGBoost
- K-Nearest Neighbors
- AdaBoost
- Extra Trees
- Bagging
- CatBoost
- LightGBM
- Naive Bayes
- Decision Tree
- Stacking Classifier

## Libraries Needed
- pandas
- numpy
- matplotlib
- seaborn
- scikit-learn
- xgboost
- catboost
- lightgbm
- mlxtend

## EDA Results
Exploratory Data Analysis (EDA) revealed the following key points:
- Feature distributions and correlations
- Class imbalance in the dataset
- Identification of important features for asthma disease detection
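
Class imbalance is simple to quantify before any modelling. The sketch below counts label shares using only the standard library. The inline CSV is invented for illustration, and the column names (`PatientID`, `Age`, `Diagnosis`) are assumptions about the dataset's schema rather than something this README confirms.

```python
import csv
import io
from collections import Counter

# Invented stand-in rows; the real file is Dataset/asthma_disease_data.csv.
# The label column name "Diagnosis" is an assumption.
SAMPLE_CSV = """PatientID,Age,Diagnosis
1,34,0
2,51,0
3,8,0
4,67,0
5,22,1
6,45,0
7,19,0
8,73,0
9,30,0
10,58,0
"""


def class_shares(csv_text, label_column):
    """Return {label: share of rows} for the given label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[label_column] for row in reader)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


shares = class_shares(SAMPLE_CSV, "Diagnosis")
for label, share in sorted(shares.items()):
    print(f"class {label}: {share:.1%}")
```

The share of the largest class is also the accuracy of a model that always predicts that class, which gives a floor for judging the accuracy scores reported later.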

![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___5_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___6_0.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___8_2.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___9_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___10_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___11_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___12_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___13_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___14_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___15_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___16_2.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___17_1.png?raw=true)
![eda](https://github.com/adi271001/ML-Crate/blob/Asthma-Disease/Asthma%20Disease%20Detection/Images/__results___18_0.png?raw=true)

## Performance of the Models based on Accuracy Scores
- Logistic Regression: 95.20%
- Random Forest: 95.20%
- Gradient Boosting: 94.99%
- Support Vector Machine: 95.20%
- XGBoost: 95.20%
- K-Nearest Neighbors: 95.20%
- AdaBoost: 95.20%
- Extra Trees: 95.20%
- Bagging: 94.78%
- CatBoost: 95.20%
- LightGBM: 95.20%
- Naive Bayes: 95.20%
- Decision Tree: 87.47%
- Stacking Classifier: 95.20%

## Conclusion
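Eleven of the fourteen models (Logistic Regression, Random Forest, Support Vector Machine, XGBoost, K-Nearest Neighbors, AdaBoost, Extra Trees, CatBoost, LightGBM, Naive Bayes, and the Stacking Classifier) tied at the highest accuracy of 95.20%, and the Decision Tree performed worst at 87.47%. These accuracies should be read alongside the class imbalance noted in the EDA and the ROC AUC scores saved to the results CSV, which range from 0.38 to 0.54 and so sit close to the 0.5 of a random ranking. On this evidence the models are hard to tell apart, and the results do not establish that ensemble or gradient boosting methods perform better on this dataset.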

## Signature
**Name:** Aditya D
**GitHub:** [https://www.github.com/adi271001](https://www.github.com/adi271001)
**LinkedIn:** [https://www.linkedin.com/in/aditya-d-23453a179/](https://www.linkedin.com/in/aditya-d-23453a179/)
**Topmate:** [https://topmate.io/aditya_d/](https://topmate.io/aditya_d/)
**Twitter:** [https://x.com/ADITYAD29257528](https://x.com/ADITYAD29257528)
15 changes: 15 additions & 0 deletions Asthma Disease Detection/Results/model_results.csv
@@ -0,0 +1,15 @@
Model,Accuracy,ROC AUC
Logistic Regression,0.9519832985386222,0.5225972540045767
Random Forest,0.9519832985386222,0.3812452326468344
Gradient Boosting,0.9498956158663883,0.4405034324942792
Support Vector Machine,0.9519832985386222,0.5087719298245613
XGBoost,0.9519832985386222,0.4032227307398932
K-Nearest Neighbors,0.9519832985386222,0.5416666666666666
AdaBoost,0.9519832985386222,0.4804538520213577
Extra Trees,0.9519832985386222,0.4488939740655987
Bagging,0.9478079331941545,0.4977116704805492
CatBoost,0.9519832985386222,0.4680587337909992
LightGBM,0.9519832985386222,0.4597635392829901
Naive Bayes,0.9519832985386222,0.5242181540808543
Decision Tree,0.8747390396659708,0.5007151029748284
Stacking Classifier,0.9519832985386222,0.49890350877192985
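
The ROC AUC column reads differently from the accuracy column. As a quick check, the sketch below parses the rows above (copied verbatim into a string) with the standard library, ranks the models by ROC AUC, and flags those at or below 0.5, the value expected from a random ranking. The variable names are illustrative.

```python
import csv
import io

RESULTS_CSV = """Model,Accuracy,ROC AUC
Logistic Regression,0.9519832985386222,0.5225972540045767
Random Forest,0.9519832985386222,0.3812452326468344
Gradient Boosting,0.9498956158663883,0.4405034324942792
Support Vector Machine,0.9519832985386222,0.5087719298245613
XGBoost,0.9519832985386222,0.4032227307398932
K-Nearest Neighbors,0.9519832985386222,0.5416666666666666
AdaBoost,0.9519832985386222,0.4804538520213577
Extra Trees,0.9519832985386222,0.4488939740655987
Bagging,0.9478079331941545,0.4977116704805492
CatBoost,0.9519832985386222,0.4680587337909992
LightGBM,0.9519832985386222,0.4597635392829901
Naive Bayes,0.9519832985386222,0.5242181540808543
Decision Tree,0.8747390396659708,0.5007151029748284
Stacking Classifier,0.9519832985386222,0.49890350877192985
"""

rows = [
    {
        "model": r["Model"],
        "accuracy": float(r["Accuracy"]),
        "auc": float(r["ROC AUC"]),
    }
    for r in csv.DictReader(io.StringIO(RESULTS_CSV))
]

# Rank by ROC AUC rather than accuracy.
ranked = sorted(rows, key=lambda r: r["auc"], reverse=True)
for r in ranked:
    flag = "" if r["auc"] > 0.5 else "  <= 0.5"
    print(f'{r["model"]:<24} acc={r["accuracy"]:.4f} auc={r["auc"]:.4f}{flag}')

best = ranked[0]
at_or_below_chance = [r["model"] for r in rows if r["auc"] <= 0.5]
print(f'best by ROC AUC: {best["model"]} ({best["auc"]:.4f})')
print(f"models at or below 0.5: {len(at_or_below_chance)} of {len(rows)}")
```

Ranked this way, K-Nearest Neighbors leads with about 0.54, and nine of the fourteen models fall at or below 0.5, so none of them shows much ability to rank positive cases above negative ones on the test split.
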
9 changes: 9 additions & 0 deletions Asthma Disease Detection/requirements.txt
@@ -0,0 +1,9 @@
pandas==2.0.3
numpy==1.25.0
matplotlib==3.7.2
seaborn==0.12.2
scikit-learn==1.3.0
xgboost==1.7.6
catboost==1.2
lightgbm==4.0.0
mlxtend==0.21.0