Diabetes Prediction

A machine learning model that predicts whether a patient is likely to have diabetes.

Introduction

The prevalence of diabetes is increasing globally. Diabetes is a significant public health concern due to its associated health complications and economic burden on individuals, families, and healthcare systems. As a result, there is a growing need for data-driven approaches to diabetes prevention, early detection, and management.

Diabetes is a chronic metabolic disorder characterized by high blood glucose levels resulting from defects in insulin secretion, insulin action, or both. Insulin is a hormone produced by the pancreas that regulates blood glucose levels by facilitating the uptake of glucose from the blood into cells for energy production or storage. In individuals with diabetes, the body either does not produce enough insulin or cannot effectively use the insulin produced, resulting in high blood glucose levels, which can lead to various complications, such as damage to the kidneys, nerves, eyes, and blood vessels.

Overview of the data

The dataset for this project was obtained from GitHub. It is used to predict whether a patient is likely to have diabetes from input parameters such as Pregnancies, Glucose, Blood Pressure, Insulin, and Body Mass Index. All patients in the dataset are at least 21 years old.

Features of the dataset:

The dataset contains records for 2000 individuals, each with 9 columns: 8 input features and the Outcome label. Each column is described below:

Pregnancies: indicates the number of pregnancies
Glucose: indicates the plasma glucose concentration
Blood Pressure: indicates diastolic blood pressure in mm Hg
Skin Thickness: indicates triceps skinfold thickness in mm
Insulin: indicates insulin in µU/mL
BMI: indicates the body mass index in kg/m²
Diabetes Pedigree Function: indicates a score of the likelihood of diabetes based on family history
Age: indicates the age of the person in years
Outcome: indicates whether the patient has diabetes (1 = yes, 0 = no)
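
As a rough illustration of the layout described above, the sketch below reads a few rows in that shape with pandas and separates the input features from the Outcome label. The column spellings (for example `BloodPressure`, `DiabetesPedigreeFunction`), the helper name `load_dataset`, and the sample rows are assumptions made for this example; they are not taken from the project's actual file.

```python
import io

import pandas as pd

# Hypothetical sample rows in the column layout described above. The values
# are made up for illustration and are not from the project's dataset.
SAMPLE_CSV = """Pregnancies,Glucose,BloodPressure,SkinThickness,Insulin,BMI,DiabetesPedigreeFunction,Age,Outcome
2,138,62,35,0,33.6,0.127,47,1
0,84,82,31,125,38.2,0.233,23,0
0,145,0,0,0,44.2,0.630,31,1
1,97,70,15,0,18.2,0.147,21,0
"""


def load_dataset(source):
    """Read the CSV and split it into feature columns and the Outcome label."""
    df = pd.read_csv(source)
    features = df.drop(columns=["Outcome"])
    target = df["Outcome"]
    return df, features, target


# In practice `source` would be a path to the CSV file; an in-memory buffer
# is used here so the example runs on its own.
df, X, y = load_dataset(io.StringIO(SAMPLE_CSV))
print(df.shape)            # (rows, columns)
print(X.columns.tolist())  # the 8 input features
print(y.value_counts())    # class balance of the label
```

Checking the shape and the class balance first is a cheap way to confirm that the file matches the description before any modeling is done.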

Libraries Used

Pandas: used for data manipulation and analysis
NumPy: used for N-dimensional arrays, matrices, and linear algebra
Seaborn: used for data visualization
Matplotlib: used for data visualization
scikit-learn: used for machine learning algorithms

Techniques Used

Data Cleaning
Data Visualization
Machine Learning Modeling
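
The specific cleaning steps are not listed here. As a minimal sketch of what the data cleaning stage might look like, the example below drops duplicate rows and treats zeros in selected measurement columns as missing, filling them with the column median. Both steps, the helper name `clean`, and the sample rows are assumptions for illustration, not a record of what the notebook does.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only; the last two rows are duplicates.
df = pd.DataFrame({
    "Glucose": [138, 84, 0, 97, 97],
    "BMI": [33.6, 38.2, 44.2, 0.0, 0.0],
    "Age": [47, 23, 31, 21, 21],
    "Outcome": [1, 0, 1, 0, 0],
})


def clean(frame, zero_as_missing=("Glucose", "BMI")):
    """Drop duplicate rows, then median-impute zeros in the given columns."""
    out = frame.drop_duplicates().copy()
    cols = list(zero_as_missing)
    # Assumption: a zero in these columns means "not recorded", since a
    # glucose level or BMI of zero is not physiologically plausible.
    out[cols] = out[cols].replace(0, np.nan)
    out[cols] = out[cols].fillna(out[cols].median())
    return out.reset_index(drop=True)


cleaned = clean(df)
print(cleaned)
```

Median imputation is only one option; dropping the affected rows or imputing per class are alternatives, and the right choice depends on how many values are affected.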

Algorithm Used

Decision Tree
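
A decision tree can be trained with scikit-learn's `DecisionTreeClassifier`. The sketch below is one possible setup, not the notebook's actual configuration: it uses synthetic stand-in data with 8 numeric features, and the 75/25 split, `max_depth=4`, and the random seeds are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 200 rows, 8 numeric features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 1] + 0.5 * X[:, 5] > 0).astype(int)  # label driven by two columns

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# max_depth is an illustrative choice, not a value taken from the project.
model = DecisionTreeClassifier(max_depth=4, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)              # hard 0/1 predictions
y_prob = model.predict_proba(X_test)[:, 1]  # estimated probability of class 1
print("test accuracy:", model.score(X_test, y_test))
```

Limiting the depth is a common way to keep a single tree from memorizing the training set; `predict_proba` gives a likelihood-style score when a yes/no answer is too coarse.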

Metrics

Accuracy
Precision
Sensitivity (Recall)
Specificity
F-score
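
All five metrics above can be derived from the four counts of a binary confusion matrix: true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). The sketch below computes them in plain Python. The function name `classification_metrics` and the example labels are made up for illustration, and the F-score is assumed to be F1 (the harmonic mean of precision and recall).

```python
def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics, treating 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)    # of predicted positives, how many are right
    recall = ratio(tp, tp + fn)       # sensitivity: of actual positives, how many are found
    specificity = ratio(tn, tn + fp)  # of actual negatives, how many are found
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": ratio(2 * precision * recall, precision + recall),
    }


# Made-up labels: 4 diabetic and 6 non-diabetic patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this example TP = 3, FN = 1, TN = 4, and FP = 2, so accuracy is 7/10 while recall is 3/4 and specificity is 4/6. For a screening task, recall is often the metric to watch, since a missed case (false negative) is usually costlier than a false alarm.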

Mariamajib/Diabetes-Prediction
