Linear Regression


For this linear regression example, we will be using the heart disease dataset, which is a public health dataset that can be retrieved from Kaggle.


For this particular example, we will use only two fields: trestbps (resting blood pressure in mm Hg) and thalach (maximum heart rate achieved). The two fields are only weakly correlated, but for demonstration purposes we will use them to estimate a linear regression with existing scikit-learn routines and also with manual calculations in Python.

To calculate the intercept, the b in y = mx + b, we use the following formula:

  • Intercept: $b = \dfrac{(\sum Y)(\sum X^2) - (\sum X)(\sum XY)}{n(\sum X^2) - (\sum X)^2}$

To calculate the slope, the m in y = mx + b, we use the following formula:

  • Slope: $m = \dfrac{n(\sum XY) - (\sum X)(\sum Y)}{n(\sum X^2) - (\sum X)^2}$
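The two formulas above can be sketched in plain Python as follows. This is a minimal illustration, not the repository's own code: the function name `fit_line` and the sample points are assumptions made up for this example and are not taken from the heart disease dataset.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Both formulas share the denominator n(ΣX²) - (ΣX)²,
    # which is zero when every x value is the same.
    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y * sum_x2 - sum_x * sum_xy) / denom
    return slope, intercept


# Made-up points lying exactly on y = 2x + 1 (illustrative only).
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(f"y = {slope}x + {intercept}")  # prints "y = 2.0x + 1.0"
```

Because the sample points sit exactly on a line, the recovered slope and intercept match that line, which makes the formulas easy to check by hand.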

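We then compare our manually calculated values with what scikit-learn calculates. The snippet below uses SciPy's `stats.linregress`, which fits the same least-squares line.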

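import matplotlib.pyplot as plt
from scipy import stats

# linear_table is assumed to be a pandas DataFrame with X = trestbps and Y = thalach.
# linregress returns the slope, intercept, correlation coefficient r,
# p-value, and standard error of the slope.
slope, intercept, r, p, std_err = stats.linregress(linear_table["X"], linear_table["Y"])
print(f"y = {slope}x + {intercept}")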