This project demonstrates how to predict height from weight with a simple linear regression model in Python. It uses Pandas, Matplotlib, NumPy, Seaborn, and Scikit-learn.
The project uses the "Weight-Height Polynomial Dataset.csv" file, which contains weight and height data.
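A minimal sketch of loading the file with Pandas, assuming the CSV sits in the working directory. The helper names `load_dataset` and `summarize` are illustrative and not taken from the notebook. The column names are not stated here, so the sketch only reports whatever columns it finds.

```python
from pathlib import Path

import pandas as pd

# Assumption: the dataset file is in the current working directory.
CSV_PATH = Path("Weight-Height Polynomial Dataset.csv")


def load_dataset(path=CSV_PATH):
    """Read the CSV into a DataFrame (hypothetical helper)."""
    return pd.read_csv(path)


def summarize(df):
    """Return a few basic facts useful for a first look at the data."""
    return {
        "rows": len(df),
        "columns": list(df.columns),
        "missing": int(df.isna().sum().sum()),
    }


# Only run the exploration when the file is actually present.
if CSV_PATH.exists():
    df = load_dataset()
    print(df.head())        # first few rows
    print(df.describe())    # summary statistics per numeric column
    print(summarize(df))
```

Checking for missing values before modelling is a cheap way to catch rows that would otherwise make the regression fail.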
- Pandas: For data manipulation and analysis.
- Matplotlib: For creating visualizations.
- NumPy: For numerical computations.
- Seaborn: For enhanced visualizations.
- Scikit-learn: For building and evaluating the linear regression model.
- Data Loading and Exploration: The dataset is loaded using Pandas, and basic exploratory data analysis is performed.
- Data Visualization: Scatter plots and pair plots are used to visualize the relationship between weight and height.
- Data Preprocessing: The data is split into training and testing sets using `train_test_split`. The weight feature is standardized using `StandardScaler`.
- Model Building: A linear regression model is created and trained using the training data.
- Model Evaluation: The model's performance is evaluated using metrics like Mean Squared Error (MSE), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and R-squared.
- Prediction: The trained model is used to predict height for new weight values.
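The preprocessing, training, evaluation, and prediction steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: it uses synthetic data in place of the CSV, and the column names `Weight` and `Height`, the 25% test split, and the random seed are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataset (assumed column names and units).
rng = np.random.default_rng(42)
weight = rng.uniform(45, 100, size=200)
height = 120 + 0.6 * weight + rng.normal(0, 3, size=200)
df = pd.DataFrame({"Weight": weight, "Height": height})

X = df[["Weight"]]  # 2-D feature matrix, as scikit-learn expects
y = df["Height"]    # 1-D target

# Split before scaling so test data never influences the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Fit the scaler on the training set only, then reuse it on the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Train the linear regression model.
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Evaluate on the held-out test set.
y_pred = model.predict(X_test_scaled)
mse = mean_squared_error(y_test, y_pred)
mae = mean_absolute_error(y_test, y_pred)
rmse = np.sqrt(mse)
r2 = r2_score(y_test, y_pred)
print(f"MSE={mse:.3f}  MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")

# Predict height for new weights; they must pass through the same scaler.
new_weights = pd.DataFrame({"Weight": [60.0, 80.0]})
predicted_heights = model.predict(scaler.transform(new_weights))
print(predicted_heights)
```

Fitting the scaler on the training split alone keeps information from the test set out of the model. New inputs at prediction time go through that same fitted scaler.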
- Clone the repository:
git clone <repository_url>
- Install dependencies:
pip install pandas matplotlib numpy seaborn scikit-learn
- Run the Jupyter Notebook: Open and run the notebook to see the code and results.
The model achieves an R-squared score of [insert R-squared score here], indicating a [good/moderate/poor] fit to the data.
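For reference, R-squared compares the model's squared error with that of always predicting the mean height: a value near 1 means the model explains most of the variance, while a value near 0 means it does little better than the mean. The sketch below computes it by hand on made-up numbers (not results from this project) and checks it against Scikit-learn's `r2_score`. The helper name `r_squared` is illustrative.

```python
import numpy as np
from sklearn.metrics import r2_score


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (hypothetical helper for illustration)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # error left after the fit
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of the mean baseline
    return 1.0 - ss_res / ss_tot


# Made-up heights in cm, purely for illustration.
y_true = [150.0, 160.0, 170.0, 180.0]
y_pred = [152.0, 158.0, 171.0, 179.0]

# Here SS_res = 10 and SS_tot = 500, so R^2 = 1 - 10/500 = 0.98.
manual = r_squared(y_true, y_pred)
library = r2_score(y_true, y_pred)
print(manual, library)
```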