DSBA_int_Spark

A repository of my work from my Data Science internship at The Sparks Foundation.

Task 1: Supervised Machine Learning

Predicting a student's percentage score from the number of hours studied, using Simple Linear Regression.
Given Dataset: student_scores.csv
Libraries used: NumPy, Pandas, Matplotlib, Scikit-learn.
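
As a rough illustration of the approach in Task 1, the sketch below fits a one-feature linear regression with Scikit-learn. The hours and scores are made-up placeholder values, not rows from student_scores.csv, and the variable names are only illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Placeholder data (NOT from student_scores.csv): scores follow 10 * hours + 10.
hours = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # shape (n_samples, 1)
scores = np.array([20.0, 30.0, 40.0, 50.0, 60.0])

# Fit a straight line: score = slope * hours + intercept.
model = LinearRegression()
model.fit(hours, scores)

# Predict the score for a student who studies 6 hours.
predicted = model.predict(np.array([[6.0]]))[0]

print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"predicted score for 6.0 hours: {predicted:.2f}")
```

With the real dataset, the two arrays would instead come from the corresponding columns of the CSV, and part of the data would normally be held out to check the fit.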

Task 2: EDA

Performing Exploratory Data Analysis on the Sample Superstore dataset to identify business problems and weak areas where profit can be improved.
Given Dataset: SampleSuperstore.csv
Libraries used: NumPy, Pandas, Matplotlib, Seaborn.
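
One way to look for weak areas in Task 2 is to aggregate profit by a grouping column. The sketch below is a minimal example under assumptions: the rows are invented, and the column names ("Category", "Sales", "Profit") are assumed for illustration and may not match SampleSuperstore.csv exactly.

```python
import pandas as pd

# Hypothetical rows for illustration only; column names are assumptions.
df = pd.DataFrame({
    "Category": ["Furniture", "Furniture", "Technology", "Office Supplies"],
    "Sales": [200.0, 100.0, 500.0, 50.0],
    "Profit": [-30.0, 10.0, 120.0, 15.0],
})

# Total sales and profit per category.
summary = df.groupby("Category")[["Sales", "Profit"]].sum()

# Profit margin = profit / sales; a low or negative value marks a weak area.
summary["Margin"] = summary["Profit"] / summary["Sales"]

# Category with the lowest margin.
weakest = summary.sort_values("Margin").index[0]

print(summary)
print("Lowest-margin category:", weakest)
```

The same grouping could be repeated over other columns of the real dataset to see where losses concentrate.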

Task 3: Unsupervised Machine Learning

Predicting the optimum number of clusters for the given Iris dataset and representing it visually.
Clustering the iris flower species using K-Means Clustering.
Given Dataset: Iris.csv
Libraries used: NumPy, Pandas, Matplotlib, Seaborn, Scikit-learn.
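
A common way to estimate the number of clusters in Task 3 is the elbow method: fit K-Means for several values of k and look for the point where the within-cluster sum of squares stops dropping sharply. The sketch below assumes Scikit-learn's bundled copy of the Iris data as a stand-in for Iris.csv, and the final choice of k = 3 is an assumption for illustration, not a result stated in this repository.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Stand-in for Iris.csv: the four numeric measurements for 150 flowers.
X = load_iris().data

# Within-cluster sum of squares (inertia) for k = 1..6.
k_values = list(range(1, 7))
inertias = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(X)
    inertias.append(km.inertia_)

# Print the values; plotting inertia against k shows the "elbow".
for k, wcss in zip(k_values, inertias):
    print(k, round(wcss, 2))

# Assumed choice of k for illustration; read the elbow off your own plot.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
```

Plotting `k_values` against `inertias` with Matplotlib gives the visual representation, and `labels` can then be used to color a scatter plot of the clusters.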