University projects on unsupervised and supervised ML algorithms using scikit-learn

Stormy95/TP_ML_Scikit_learn

TP 1 = Unsupervised: PCA and Clustering (KMeans and AgglomerativeClustering)

  • Using the scikit-learn Python library, we analyse 2 datasets (crimes and startup), study the impact of PCA dimensionality reduction on the data, and show how the data can be re-expressed in a lower-dimensional space.
  • In a second part, on the villes.csv dataset, we compare KMeans, AgglomerativeClustering (ward linkage) and AgglomerativeClustering (average linkage) at finding appropriate clusters of cities after applying PCA dimensionality reduction to the dataset.
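
The workflow described above (PCA, then a comparison of three clustering methods) can be sketched roughly as follows. This is only a minimal illustration, not the code of the TP: it assumes synthetic data from `make_blobs` in place of villes.csv, 2 principal components, 3 clusters, and the silhouette score as the comparison metric, none of which is stated in this README.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a numeric feature table (assumption: NOT villes.csv).
X, _ = make_blobs(n_samples=150, n_features=6, centers=3, random_state=0)

# Standardize the features, then project onto the first two principal components.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_.sum()

# The three clustering methods named in the README (3 clusters is an assumption).
models = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "agglo_ward": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "agglo_average": AgglomerativeClustering(n_clusters=3, linkage="average"),
}

# Cluster the reduced data and score each partition with the silhouette coefficient.
scores = {}
for name, model in models.items():
    labels = model.fit_predict(X_2d)
    scores[name] = silhouette_score(X_2d, labels)

print(f"variance kept by 2 components: {explained:.2f}")
for name, score in scores.items():
    print(f"{name}: silhouette = {score:.3f}")
```

On real data, the number of components is usually chosen by inspecting `explained_variance_ratio_`, and the number of clusters may need to be tuned rather than fixed in advance.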

TP 2 = Supervised: MLP, Random Forest, Decision Tree, AdaBoost, Bagging, KNN

  • 2 university projects on supervised learning using the scikit-learn library.
  • First project: we compare a decision tree and KNN on classification tasks without any data pre-processing. In a second part, we compare the performance of gradient boosting, random forests and logistic regression on a classification task.
  • Second project: we compare decision trees (CART and ID3), KNN, Naive Bayes, Random Forest, Bagging, multilayer perceptron (MLP) and AdaBoost on a classification task, fine-tune the hyperparameters of each algorithm and set up pipelines. In a second part, we learn how to handle a heterogeneous dataset and how to deal with missing values in numerical and categorical features. In a last part, we learn how to use text as input for a classification task (SPAM or NOT SPAM).
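
As a rough illustration of the pipeline ideas mentioned above, the sketch below handles a heterogeneous table with missing numerical and categorical values, tunes one hyperparameter with a grid search, and then classifies short texts as spam or not. Everything concrete here is an assumption for the example: the toy data, the column names (`age`, `income`, `city`), the choice of `RandomForestClassifier` and `MultinomialNB`, and the parameter grid are hypothetical and not taken from the TP.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# --- Part 1: heterogeneous data with missing values (hypothetical toy table) ---
df = pd.DataFrame({
    "age": [25, np.nan, 47, 35, 52, 23, 44, 41, np.nan, 29, 60, 33],
    "income": [30, 52, 61, np.nan, 75, 28, 58, 66, 35, 31, 80, 34],
    "city": ["Paris", "Lyon", np.nan, "Paris", "Lyon", "Paris",
             "Lyon", np.nan, "Paris", "Lyon", "Paris", "Lyon"],
})
y = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0])

# Numerical columns: fill gaps with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
preprocess = ColumnTransformer([
    ("num", numeric, ["age", "income"]),
    ("cat", categorical, ["city"]),
])
pipe = Pipeline([
    ("prep", preprocess),
    ("model", RandomForestClassifier(n_estimators=20, random_state=0)),
])

# Tune one hyperparameter through the pipeline with cross-validation.
search = GridSearchCV(pipe, {"model__max_depth": [2, None]}, cv=3)
search.fit(df, y)
preds = search.predict(df)
print("best params:", search.best_params_)

# --- Part 2: text as input (hypothetical toy messages) ---
texts = ["win free money now", "free prize claim now", "cheap pills win cash",
         "meeting at noon tomorrow", "see you at lunch", "project report attached"]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# TF-IDF turns each message into a sparse numeric vector that Naive Bayes can use.
spam_clf = Pipeline([("tfidf", TfidfVectorizer()), ("nb", MultinomialNB())])
spam_clf.fit(texts, labels)
spam_pred = spam_clf.predict(["claim your free cash prize"])
print("prediction:", spam_pred[0])
```

Keeping imputation and encoding inside the pipeline means they are re-fitted on each training fold during the grid search, which avoids leaking information from the validation fold.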

TP 3 = Anomaly detection: Isolation Forest

  • Using the scikit-learn Python library, we fit an Isolation Forest model for anomaly detection on a Mickey Mouse figure. We learn how to fine-tune the hyperparameters of the algorithm.
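
A minimal sketch of Isolation Forest anomaly detection is given below. The data layout is an assumption made for the example: three Gaussian blobs (a "head" and two "ears") stand in for the figure, with uniformly scattered points injected as anomalies. The values of `n_estimators` and `contamination` are illustrative, not the ones tuned in the TP.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Assumed layout: one large blob for the head and two smaller blobs for the ears.
head = rng.normal(loc=[0.0, 0.0], scale=0.6, size=(200, 2))
left_ear = rng.normal(loc=[-1.8, 1.8], scale=0.3, size=(50, 2))
right_ear = rng.normal(loc=[1.8, 1.8], scale=0.3, size=(50, 2))
inliers = np.vstack([head, left_ear, right_ear])

# Injected anomalies: points scattered uniformly over a wider square.
outliers = rng.uniform(low=-6.0, high=6.0, size=(20, 2))
X = np.vstack([inliers, outliers])

# contamination is the expected share of anomalies; here it matches the injected share.
model = IsolationForest(
    n_estimators=200,
    contamination=len(outliers) / len(X),
    random_state=0,
)
pred = model.fit_predict(X)      # +1 for inliers, -1 for flagged anomalies
scores = model.score_samples(X)  # lower score means more anomalous
n_flagged = int((pred == -1).sum())

print(f"flagged {n_flagged} of {len(X)} points as anomalies")
```

The main hyperparameters to experiment with are `n_estimators`, `max_samples` and `contamination`; the last one directly sets the threshold that separates inliers from anomalies.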
