Python-MachineLearning

In the remaining weeks, we will do some machine learning exercises with Python. The main package we will be using is scikit-learn, the de facto standard machine learning framework for Python. Because there are so many excellent tutorials on scikit-learn, I decided to use some of the existing ones rather than reinvent the wheel. Along the way, I will emphasize important points from my own experience.

We will start with the scikit-learn basic tutorial. Scikit-learn is famous for its excellent documentation, which is also a great resource for machine learning in general. This simple tutorial walks us through the organization of scikit-learn and introduces the generic methods fit() and predict(), which are implemented for various classifiers.
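As a rough illustration of the fit() and predict() pattern, here is a minimal sketch. It is not taken from the tutorial: the choice of the iris dataset, the two classifiers, and the variable names are assumptions made for this example only. The point is that different classifiers are trained and used through the same two calls.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Illustrative data: the iris dataset bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples so we predict on unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two unrelated classifiers, chosen arbitrarily for this sketch.
classifiers = {
    "knn": KNeighborsClassifier(n_neighbors=5),
    "svc": SVC(kernel="rbf"),
}

accuracies = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)          # learn from the training data
    y_pred = clf.predict(X_test)       # predict labels for held-out data
    accuracies[name] = accuracy_score(y_test, y_pred)

for name, acc in accuracies.items():
    print(f"{name}: {acc:.3f}")
```

Swapping in another classifier should only require changing the entries of the `classifiers` dictionary, since the loop relies on nothing beyond fit() and predict().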

We will then move on to a realistic example written up by an experienced data scientist. This time we will walk through the process of obtaining, cleaning up, and normalizing less-than-perfect data, and then comparing various classifiers with stratified cross-validation.

A copy of the churn dataset used in the write-up is included in this repo for your convenience.
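To give a feel for what comparing classifiers with stratified cross-validation can look like, here is a minimal sketch. It does not reproduce the write-up: it uses synthetic, imbalanced data from make_classification instead of the churn dataset, and the two classifiers, the scaler, and all variable names are assumptions for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for real data: two classes, roughly 85% / 15%.
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.85, 0.15], random_state=0
)

# Stratified folds keep the class ratio roughly the same in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Putting the scaler inside the pipeline means it is re-fit on the
# training part of each fold, so no information leaks from the test part.
candidates = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random_forest": make_pipeline(
        StandardScaler(), RandomForestClassifier(n_estimators=100, random_state=0)
    ),
}

cv_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in candidates.items()
}

for name, scores in cv_scores.items():
    print(f"{name}: mean={scores.mean():.3f} std={scores.std():.3f}")
```

With imbalanced classes, accuracy alone can be misleading, so it may be worth passing a different `scoring` value (for example "f1" or "roc_auc") when running a comparison like this.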

During these two weeks, I will also go over some basic concepts such as:

Finally, in the third week, you will train your own classifiers using scikit-learn and the p53 Mutants Data Set, and I will be walking around to answer your questions.