i529/Python-MachineLearning at master · littleblackfish/i529

README.md

In the remaining weeks, we will do some machine learning exercises with Python. The main package we will be using is scikit-learn, the de facto standard machine learning framework for Python. Because there are so many excellent tutorials on scikit-learn, I decided to use some of the existing ones rather than reinvent the wheel. Along the way, I will emphasize the important points based on my own experience.

We will start with the scikit-learn basic tutorial. Scikit-learn is famous for its excellent documentation, which is also a great resource for machine learning in general. This simple tutorial walks us through the organization of scikit-learn and introduces the generic methods fit() and predict(), which are implemented for various classifiers.
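As a rough illustration of the fit() and predict() pattern mentioned above, here is a minimal sketch. It assumes the digits dataset bundled with scikit-learn and a support vector classifier; the parameter values are illustrative choices, not necessarily the ones we will use in class.

```python
from sklearn import datasets, svm

# Load the small handwritten-digits dataset that ships with scikit-learn.
digits = datasets.load_digits()
X, y = digits.data, digits.target

# Every scikit-learn classifier exposes the same two methods:
# fit(X, y) learns from labelled data, predict(X) labels new samples.
clf = svm.SVC(gamma=0.001, C=100.0)  # illustrative hyperparameters
clf.fit(X[:-1], y[:-1])              # train on everything except the last sample

predicted = clf.predict(X[-1:])      # predict the held-out last sample
print("predicted:", predicted[0], "actual:", y[-1])
```

Swapping `svm.SVC` for another classifier class leaves the fit() and predict() calls unchanged, which is the point of the common interface.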

We will then move on to a realistic example written up by an experienced data scientist. This time we will walk through the process of obtaining, cleaning up and normalizing less-than-perfect data, and comparing various classifiers with stratified cross-validation.
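The normalize-then-compare workflow described above might look roughly like the sketch below. To stay self-contained it uses a synthetic, imbalanced dataset from make_classification rather than the churn data, and the two classifiers are arbitrary picks for illustration; the write-up may use different ones.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a real dataset (assumption: roughly 15% positives).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.85, 0.15], random_state=0
)

# Hypothetical pair of classifiers to compare.
classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Stratified folds keep the class ratio similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

mean_accuracy = {}
for name, clf in classifiers.items():
    # Putting the scaler in a pipeline means it is fit on the training
    # folds only, so no information leaks from the held-out fold.
    model = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(model, X, y, cv=cv)
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```

With imbalanced classes, accuracy alone can be misleading, which is one reason we will also look at other evaluation measures.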

A copy of the churn dataset used in the write-up is included in this repo for your convenience.

During these two weeks I will also go over some basic concepts such as:

Cross validation
ROC and area under ROC
Confusion matrix
Some other concepts regarding evaluation of classifiers
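For a first look at two of these measures, the following sketch computes a confusion matrix and the area under the ROC curve on a held-out test set. The data is synthetic and the classifier is an arbitrary choice, both assumptions made only to keep the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data (assumption, not course data).
X, y = make_classification(n_samples=400, n_features=8, random_state=0)

# Hold out 25% of the samples, preserving the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The confusion matrix needs hard class labels.
y_pred = clf.predict(X_test)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class

# The ROC curve needs a continuous score, here the probability of class 1.
y_score = clf.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, y_score)

print(cm)
print(f"area under ROC: {auc:.3f}")
```

Note the difference in inputs: the confusion matrix is built from predicted labels, while the ROC curve is built from scores, so it summarizes performance over all possible decision thresholds.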

Finally, in the third week, you will train your own classifiers using scikit-learn and the p53 Mutants Data Set, and I will be walking around answering your questions.
