GitHub - ubcs3/2016-Fall: UBC Scientific Software Seminar: Machine Learning in Python with scikit-learn

UBC Scientific Software Seminar

The UBC Scientific Software Seminar is inspired by Software Carpentry and its goal is to help students, graduates, fellows and faculty at UBC develop software skills for science.

Fall 2016: Machine Learning in Python with scikit-learn

OUTLINE

What are the learning goals?
- To learn how to use scikit-learn to solve machine learning problems
- To master Python programming for scientific computing
- To learn mathematics and statistics applied to data science and machine learning
- To meet and collaborate with other students and faculty interested in scientific computing
What software tools are we going to use?
- scikit-learn: machine learning in Python
- SciPy Stack: scientific computing with NumPy, SciPy, matplotlib and pandas
- Python
- Jupyter Notebooks: execute code with accompanying text, markdown and LaTeX all in the browser
- Git/GitHub: manage projects locally from the command line with Git and collaborate online with GitHub
What scientific topics will we study?
- Machine learning fundamentals (following tutorials provided by scikit-learn.org):
  - Regression, classification, clustering, dimensionality reduction
- Special topics:
  - Natural Language Processing
Where do we start? What are the prerequisites?
- UBCS3 Fall 2016 is a continuation of UBCS3 Summer 2016, which covered:
  - Bash shell
  - Git/GitHub
  - Python programming
  - SciPy stack: NumPy, SciPy, matplotlib and pandas
  - Basic examples using scikit-learn
- Calculus, linear algebra, probability and statistics
Who is the target audience?
- Everyone is invited!
- If the outline above is at your level, perfect! Get ready to write a lot of code!
- If the outline above seems too intimidating, come anyway! You'll learn things just by being exposed to new tools and ideas, and meeting new people!
- If you have experience with all the topics outlined above, come anyway! You'll become more of an expert by participating as a helper/instructor!

SCHEDULE

Fall 2016 will consist of weekly 1-hour meetings held from October until mid-December. The regularly scheduled time is Friday 1-2pm, with an additional session at 3-4pm for those who cannot attend at 1-2pm.

Week 1 - Friday October 7 - 1-2pm - LSK 121 [Notes]
- Overview of machine learning problems
- Exploring the scikit-learn documentation
- Getting to know the scikit-learn API
- First examples with built-in example datasets
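
As a rough illustration of the Week 1 topics, the sketch below loads one of scikit-learn's built-in datasets and walks through the usual estimator pattern of `fit`, `predict` and `score`. The choice of the iris dataset and of a K-nearest neighbors classifier with `n_neighbors=5` is an assumption made for this example, not necessarily what the session notes use.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

# Built-in example dataset: a feature matrix X and a target vector y.
iris = load_iris()
X, y = iris.data, iris.target
print(X.shape, y.shape)

# Most scikit-learn estimators follow the same pattern:
# construct the estimator, fit it to data, then predict or score.
model = KNeighborsClassifier(n_neighbors=5)  # n_neighbors chosen arbitrarily here
model.fit(X, y)

predictions = model.predict(X[:5])     # predicted class labels for the first 5 samples
training_accuracy = model.score(X, y)  # mean accuracy on the data used for fitting
print(predictions)
print(training_accuracy)
```

Scoring on the same data used for fitting gives an optimistic number; it is done here only to keep the first example short.
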
Week 2 - Friday October 14 - 1-2pm - LSK 121 [Notes]
- Regression Example: Diabetes dataset
  - A closer look at least squares linear regression calculations
  - Can we improve R²? Let's create more features
  - Splitting the dataset: Training data and testing data
- Classification Example: Hand-written digits dataset
  - K-nearest neighbors classifier
  - Evaluating the model
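
The Week 2 items can be sketched roughly as follows. This is a minimal example under stated assumptions: the extra features are degree-2 polynomial terms, the split is 75/25 with `random_state=0`, and the classifier uses `n_neighbors=5`. None of these choices come from the session notes.

```python
import numpy as np
from sklearn.datasets import load_diabetes, load_digits
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import PolynomialFeatures

# --- Regression: diabetes dataset ---
X, y = load_diabetes(return_X_y=True)

# Least squares by hand: prepend a column of ones for the intercept
# and solve min ||A c - y|| with NumPy.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# The same fit with scikit-learn; coef[0] matches reg.intercept_
# and coef[1:] matches reg.coef_ up to rounding.
reg = LinearRegression().fit(X, y)
r2_linear = reg.score(X, y)

# More features (assumed here: all degree-2 polynomial terms).
# R^2 on the training data cannot decrease when features are added.
X_poly = PolynomialFeatures(degree=2, include_bias=False).fit_transform(X)
r2_poly_train = LinearRegression().fit(X_poly, y).score(X_poly, y)

# Splitting into training and testing data shows whether the gain generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X_poly, y, test_size=0.25, random_state=0)
r2_poly_test = LinearRegression().fit(X_train, y_train).score(X_test, y_test)
print(r2_linear, r2_poly_train, r2_poly_test)

# --- Classification: hand-written digits dataset ---
digits = load_digits()
Xd_train, Xd_test, yd_train, yd_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)
knn = KNeighborsClassifier(n_neighbors=5).fit(Xd_train, yd_train)
knn_accuracy = knn.score(Xd_test, yd_test)  # accuracy on held-out digits
print(knn_accuracy)
```

Comparing `r2_poly_train` with `r2_poly_test` is the point of the split: a training score that is much higher than the test score suggests the added features are fitting noise.
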
Week 3 - Friday October 21 - 1-2pm - LSK 121 [Notes]
- Dimensionality reduction
- Principal component analysis
- Visualizing the digits dataset
- Linear algebra behind principal component analysis
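
To connect the Week 3 topics, here is a hedged sketch that reduces the digits dataset to two dimensions with scikit-learn's `PCA` and then reproduces the result with a singular value decomposition of the centered data. Using `n_components=2` and `svd_solver="full"` is an assumption made so the two computations agree up to rounding.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()
X = digits.data  # each row is an 8x8 image flattened to 64 features

# PCA with scikit-learn: project onto the first two principal components.
pca = PCA(n_components=2, svd_solver="full")
X_2d = pca.fit_transform(X)

# The linear algebra behind it: center the data, then take the SVD
# X_centered = U @ diag(S) @ Vt. The rows of Vt are the principal components.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = Vt[:2]
projected = X_centered @ components.T

# Variance explained by each component is S**2 / (n_samples - 1).
explained_variance = S[:2] ** 2 / (len(X) - 1)

# Each component is only defined up to sign, so compare absolute values.
print(np.allclose(np.abs(projected), np.abs(X_2d), atol=1e-6))
print(explained_variance, pca.explained_variance_)
```

For visualization, the two columns of `X_2d` can be passed to a matplotlib scatter plot colored by `digits.target`.
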
Week 4 - Friday October 28 - 1-2pm - LSK 121 [Notes]
- PCA revisited
  - Visualizing principal components
- Unsupervised learning
  - Clustering with K-means
  - Digits dataset: How many different kinds of 1s are there?
  - Combining KMeans with PCA
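
A possible way to explore the question about different kinds of 1s is sketched below: select the images labeled 1, cluster them with `KMeans`, and repeat the clustering after a PCA projection. The number of clusters (3) and the number of components (2) are assumptions picked for illustration; the session notes may use other values.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

digits = load_digits()
ones = digits.data[digits.target == 1]  # only the images labeled 1

# Cluster the 64-dimensional images directly.
n_clusters = 3  # assumed for this sketch
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
labels = kmeans.fit_predict(ones)
cluster_sizes = np.bincount(labels, minlength=n_clusters)

# Each cluster center is itself a 64-vector, so it can be viewed as an
# 8x8 "average image" of one style of writing the digit 1.
centers_as_images = kmeans.cluster_centers_.reshape(n_clusters, 8, 8)

# Combining KMeans with PCA: reduce to 2 dimensions first, then cluster.
ones_2d = PCA(n_components=2).fit_transform(ones)
labels_2d = KMeans(n_clusters=n_clusters, n_init=10,
                   random_state=0).fit_predict(ones_2d)
print(cluster_sizes, np.bincount(labels_2d, minlength=n_clusters))
```

Displaying `centers_as_images` with matplotlib's `imshow` is one way to judge whether the clusters correspond to visibly different handwriting styles.
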
Week 5 - Friday November 4 - 1-2pm - LSK 121 [Notes]
- Kernel density estimation and Gaussian processes - Presented by @sempwn
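
The presentation materials are not reproduced here, so the following is only a minimal sketch of the two techniques using scikit-learn's `KernelDensity` and `GaussianProcessRegressor`. The synthetic data, the bandwidth of 0.3 and the RBF plus white-noise kernel are all assumptions made for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.neighbors import KernelDensity

rng = np.random.RandomState(0)

# --- Kernel density estimation on synthetic 1D samples (two Gaussian bumps) ---
samples = np.concatenate([rng.normal(-2.0, 0.5, 200),
                          rng.normal(1.0, 1.0, 300)])[:, np.newaxis]
kde = KernelDensity(kernel="gaussian", bandwidth=0.3).fit(samples)

grid = np.linspace(-5, 5, 201)[:, np.newaxis]
density = np.exp(kde.score_samples(grid))  # score_samples returns log-density

# A density should integrate to about 1 (simple Riemann sum over the grid).
area = density.sum() * (grid[1, 0] - grid[0, 0])
print(area)

# --- Gaussian process regression on noisy samples of sin(x) ---
X_train = np.linspace(0, 10, 20)[:, np.newaxis]
y_train = np.sin(X_train).ravel() + rng.normal(0, 0.1, 20)

kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel, random_state=0).fit(X_train, y_train)

# The prediction comes with a standard deviation at every query point.
X_new = np.linspace(0, 10, 50)[:, np.newaxis]
mean, std = gp.predict(X_new, return_std=True)
print(mean[:3], std[:3])
```

The bandwidth plays the same role for the density estimate that the kernel hyperparameters play for the Gaussian process: both control how smooth the fitted function is.
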
Remembrance Day - No meeting November 11
Week 6 - Friday November 18 - 1-2pm - UCLL 109
- Natural Language Processing with nltk: Movie Review Classification - Presented by @dbhaskar92
Week 7 - Friday November 25 - 1-2pm - UCLL 109 [Notes]
- Natural Language Processing with nltk: Movie Review Classification (Continued)
  - Working with nltk movie review dataset
  - Using regular expressions to remove punctuation and stopwords
  - Creating feature vectors from movie reviews
  - Applying a Naive Bayes classifier
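
The pipeline listed above can be sketched without downloading the nltk corpus. The example below is an illustration under assumptions: the six reviews and the stopword set are made up, and scikit-learn's `MultinomialNB` with `DictVectorizer` is swapped in for the nltk classifier used in the session.

```python
import re
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical stopword list and toy reviews, standing in for nltk's data.
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "i", "to"}
reviews = [
    ("This movie was great, and the acting is wonderful!", "pos"),
    ("A wonderful, moving story. I loved it!", "pos"),
    ("Great cast and a great script.", "pos"),
    ("This film was terrible; the plot is boring.", "neg"),
    ("Boring, slow, and a waste of time...", "neg"),
    ("I hated it. Terrible acting!", "neg"),
]

def tokenize(text):
    """Lowercase, replace punctuation with spaces, drop stopwords."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return [word for word in text.split() if word not in STOPWORDS]

def features(text):
    """Turn a review into a dictionary of word counts."""
    counts = {}
    for word in tokenize(text):
        counts[word] = counts.get(word, 0) + 1
    return counts

# DictVectorizer maps the count dictionaries to a sparse feature matrix.
vectorizer = DictVectorizer()
X = vectorizer.fit_transform([features(text) for text, _ in reviews])
y = [label for _, label in reviews]

clf = MultinomialNB().fit(X, y)

new_reviews = ["A great and wonderful film", "Terrible, boring script"]
predicted = clf.predict(vectorizer.transform([features(t) for t in new_reviews]))
print(list(predicted))
```

Words that never appeared in the training reviews are ignored by `vectorizer.transform`, so the prediction for a new review depends only on the vocabulary seen during fitting.
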

Repository files:
- 2016-10-07-notes.ipynb
- 2016-10-14-notes.ipynb
- 2016-10-21-notes.ipynb
- 2016-10-28-notes.ipynb
- 2016-11-04-notes.ipynb
- 2016-11-25-notes.ipynb
- README.md
