Winter 2019-2020 Data Science Bootcamp

Day 3 Introduction to Text Mining in Python

This interactive course covers the basics of text mining using Python. Unstructured text (e.g. news) contains a vast amount of information but can be overwhelming to process. Text mining techniques can help you navigate, organize, and discover insights in unstructured text data. After the course, you should understand basic text manipulation in Python, standard pre-processing methods for English text, common transformations of unstructured text into quantitative data, and intuitive statistics that help you navigate a corpus.
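To give a flavor of what "transforming unstructured text into quantitative data" can look like, here is a minimal sketch that uses only the Python standard library. It is not taken from the course materials: the sample sentences, the tiny `STOP_WORDS` set, and the helper names `tokenize` and `bag_of_words` are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "of", "and", "is", "in", "to"}


def tokenize(text):
    """Lowercase the text and extract runs of letters/apostrophes as tokens."""
    return re.findall(r"[a-z']+", text.lower())


def bag_of_words(text):
    """Count how often each non-stop-word token occurs in the text."""
    return Counter(t for t in tokenize(text) if t not in STOP_WORDS)


# Made-up two-document corpus.
docs = [
    "The market rallied as the economy improved.",
    "The economy slowed and the market fell.",
]

counts = [bag_of_words(d) for d in docs]

# Shared vocabulary across the corpus, in alphabetical order.
vocab = sorted(set().union(*counts))

# Document-term matrix: one row per document, one column per vocabulary word.
matrix = [[c[word] for word in vocab] for c in counts]

print(vocab)
for row in matrix:
    print(row)
```

Each row of `matrix` is a numeric representation of one document, which is the kind of input that simple corpus statistics can be computed from. In practice a library tokenizer and a fuller stop-word list would likely replace the hand-rolled pieces shown here.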

  • When: Friday, January 17th, 2020
  • Where: Room 903 SSW
  • Instructor info:

Mandatory pre-assignment:

  1. Software setup: In this course we'll be using Anaconda to manage dependencies and Jupyter Notebooks to run code. Please follow these instructions to ensure you have the correct setup.

  2. If you haven't used Python before, I recommend starting with the tutorials on learnpython.org up to the regular expression lesson, then moving on to DataCamp's free Introduction to Python course for more practice.

Schedule

9:00am - 10:00am: Morning coffee

10:00am - 12:00pm Lecture + Lab:

12:00pm - 1:00pm Lunch on own

1:00pm - 1:45pm Lecture:

1:45pm - 2:30pm Lab

2:30pm - 3:00pm Break

Optional lectures or lab time:

3:00pm - 4:00pm

Prerequisites

  • Basics of statistics (mean, variance, t-test, etc.)
  • Basic programming skills in Python
  • Basic understanding of data structures (data frames)