These materials constitute the textbook for NICO 101 - Introduction to Programming for Big Data. This course will teach you the basics of programming in Python, visualizing data, and web-scraping as well as analyzing unstructured text, structured data, and images.
This course does not explicitly use any 'big' datasets during the quarter. Instead, it teaches you the fundamentals of programming and analysis, which can then be scaled to data of any size. As part of this, we will discuss the basics of statistical analysis and how it can be applied to datasets.
Any comments, questions, or concerns can be directed to:
- Luis A.N. Amaral amaral@northwestern.edu
- Adam R. Pah adamrpah@gmail.com
This bootcamp uses the Anaconda Python 3.8 distribution (Important! Install the Python 3.8 distribution, which is the right-hand option for each operating system).
There are videos to help you understand the installation process; however, it is a simple installer package that should be similar to any other program (so don't be afraid!).
- OS X https://www.youtube.com/watch?v=UQhOyZXHkxI
- Windows https://www.youtube.com/watch?v=w16iUU6IA5E
You must have Anaconda Python 3.8 installed before the first day of class.
We also require that you have a relatively modern operating system.
For Windows, you must be using Windows 7 or later.
For Mac, you must be using OS X 10.9 or later.
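Once Anaconda is installed, one way to confirm that the interpreter is recent enough is a short check like the sketch below. It assumes the Python 3.8 requirement stated above; the names `REQUIRED` and `meets_requirement` are hypothetical helpers made up for this illustration, not part of the course materials.

```python
import sys

# Assumption: the course expects Python 3.8, as stated in the install notes.
REQUIRED = (3, 8)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True when the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= tuple(required)


if __name__ == "__main__":
    # sys.version starts with the version number, e.g. "3.8.5 ..."
    current = sys.version.split()[0]
    if meets_requirement():
        print("Python", current, "looks fine for this course.")
    else:
        print("Python", current, "is older than the expected 3.8.")
```

Running this with the Python that Anaconda installed (for example, from a terminal or a notebook cell) should print which version is in use.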
The course materials can be downloaded from the repository's GitHub page.
Just download the zip file, unzip it onto your Desktop, and rename the directory NICO-101.
This text and the majority of the course will be conducted with Jupyter Notebook http://jupyter.org. Jupyter Notebook is a 'web-based interactive computational environment', meaning that it allows you to write and execute Python code in a web page on your own computer. Jupyter Notebook is a relatively new tool, and we believe that it is an excellent way to teach the basics of Python programming and computational data analysis.
Jupyter Notebook is installed by default with the Anaconda Python distribution and can be launched from the Anaconda Navigator program. We have an introductory video that details how to launch and use Jupyter Notebook.
This course has been built through the efforts of the many people who have served as teaching assistants and lecturers over its many iterations. We would like to thank (in alphabetical order):
- Alessandro Febretti
- Justin Finkle
- Adam Hockenberry
- Hyojun Lee
- Jeff Lunt
- Jackie Milhans
- Joao Moreira
- Aaron Oppenheimer
- Nick Timkovich
- Max Wasserman
- Peter Winter
- Jia Wu