Homebrew is a Mac package manager for packages, libraries, and similar tools that works alongside Apple's installers. We need it to install git. Install it with this one-liner:
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go/install)"
If you receive an SSL certificate error, try:
ruby -e "$(curl --insecure -fsSL https://raw.github.com/mxcl/homebrew/go/install)"
- Macs:
brew install git
- Windows: Install git bash http://openhatch.org/missions/windows-setup/install-git-bash
- The default options will probably work well for you
- Linux: Install git with your distribution's package manager. On Ubuntu you can use
sudo apt-get install git
On other distributions, look up the install instructions for your distribution.
Note: If brew install fails because of an Xcode error, try the Heroku Toolbelt installer, which includes git, or choose an OS-specific installation from this guide: http://git-scm.com/book/en/Getting-Started-Installing-Git
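Whichever installation route you take, it can help to confirm that the tools ended up on your PATH before moving on. The following is a minimal sketch; check_tool is a hypothetical helper written for this guide, not part of Homebrew or git.

```shell
# check_tool is a hypothetical helper, not part of Homebrew or git.
# It reports where a command lives, or returns 1 if it is not on the PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found"
    return 1
  fi
}

# brew only matters on a Mac; git is needed on every platform.
for tool in brew git; do
  check_tool "$tool" || true
done
```

If git is reported as not found, revisit the install step for your operating system before continuing.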
Once you've set up git and GitHub, clone your fork of the class repository. We'll be using the Fork and Pull git model: you push changes to your forked repository and submit pull requests to the class repository.
From the GitHub help page:
The Fork & Pull Model lets anyone fork an existing repository and push changes to their personal fork without requiring access be granted to the source repository. The changes must then be pulled into the source repository by the project maintainer.
For example (substitute your own GitHub username for gavinmh to clone your fork):
cd ~/; git clone git@github.com:gavinmh/GADS12-NYC.git
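To see how the Fork and Pull model fits together, it may help to rehearse it with local stand-ins before touching GitHub. The sketch below is an illustration under stated assumptions: class.git, fork.git, and the scratch directory are made-up local paths, and naming the second remote upstream is a common convention rather than something this guide requires.

```shell
# Local stand-ins for GitHub: class.git plays the class repository and
# fork.git plays your personal fork. All paths are throwaway assumptions.
work="$(mktemp -d)"
git init -q --bare "$work/class.git"
git clone -q --bare "$work/class.git" "$work/fork.git" 2>/dev/null

# Clone the fork, as you would clone your fork from GitHub.
git clone -q "$work/fork.git" "$work/local" 2>/dev/null
cd "$work/local"

# origin points at the fork (where you push); upstream points at the
# class repository (where pull requests are merged).
git remote add upstream "$work/class.git"
git remote -v
```

With both remotes configured, pushes go to origin, while the class repository is only changed when a maintainer merges a pull request.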
The easiest way to install Python is Anaconda's distribution. Download the installer for your operating system and run it.
Note to Engineers: If you prefer not to have Anaconda's distribution as your primary Python, comment out the PATH line for Anaconda in ~/.bash_profile and add aliases for Anaconda's python, ipython, and conda package manager:
alias apython="~/anaconda/bin/python"
alias ipython="~/anaconda/bin/ipython"
alias conda="~/anaconda/bin/conda"
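Commenting out the PATH line can be done by hand in an editor, or with sed. The sketch below works on a throwaway sample file so your real ~/.bash_profile is untouched; the exact text of the PATH line is an assumption, so check what the Anaconda installer actually wrote on your machine before adapting it.

```shell
# Build a sample profile; the PATH line shown here is an assumed format.
profile="$(mktemp)"
printf '%s\n' 'export PATH="$HOME/anaconda/bin:$PATH"' 'export EDITOR=vim' > "$profile"

# Prefix any "export PATH=" line that mentions anaconda with "# ".
sed 's|^\(export PATH=.*anaconda.*\)$|# \1|' "$profile" > "$profile.new"

# Show the result: the PATH line is commented out, other lines are unchanged.
cat "$profile.new"
```

After editing the real file, open a new terminal so the change takes effect.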
For visualizations we'll primarily use matplotlib and yhat's version of ggplot for python:
conda install -c https://conda.binstar.org/public ggplot
Users experiencing ggplot package errors should try pip (this problem was observed on Ubuntu and Windows):
pip install ggplot
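After installing, a quick import check shows whether the plotting packages are visible to your Python. This is a minimal sketch: check_module and the PYTHON variable are hypothetical names introduced here, and PYTHON defaults to whatever python is first on your PATH.

```shell
# PYTHON is an assumed override; point it at ~/anaconda/bin/python if you
# kept Anaconda off your PATH and use the aliases instead.
PYTHON="${PYTHON:-python}"

# check_module is a hypothetical helper: it tries to import one module
# and reports whether the import succeeded.
check_module() {
  if "$PYTHON" -c "import $1" >/dev/null 2>&1; then
    echo "$1: ok"
  else
    echo "$1: missing"
  fi
}

for module in matplotlib ggplot; do
  check_module "$module"
done
```

A "missing" result for ggplot is the cue to try the pip install described above.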
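Go to the directory GADS12-NYC-Students/lab_submissions/lab01 (create it if it does not exist) and make a directory named with your first and last name. The open command at the end is macOS-only and opens the new directory in Finder:
DIR='firstname_lastname'; cd ~/GADS12-NYC-Students/lab_submissions/lab01 && mkdir "$DIR" && open "$DIR"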
With a text or markdown editor, create and save a markdown file with the following content:
- Your name and what you do
- A one-liner about your coding and math background
- Any social media profiles you use and don't mind sharing (a Twitter link, for example)
- A data blog post you read recently that you'd like to share with the class
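If you would rather start from a template than a blank file, the content above can be sketched from the command line. The file name about_me.md and the dest variable are assumptions made for this example; the guide does not prescribe a file name.

```shell
# dest defaults to a scratch directory; point it at your
# firstname_lastname directory instead when doing this for real.
dest="${dest:-$(mktemp -d)}"
file="$dest/about_me.md"

# Write a placeholder markdown file with one bullet per requested item.
cat > "$file" <<'EOF'
# Firstname Lastname

- What I do: ...
- Coding and math background: ...
- Social: ...
- Data blog post: ...
EOF

cat "$file"
```

Replace each "..." with your own details in a text or markdown editor before committing.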
Create a branch of the repository with a unique name, commit your work to that branch, and push it to your fork:
git checkout -b my_name_class_1
git add .
git commit -m 'my first git commit!'
git push origin my_name_class_1
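If you are new to git, you can rehearse the branch-and-commit steps in a throwaway repository first. The sketch below is a dry run under assumptions: the scratch repository, the student@example.com identity, and the about_me.md file name are all made up for illustration, and the push step is left out because the scratch repository has no remote.

```shell
# Create a scratch repository so nothing here touches the class repo.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "student@example.com"   # placeholder identity
git config user.name "Student"

# An initial commit stands in for the history you get from cloning.
echo "# class repo" > README.md
git add .
git commit -q -m 'initial commit'

# Same steps as above: branch, add, commit.
git checkout -q -b my_name_class_1
mkdir -p lab_submissions/lab01/firstname_lastname
echo "# Firstname Lastname" > lab_submissions/lab01/firstname_lastname/about_me.md
git add .
git commit -q -m 'my first git commit!'

# Confirm which branch you are on and what was committed.
git rev-parse --abbrev-ref HEAD
git log --oneline
```

In your real clone, the same check (current branch plus recent log) is a useful habit before pushing.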
Open a pull request. This is the actual submission of your work. You can do this on GitHub by finding your branch and clicking "Create pull request." Developers, feel free to use a command-line tool for this if you prefer.
For more detail, see the GitHub documentation on the Fork and Pull git model.
We will always recommend four or five readings or other support materials for every class, whether to supplement the current material, prepare for the next class, or cover previous material that students still have questions about.
Readings and Other Materials
- Quora: What is it like to be a data scientist?
- Josh Wills: The Life of a Data Scientist
- P-Value.info, Carl Anderson's blog (Director of Data at Warby Parker)
- A look at OKCupid and their detailed work in trends
- DJ Patil, Building Data Science Teams
- Ben Fry's Dissertation: Computational Information Design