Pratt Institute, Center for Continuing and Professional Studies Spatial Analysis and Visualization Initiative (SAVI)
Instructor: Neil Freeman
Location: ISC Building, Lower Level, Room 003
Continuing Education Units (C.E.U.s): 3.0
- Course Overview
- Course Requirements
- Course Readings
- Class Format
- Submitting Assignments
- Course Outline
- Resources
This course introduces the tools, techniques, and general approaches used to acquire, clean, analyze, and visualize open data, with particular emphasis on using web-based technologies and open-source tools at each step of the process.
- You will learn to formulate and articulate a meaningful research question with public open data, as well as meaningfully critique the work of others
- You will learn how to acquire data through open data portals, application programming interfaces (APIs), and web scraping
- You will learn how to clean data using open source tools in preparation for analysis
- You will learn how to conduct exploratory data analysis using descriptive statistics
- You will learn to visualize your analytical findings in meaningful and visually engaging graphics, as well as to critique the work of others
- You will learn the basics of cartographic design as it relates to visualizing open data
All students will need to bring their own laptop for exercises during class. Time will be set aside to help install, configure, and run the programs necessary for all assignments, projects, and exercises. Where possible, all programs will be free and open-source. All assigned work using services hosted online can be run using free accounts. Please update your system to the latest version of your preferred operating system prior to the first day of class to ensure you're able to successfully install and use the tools in class.
You will be required to have a free account with CARTO.
In addition, please install the text editor of your choice. Some (free) suggestions:
- Sublime Text (All systems)
- Atom (All systems)
- Notepad++ (Windows)
The required readings for this course consist of book chapters, newspaper articles, and short blog posts. They are intended to give you a foundation in the critical skills ahead of class lectures. All required readings are available online or will be made available through the class portal. Recommended readings are suggestions for further study of the topics covered in class. The books listed in the Suggested Readings section below offer even more depth and an extended discussion of the material we cover in class.
Class runs from 9:30am to 5:30pm. Each day will consist of 80-to-90-minute blocks broken up by 10-minute breaks and a half-hour break for lunch. Class will be a mix of lecture and exercise work, emphasizing the application of skills covered in the lecture portion of the class. You will have ample time in class to work on practical exercises based on the information presented in lectures.
All assignments will be submitted to lms.pratt.edu. Assignments must be submitted by 9 pm on the Friday before class.
Area | Total Points |
---|---|
class participation | 25 |
weekly critiques | 25 |
weekly projects | 25 |
final project | 25 |
Total | 100 |
Regular, prompt attendance is required.
Your engagement makes class sessions richer and more fulfilling for everyone. Questions are encouraged, and active participation in class discussion and in-class exercises is very important.
Topics are covered in class on the day listed. Readings are to be completed before class in preparation for the lecture and exercises. Assignments are due before the start of the next class and build on the information presented in class.
Find an interesting or visually compelling map (interactive or static) or visualization online and write 2-3 paragraphs on the visualization, discussing the data source(s), the visual style, the creator's goals and audience, and how well the data was represented. Feel free to use the visualization resources listed below. Submit your analysis (include a link to the visualization) to this repository before each class. Come prepared to informally present the project to your classmates.
- Introduction
- Open data
- Introduction to mapping and cartography
- Introduction to CARTO
- Introduction to HTML and CSS
- Introduction to the Unix command line
- Complete the CARTO “Online Mapping for Beginners” course.
- Identify a research question that you would like to explore in this class, with the intention of creating maps and visualizations that will help answer the question or clarify the topic.
- Write a short summary of your topic. What questions would you like to answer? What audience would you like to reach? What data would you like to explore?
- Create a basic CARTO map with one data layer that connects to your topic.
- Embed the map in a basic HTML document with your write-up.
- Write a paragraph describing the dataset: what data it contains, who created it, and how and why they did so. Use information from the dataset's metadata. If that's incomplete, use additional research. Give your best guess if you can't find complete metadata.
- On the same page, include the link to an interesting map or visualization and add your weekly critique.
- Thomas Levine, Introduction to web scraping
- Introduction to APIs, chapters 1-5
- Ben Wellington, "Mapping the Sharing Economy"
- Alex Hern. New York taxi details can be extracted from anonymised data, researchers say, the Guardian, 27 June 2014
- Beth Simone Noveck. Is Open Data the Death of FOIA?, Yale Law Journal, November 21, 2016.
- Jeffrey Heer, Michael Bostock and Vadim Ogievetsky. "A tour through the visualization zoo." Commun. ACM 53.6 (2010): 59-67.
- Manual Web scraping
- Introduction to APIs
- Data types and formats
- Parsing data with csvkit
- Introduction to the Census Factfinder
- Complete the SQL and PostGIS in CARTO course.
- Create a second map, using new data scraped from the web or pulled via an API.
- This map should include data pulled from two sources
- Embed the map in a new HTML document.
- Include a paragraph discussing any challenges you encountered working with the data and/or creating your map in CARTO.
- Weekly critique
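For the API option in this assignment, a common stumbling block is turning a JSON response into a table that can be uploaded to CARTO. The following is a minimal sketch in Python, assuming the response is a list of flat records. The sample payload and its field names are hypothetical placeholders, not output from any real API.

```python
import csv
import io
import json

# Hypothetical API response: a JSON array of flat records (made up for illustration).
SAMPLE_RESPONSE = """
[
  {"station": "A", "lat": 40.69, "lon": -73.96, "riders": 120},
  {"station": "B", "lat": 40.71, "lon": -73.95}
]
"""


def json_records_to_csv(text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(text)
    # Collect every key that appears, in first-seen order, so that
    # records with missing fields still line up in the right columns.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = json_records_to_csv(SAMPLE_RESPONSE)
print(csv_text)
```

Missing fields are written as empty cells, which makes gaps in the source data easy to spot after import. Nested JSON would need to be flattened first; this sketch does not attempt that.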
- Jeremy B. Merrill, Heart of Nerd Darkness: Why Updating Dollars for Docs Was So Difficult
- The Quartz guide to bad data (ok to skim)
- Quick look at TIGER and American Factfinder
- Introduction to Python
- Geocoding
- Introduction to SQL and spatial SQL
- Opening closed data with Tabula
- Work through "The Basics" at Learn Python (you can skip "String Formatting"; if you're feeling good, jump ahead to "List Comprehensions").
- Update your interactive map to include data that you've joined, filtered or modified with an SQL query. Plan a 10-minute presentation explaining the topic your map addresses, the data sources you used, and your methodology.
- Weekly critique
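The join-and-filter step in the map update can be rehearsed locally before writing the query in CARTO. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in. CARTO itself runs on PostgreSQL with PostGIS, so syntax and available functions can differ. The table names, columns, and values are hypothetical.

```python
import sqlite3

# In-memory database with two small hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE neighborhoods (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE complaints (neighborhood_id INTEGER, category TEXT, total INTEGER);
    INSERT INTO neighborhoods VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO complaints VALUES (1, 'noise', 30), (1, 'heat', 5), (2, 'noise', 12);
""")

# Join the two tables on the shared key, keep one category,
# and sort so the largest totals come first.
rows = conn.execute(
    """
    SELECT n.name, c.total
    FROM neighborhoods AS n
    JOIN complaints AS c ON c.neighborhood_id = n.id
    WHERE c.category = ?
    ORDER BY c.total DESC
    """,
    ("noise",),
).fetchall()
conn.close()

print(rows)
```

The same pattern (JOIN on a shared key, WHERE to filter) carries over to a CARTO query, where one of the joined tables would typically also supply the geometry column.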
- The Mapbox Guide to Map Design, pages 1-50
- Paul Cote, Mapping with Aggregated Statistics
- Class presentations
- Python for scraping the web
- Quantitative maps on the web
- Review
- Make any desired revisions to your map. Your final project should be embedded on an HTML page that includes an introduction and description of your topic, as well as a description of your process and methodology.
- Stack Overflow (question-and-answer community for programming)
- GIS Stack Exchange (same as above, for mapping)
- JSON to CSV converter
- Table to TSV bookmarklet (drag to toolbar or "save as bookmark")
- What is the Command Line (series of pages with links to history articles)
- Lifehacker guide to the command line
- Basic Unix commands
- Photos of historic command line interfaces
- Instructions for activating the Linux shell in Windows 10
- For other versions of Windows, install git-bash
- Codecademy Learn to Code for APIs
- Mapbox API Documentation
- transit.land documentation
- OpenWeatherMap API
- NOAA Tides API
- Codecademy Python Course
- MIT Introduction to Computer Science and Programming with Python (free course)
- Learn Python the Hard Way
- U.S. Government open data
- data.ny.gov
- New York City Open Data Portal
- Census TIGER geodata
- UK open data
- Awesome Public Datasets
- Kirk Borne's list of open data sources
- NYPL Space/Time Directory
- data.sfgov.org
- data.cityofchicago.org
- data.cityofboston.gov
- data.seattle.gov
- data.kcmo.org
- data.lexingtonky.gov
- Carto Academy
- Elements of Cartographic Style by Paul Cote
- Flowing Data
- Census Data Visualization Gallery
- IQuantNY
- bl.ocks
- ProPublica News Apps
- Interactive maps collected by the NYPL
- Source (blog of OpenNews)
- The Displacement Alert Project Map
- 80s.nyc
- Land Cover Trends Field Photo Map
- Where Are The Jobs?
- Where Pedestrians and Bicyclists Are Injured, and Why
- Hubcap
- NYC Transit Explorer
- OnTheMap
- Spies in the Skies (related article)
- Music Fandom
- The Best and Worst New York Neighborhoods...
- A Census Time Machine: Sioux Falls Is the Past, Staten Island the Present, Las Vegas the Future
- White Collar Crime Prediction Map
- Who Wins and Who Loses Under Republicans’ Health Care Plan
- Fry, Ben. Visualizing Data: Exploring and Explaining Data with the Processing Environment. O'Reilly Media, Inc., 2007.
- Garrard, Chris. Geoprocessing with Python. Manning Publications Co., forthcoming.
- Janert, Philipp K. Data analysis with open source tools. O'Reilly Media, Inc., 2010.
- McCallum, Q. Ethan. Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work. O'Reilly Media, Inc., 2012.
- Munzner, Tamara. Visualization Analysis and Design. AK Peters, 2014.
- Murray, Scott. Interactive data visualization for the Web. O'Reilly Media, Inc., 2013.
- Tufte, Edward R. The visual display of quantitative information. Cheshire, CT: Graphics Press, 1983.
This course builds from material prepared by Richard Dunks under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.