<<< Previous | Next >>>

Representing Data

Data scientists often use visualizations to represent data. Visualizations attempt to show patterns in data by rendering data in a visual medium. Some of the most common types of visualizations include:

  • Bar Chart: Uses a set of bars to represent numeric or count data. A common use for bar charts is to represent counts of categorical data. For example, a bar chart might show how many times cats, dogs, or birds occur in a dataset where each entity is an animal. The counts for each category can be compared by comparing the lengths of the bars.
  • Pie Chart: Pie charts are used for many of the same purposes as bar charts. They show the proportion of the whole that each category in a categorical dataset makes up. If our dataset were 60% dogs, 30% cats, and 10% birds, a pie chart would represent each of these proportions as a slice of a circle.
  • Line Chart: A line chart shows change over a set of ordered values, most frequently time. For example, a line chart might show how adoption rates of cats changed over the course of a year.
  • Scatterplot: A scatterplot shows how two sets of numeric data co-occur. Scatterplots are useful for representing an association between two numeric variables. For example, we might use a scatterplot to show an association between outdoor temperature and ice cream sales.
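
The information that a bar chart or pie chart carries can also be stated as plain numbers. The following is a minimal sketch, assuming a small made-up list of animals that matches the 60% dogs, 30% cats, and 10% birds example above; the `animals` list and the variable names are hypothetical and are not part of the workshop data.

```python
from collections import Counter

# Hypothetical dataset: each entry is one animal (invented for illustration).
animals = ["dog"] * 6 + ["cat"] * 3 + ["bird"] * 1

# What a bar chart would show: one count per category.
counts = Counter(animals)

# What a pie chart would show: each category's share of the whole.
total = sum(counts.values())
proportions = {animal: count / total for animal, count in counts.items()}

# Print the categories from most to least common, with count and share.
for animal, count in counts.most_common():
    print(f"{animal}: {count} ({proportions[animal]:.0%})")
```

Listing the categories from most to least common plays the role that comparing bar lengths or slice sizes plays in the visual versions.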

Blind and visually impaired individuals can access visualizations in a number of ways:

  • Tactile graphics represent data through raised surfaces that can be felt manually.
  • Sonifications present data through changes in qualities of sound, such as pitch, over time.
  • Descriptions or alternative text present data in natural language.

While these approaches are rich ways to access visualizations as directly as possible, they are not always available. In this workshop, we'll focus on ways to summarize data by manipulating it directly in the Python REPL. That is, we'll attempt to have a conversation with our data. This approach is in some ways more limited, and in some ways richer, than other approaches to representing data.
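
As a rough illustration of what a conversation with data might look like, the sketch below asks three questions of a small set of monthly cat adoption counts, the kind of data a line chart might display. The numbers and the `adoptions` dictionary are invented for this example, and the approach shown is only one possible way to summarize a trend in text.

```python
from statistics import mean

# Hypothetical monthly cat adoption counts (invented for illustration).
adoptions = {"Jan": 12, "Feb": 15, "Mar": 14, "Apr": 20, "May": 25, "Jun": 23}

months = list(adoptions)
values = list(adoptions.values())

# Question 1: what does a typical month look like?
average = mean(values)

# Question 2: which months had the fewest and the most adoptions?
lowest = min(adoptions, key=adoptions.get)
highest = max(adoptions, key=adoptions.get)

# Question 3: how did each month change from the month before?
changes = [later - earlier for earlier, later in zip(values, values[1:])]

print(f"Average per month: {average:.1f}")
print(f"Lowest: {lowest} ({adoptions[lowest]}), highest: {highest} ({adoptions[highest]})")
for month, change in zip(months[1:], changes):
    print(f"{month}: {change:+d}")
```

Each question can be typed into the REPL on its own, so the next question can depend on the previous answer.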

In the following sections, we'll learn about ways of representing data that do not involve visualization. We'll also learn about some ways to explicitly replace specific visualizations, such as bar or line charts, in our analyses.

<<< Previous | Next >>>