Merge pull request #43 from LaunchCodeEducation/tableau-2
Tableau Part 2
gildedgardenia authored Jun 18, 2024
2 parents e6fd0db + 44df6ee commit c644548
Showing 9 changed files with 579 additions and 0 deletions.
48 changes: 48 additions & 0 deletions content/tableau-part-2/_index.md
@@ -0,0 +1,48 @@
+++
pre = "<b>25. </b>"
chapter = true
title = "Tableau Part 2: Data Preparation"
date = 2024-05-13T11:44:50-05:00
draft = false
weight = 25
+++

## Learning Objectives

Upon completing all the content in this lesson, you should be able to do the following:

1. Use Tableau's filtering and sorting features to improve visualizations.
1. Use Tableau to arrange data in a custom hierarchical structure.
1. Use groups and sets to organize data for visualizations.

## Key Terminology

Here is a list of key terms for this chapter broken down by the page the term first appears on. Make note of each term and its definition.

### Filtering and Sorting

1. query pipeline
1. extract
1. extract filter
1. data source
1. data source filter
1. context filter
1. dimension filter
1. measure filter
1. table calculation filter

### Hierarchies

1. hierarchy

### Groups and Sets

1. group
1. set
1. dynamic set
1. fixed set
1. In/Out

## Content Links

{{% children %}}
68 changes: 68 additions & 0 deletions content/tableau-part-2/exercises/_index.md
@@ -0,0 +1,68 @@
+++
title = "Exercises"
date = 2021-10-01T09:28:27-05:00
draft = false
weight = 2
+++

## Getting Started

1. Download this [Hotel Data Set](https://www.kaggle.com/jessemostipak/hotel-booking-demand).
1. Open the downloaded dataset in Tableau Public.
1. Create a new Tableau Public project to answer the below questions.
1. You should have **8 worksheets** in your project by the time you complete the exercises.
1. For more context and information about the data collected, check out this [article about the data](https://www.sciencedirect.com/science/article/pii/S2352340918315191).

## Part A: Hierarchy

1. What is the total number of adult hotel bookings according to the Reservation Status Date dimension?

1. Drill down to the Months level.

1. What is the average daily rate by customer type for booking hotels compared to the arrival day of the month, week number, and year?

1. Create a hierarchy using:

1. Arrival Day of the Month.
1. Arrival Date of the Week Number.
1. Arrival Date Year Measures.

## Part B: Filtering

1. How many total adults and children booked hotel rooms between 2015-2017?

1. Create a filter for “Arrival Date Year” using either the DD or the Filter card.

1. What countries had a total of 1,000 adult hotel bookings in 2016?

1. Hint: set the conditions of your filters.

## Part C: Grouping

1. What months were the most popular for adult hotel bookings only in South America?

1. Create a group of South American countries and place the group on the shelf.

1. Which country in South America had the highest number of adult hotel bookings total?

## Part D: Sets

1. What countries have hotel bookings that occurred within 10 days or less of arrival?

1. Hint: Create a conditional.

1. You can do this by filtering your set.
1. You should see the options: “General”, “Conditional”, and “Top”.
1. Select “Conditional”, by field and then select the desired field and the operator and the value.

1. Student Choice: Create a hierarchy of sets to explore the ADR of a country you would like to visit.

1. Start with the continent.
1. Then a region.
1. Then the country.

## Submitting Your Work

When finished, make sure to save and publish your work to your Tableau Public account. Copy the URL to your published Tableau project, paste it into the submission box in Canvas for **Exercises: Visualization with Tableau Part 2**, and click *Submit*.

13 changes: 13 additions & 0 deletions content/tableau-part-2/next-steps.md
@@ -0,0 +1,13 @@
+++
title = "Next Steps"
date = 2021-10-01T09:28:27-05:00
draft = false
weight = 4
+++

You are ready to dive deeper into Tableau in the next chapter! If there is something you want additional reinforcement on, check out our favorite additional resources:

1. [Filtering Data from Your Views](https://help.tableau.com/current/pro/desktop/en-us/filtering.htm)
1. [Create Hierarchies](https://help.tableau.com/current/pro/desktop/en-us/qs_hierarchies.htm)
1. [Group Your Data](https://help.tableau.com/current/pro/desktop/en-us/sortgroup_groups_creating.htm)
1. [Create Sets](https://help.tableau.com/current/pro/desktop/en-us/sortgroup_sets_create.htm)
10 changes: 10 additions & 0 deletions content/tableau-part-2/reading/_index.md
@@ -0,0 +1,10 @@
+++
title = "Reading"
date = 2024-05-13T11:44:50-05:00
draft = false
weight = 1
+++

## Reading Content

{{% children %}}
121 changes: 121 additions & 0 deletions content/tableau-part-2/reading/filtering/_index.md
@@ -0,0 +1,121 @@
+++
title = "Filtering and Sorting"
date = 2021-10-01T09:28:27-05:00
draft = false
weight = 1
+++

Previously, we talked about [keeping it simple]({{% relref "../../../data-visualization/reading/viz-best-practices" %}}) in Chapter 17 when we introduced data visualization best practices. Because we are using Tableau to put together our dashboards and stories as part of presenting our findings, we want to make sure that we are following best practices and only displaying the data we really need to. This is where data preparation comes in. Tableau has a number of features that we will explore throughout this chapter to keep our visualizations clean and simple.

Throughout the previous chapters on cleaning data, we talked about removing unnecessary data. Filtering, however, is for when we want to keep the data but there is too much of it on the visualization. Tableau gives us a number of different ways to filter our data, but for our filters to work, we have to pay attention to the type of filter we are using and the order in which Tableau applies these filters.

## Tableau's Order of Operations

Tableau follows an order of operations, also known as the **query pipeline**. The query pipeline dictates the order in which filters are applied and if you do not follow these rules, your filters may not work as expected! Here is the order in which different Tableau filters are run:

1. Extract filters
1. Data source filters
1. Context filters
1. Dimension filters
1. Measure filters
1. Table calculation filters

Within each of the categories in the query pipeline, there are subcategories, so you may find this diagram helpful as you move through this and the following chapters.

![Diagram of Tableau's query pipeline](./pictures/tableau-query-pipeline.png)
*Image courtesy of [Tableau](https://help.tableau.com/current/pro/desktop/en-us/order_of_operations.htm)*
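
To make the ordering concrete, here is a minimal Python sketch that treats each stage of the query pipeline as a function applied in sequence. It is only an analogy and not how Tableau is implemented: the rows, field names, and filter conditions are all invented for illustration, the measure filter is simplified to a row-level check, and table calculation filters are left out because they are covered in a later chapter.

```python
# Hypothetical earring sales rows; every field name and value is invented.
sales = [
    {"item": "Gold Hoop", "category": "hoops", "months_ago": 2, "price": 80},
    {"item": "Pearl Stud", "category": "studs", "months_ago": 1, "price": 60},
    {"item": "Silver Stud", "category": "studs", "months_ago": 9, "price": 40},
    {"item": "Ruby Drop", "category": "drops", "months_ago": 3, "price": 150},
]

# Each stage is (name, predicate), listed in the order of the query pipeline.
pipeline = [
    ("extract filter", lambda r: r["months_ago"] <= 6),
    ("data source filter", lambda r: r["category"] != "drops"),
    ("context filter", lambda r: True),  # no context filter in this example
    ("dimension filter", lambda r: r["category"] == "studs"),
    ("measure filter", lambda r: 50 <= r["price"] <= 100),
]


def run_pipeline(rows, stages):
    """Apply each stage in order and record how many rows survive it."""
    counts = []
    for name, keep in stages:
        rows = [r for r in rows if keep(r)]
        counts.append((name, len(rows)))
    return rows, counts


result, counts = run_pipeline(sales, pipeline)
for name, remaining in counts:
    print(f"{name}: {remaining} rows remain")
```

Each later stage only ever sees the rows that earlier stages let through, which is why a filter placed late in the pipeline cannot bring back data that an earlier filter removed.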

Let's review what each of these categories means.

An **extract filter** is a filter applied to the **extract** of the data source, or where the data originally comes from. If you work for an online retailer that specializes in jewelry and want to analyze earrings sales for the past 6 months, you may start by pulling the data into Tableau from SQL Server. However, if you already know that you only need the data from the past 6 months, you may apply an extract filter to ensure that only the data from the past 6 months is brought into Tableau.

Once you load the data into Tableau, the data is known as the **data source**. A **data source filter** is a filter applied to the data source before a visualization needs to be made. You will find it very helpful when visualizing data to first review your data source and think hard about what you do and do not need. In the case of earrings sales, you might realize that the actual dimensions of the earrings are not as important as the category so you can apply a data source filter before you begin working on your visualizations.

A **context filter** and a **dimension filter** both do similar things, so this is where the order of operations becomes vital! The context filter comes first in the order of operations: Tableau applies it *before* the other filters on the worksheet, and a dimension filter is then computed *after* it, using only the rows that passed the context filter. Because of this, you may find a context filter handy if your view is taking a long time to load. If we have only a few thousand earrings sales to visualize, you may not notice a difference, but a few million can bog Tableau down. Both filters limit which rows of the dataset reach the visualization. As we dive into the visualizations, we might find it unhelpful to show every item name because some of the names are long and do not look nice when we assemble our visualizations. This would be a perfect use case for a dimension filter, because the data is already loaded and, upon assembling visualizations, we have discovered that we do not need every value in that dimension. You may not see context filters as often as dimension filters.
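
One situation where the position of a filter changes the result is a Top N filter. In Tableau's documented order of operations, Top N filters are evaluated after context filters but before ordinary dimension filters. The sketch below imitates that behavior in plain Python. The item names and sales totals are invented, and the two helper functions are hypothetical stand-ins for the filters, not Tableau features.

```python
# Hypothetical per-item sales totals; names and numbers are invented.
items = [
    {"item": "Gold Hoop", "category": "hoops", "sales": 900},
    {"item": "Silver Hoop", "category": "hoops", "sales": 700},
    {"item": "Pearl Stud", "category": "studs", "sales": 500},
    {"item": "Onyx Stud", "category": "studs", "sales": 300},
]


def top_n(rows, n):
    """Keep the n rows with the highest sales (stand-in for a Top N filter)."""
    return sorted(rows, key=lambda r: r["sales"], reverse=True)[:n]


def only_studs(rows):
    """Keep only rows in the 'studs' category (stand-in for a category filter)."""
    return [r for r in rows if r["category"] == "studs"]


# Category filter left as an ordinary dimension filter: Top 2 runs first.
as_dimension_filter = only_studs(top_n(items, 2))

# Category filter promoted to a context filter: it runs before Top 2.
as_context_filter = top_n(only_studs(items), 2)

print([r["item"] for r in as_dimension_filter])  # []
print([r["item"] for r in as_context_filter])    # ['Pearl Stud', 'Onyx Stud']
```

In the first case the top two items overall are both hoops, so nothing is left once the category filter runs. In the second case the category filter sets the context, and the top two are chosen from the studs only.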

**Measure filters** remove specific cells that don't match a given condition. In the case of analyzing earring sales, you may want to perform some visualizations based on the price of the earrings sold. You can use a measure filter to only visualize earrings that are priced between $50 and $100.

Finally, we have **table calculation filters**. We will be covering table calculations in a later chapter, so for now, you just need to know that table calculations allow you to convert values in a table to suit your needs.

{{% notice blue Note %}}

You may not need all of these filter types immediately, but we want to drive home the order of operations now so you do not get tripped up later.

{{% /notice %}}

## Adding Filters to Tableau

You can apply filters in a few different ways in Tableau.

1. Select data points in an existing visualization.
1. Add a filter through the Actions menu.
1. Drag dimensions and measures to the Filter shelf.

For now, we are going to focus on the final method which is how we can add dimension and measure filters within Tableau. As you become more experienced with Tableau, you may find one of the other methods works better for you. We encourage you to keep exploring the platform beyond what we cover in the class!

### The Filter Shelf

In the previous chapter, you created your first dashboard and spent some time familiarizing yourself with the Data pane. Right next to the Data pane, we have the Filter shelf. To create a filter, you can drag a measure or dimension to the Filter shelf and answer the questions in the dialog box.

#### Dimension Filter

Dimensions contain qualitative data. When you drag a dimension to the filter shelf, a dialog box should appear with four tabs: General, Condition, Wildcard, and Top.
The General tab gives us categories of data that we can check or un-check to include or exclude. An example of how we might use the General tab to filter earrings sales would be to pull over an `Item Category` dimension and opt to exclude "hoops". The Wildcard tab allows us to establish a pattern that the qualitative data has to match for filtering, such as the item name having to include "Fall 2024". The Condition tab allows us to designate a specific condition for one of the dimensions for filtering, such as only allowing items that are a specific size. The Top tab works similarly to the way `SELECT TOP` worked in SQL. We can designate that we only want the top 30 items in a price category dimension.
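
If it helps to think of the four tabs as four kinds of row tests, the following Python sketch mirrors the examples above. It is an analogy under stated assumptions, not Tableau's implementation: the rows, the field names, and the four helper functions are all hypothetical.

```python
# Hypothetical earring rows; all names and values are invented.
rows = [
    {"item": "Fall 2024 Gold Hoop", "category": "hoops", "size": "large", "price": 80},
    {"item": "Fall 2024 Pearl Stud", "category": "studs", "size": "small", "price": 60},
    {"item": "Spring 2024 Onyx Stud", "category": "studs", "size": "small", "price": 40},
    {"item": "Fall 2023 Ruby Drop", "category": "drops", "size": "large", "price": 150},
]


def general(rows, field, exclude):
    """General tab: un-check the listed values so they are excluded."""
    return [r for r in rows if r[field] not in exclude]


def wildcard_contains(rows, field, text):
    """Wildcard tab: keep rows whose text value contains a pattern."""
    return [r for r in rows if text.lower() in r[field].lower()]


def condition(rows, predicate):
    """Condition tab: keep rows that satisfy a rule."""
    return [r for r in rows if predicate(r)]


def top(rows, n, by):
    """Top tab: keep the n rows with the largest value of a field."""
    return sorted(rows, key=lambda r: r[by], reverse=True)[:n]


no_hoops = general(rows, "category", exclude={"hoops"})
fall_2024 = wildcard_contains(rows, "item", "Fall 2024")
small_only = condition(rows, lambda r: r["size"] == "small")
top_two = top(rows, 2, by="price")

print([r["item"] for r in top_two])  # ['Fall 2023 Ruby Drop', 'Fall 2024 Gold Hoop']
```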

{{% notice blue Note %}}

Not all versions of Tableau have a Wildcard tab. Tableau Desktop is the main one that does.

{{% /notice %}}

#### Measure Filter

Measures contain quantitative data. When you drag a measure to the filter shelf, you first select how you want to aggregate your data, such as sums or counts, and then choose which one of the four types of quantitative filters you would like to use: Range of Values, At Least, At Most, and Special.

Range of Values specifies what range the value of your measure should fall in, whereas At Least and At Most specify the bottom and top of the range, respectively. Finally, Special allows users to specify filtering on such values as Null. We could use all of these filters to filter out data points in a `Units Sold` measure.
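
The four quantitative filter types can be pictured as simple numeric tests. The sketch below applies each one to a hypothetical list of `Units Sold` values, where `None` stands in for a Null. The function names are invented for this illustration and the boundaries are treated as inclusive, which is an assumption of the sketch.

```python
# Hypothetical Units Sold values; None represents a Null.
units_sold = [12, None, 45, 78, 5, None, 100]


def range_of_values(values, low, high):
    """Range of Values: keep values between low and high (inclusive here)."""
    return [v for v in values if v is not None and low <= v <= high]


def at_least(values, low):
    """At Least: keep values greater than or equal to low."""
    return [v for v in values if v is not None and v >= low]


def at_most(values, high):
    """At Most: keep values less than or equal to high."""
    return [v for v in values if v is not None and v <= high]


def special_nulls(values):
    """Special: keep only the Null values."""
    return [v for v in values if v is None]


print(range_of_values(units_sold, 10, 80))  # [12, 45, 78]
print(at_least(units_sold, 50))             # [78, 100]
print(at_most(units_sold, 12))              # [12, 5]
print(len(special_nulls(units_sold)))       # 2
```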

## Sorting Data

In addition to filtering data, sorting data can help make visualizations easier to read for your viewers. Sometimes you can just hover over the axis and click on the *Sort* icon to change how the visualization is sorted. You can also sort your dimensions from the toolbar by selecting the field you want to sort and then clicking the appropriate *Sort* icon, whether you want to sort the items in ascending or descending order.
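
As a rough analogy, the *Sort* icons reorder the members of a dimension by a measure. The short Python sketch below does the same with invented category totals.

```python
# Hypothetical total sales per item category; values are invented.
category_sales = {"hoops": 1600, "studs": 800, "drops": 1200}

# Sort the category names by their sales totals.
ascending = sorted(category_sales, key=category_sales.get)
descending = sorted(category_sales, key=category_sales.get, reverse=True)

# Sorting the names themselves gives alphabetical order instead.
alphabetical = sorted(category_sales)

print(ascending)     # ['studs', 'drops', 'hoops']
print(descending)    # ['hoops', 'drops', 'studs']
print(alphabetical)  # ['drops', 'hoops', 'studs']
```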

## Check Your Understanding

{{% notice green Question %}}

Which filter comes first in Tableau’s order of operations?

1. Context filters
1. Data source filters
1. Extract filters
1. Measures
1. Dimensions

{{% /notice %}}

<!-- Extract filters -->

{{% notice green Question %}}

Willow wants to filter some qualitative data in a chart she is making about pets. Which of the following is an example of qualitative data?

1. A list of common pet names
1. A map of Australia
1. A time field spanning over 5 years
1. Average number of animals

{{% /notice %}}

<!-- a list of common pet names -->

{{% notice green Question %}}

Willow wants to only show data about dogs. Her data set contains a column of string values with the type of pet: “dog”, “cat”, “rodent”, “bird”, “reptile”, “amphibian”, “rabbit”, and “other”. Which of the following filter features would allow her to do this?

1. Wildcard
1. Top
1. Conditional
1. General

{{% /notice %}}

<!-- Wildcard -->
104 changes: 104 additions & 0 deletions content/tableau-part-2/reading/groups-and-sets/_index.md
@@ -0,0 +1,104 @@
+++
title = "Groups and Sets"
date = 2021-10-01T09:28:27-05:00
draft = false
weight = 3
+++


## Grouping Data Together

When we create a **group** in Tableau, we are combining multiple members of a field into one higher-level category. For example, if we are trying to visualize different recipes, we can combine the members pasta and chicken into a new group called Entrees.

To create a group, you need to start in the Data Pane.

1. Right-click on the field you want to group and click *Create* > *Group*.
1. In the dialog box that appears, select the members that you want to add to the group and click *Group*.

Unlike hierarchies, Tableau creates a name for your group automatically. To rename it to something that makes more sense to you, you simply have to select it and click *Rename*.

With your group set up, you can begin to work with it. Tableau considers members that are not a part of the group to be "Other". If you want to add new members to your group, you can right-click on the group in the Data Pane and click *Edit Group*. From the dialog box that appears, you can select new members to add to the group. You can also opt to *Include Other* in the dialog box if that is helpful to your analysis. Finally, you can remove members of the group in the same dialog box.
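
One way to picture a group is as a lookup table from individual values to a group label. The Python sketch below is an illustration of that idea only: the dishes, order counts, and helper functions are hypothetical, and the `include_other` flag is meant to mimic the effect of *Include Other*.

```python
# Hypothetical recipe orders; dishes and counts are invented.
recipes = [
    {"dish": "pasta", "orders": 40},
    {"dish": "chicken", "orders": 35},
    {"dish": "salad", "orders": 20},
    {"dish": "cake", "orders": 15},
]

# The group maps selected values to one shared label.
entrees_group = {"pasta": "Entrees", "chicken": "Entrees"}


def apply_group(dish, group, include_other=True):
    """Return the group label for a dish, or 'Other' / the dish itself if ungrouped."""
    if dish in group:
        return group[dish]
    return "Other" if include_other else dish


def total_by_group(rows, group, include_other=True):
    """Sum orders per group label."""
    totals = {}
    for row in rows:
        label = apply_group(row["dish"], group, include_other)
        totals[label] = totals.get(label, 0) + row["orders"]
    return totals


print(total_by_group(recipes, entrees_group))
# {'Entrees': 75, 'Other': 35}
print(total_by_group(recipes, entrees_group, include_other=False))
# {'Entrees': 75, 'salad': 20, 'cake': 15}
```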

## Setting Data Aside

Whereas a group allows us to combine members of a field, a **set** allows us to assemble data from one field that meets a specific condition. For example, if we have one field called `Fiscal Year`, we can create a set for the data from the 2019 fiscal year.

If we are analyzing earrings sales for the past five years, we could use a set for each of the five years. While it is helpful for us to have an overarching look at the whole five years, having a set for 2019, for example, allows us to drill down and provide additional analysis and calculations. All rows that are for 2019 are called "In" and all rows that are for other years are called "Out".
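
A set behaves like a yes/no test applied to every row. The following Python sketch labels hypothetical sales rows as "In" or "Out" of a 2019 set and totals each side. The field names, amounts, and helper functions are assumptions made for this example.

```python
# Hypothetical sales rows; years and amounts are invented.
sales = [
    {"fiscal_year": 2017, "amount": 100},
    {"fiscal_year": 2018, "amount": 150},
    {"fiscal_year": 2019, "amount": 200},
    {"fiscal_year": 2019, "amount": 50},
    {"fiscal_year": 2020, "amount": 120},
]


def in_2019_set(row):
    """Membership test for the set: True means the row is 'In'."""
    return row["fiscal_year"] == 2019


def in_out_totals(rows, membership):
    """Total the amount for rows that are In the set and rows that are Out."""
    totals = {"In": 0, "Out": 0}
    for row in rows:
        label = "In" if membership(row) else "Out"
        totals[label] += row["amount"]
    return totals


print(in_out_totals(sales, in_2019_set))  # {'In': 250, 'Out': 370}
```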

### Dynamic Sets

A **dynamic set** is one where the set changes as the data changes. If we created a set for the current fiscal year of sales, we might want to use a dynamic set so as more sales data comes in, the set automatically updates.

To create a dynamic set, right-click on the dimension you are interested in. Select *Create* > *Set*. The dialog box that appears has three tabs. The *General* tab is where you can choose what to include. The *Condition* tab is where you can specify the condition that the members must meet to be included in the set. Finally, the *Top* tab allows you to place limits on the members to be included in the set. When you have everything configured how you wish, click *OK* and your new set can be found in the Data pane under "Sets".

You can add or remove data points later when you visualize the set by right-clicking on the data points and clicking on the Set icon. This action opens up the set dropdown menu where you can choose to either add the point(s) or remove them from the set.

{{% notice blue Note %}}

The above method also works for a fixed set, which we will talk more about now!

{{% /notice %}}

### Fixed Sets

A **fixed set** is a set of data where the members stay the same even if the data changes. You may want a fixed set for datasets that you know are stable, such as the 2019 fiscal year set, or if you are concerned that any change in the set could result in an invalid analysis.

You can create a fixed set by selecting a group of data points on your visualization and right-clicking on them. Click *Create Set* and give your new set a name.
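
The difference between the two kinds of sets can be sketched in Python as a rule versus a frozen list. In this hypothetical example, the dynamic set is a function that is re-run whenever the data changes, while the fixed set is a collection of members captured once. The item names, unit counts, and the 100-unit threshold are all invented.

```python
def dynamic_set(rows):
    """A rule re-evaluated against the current data: items with at least 100 units."""
    return {r["item"] for r in rows if r["units"] >= 100}


# Hypothetical data at two points in time.
january = [
    {"item": "Gold Hoop", "units": 120},
    {"item": "Pearl Stud", "units": 80},
]
february = [
    {"item": "Gold Hoop", "units": 90},
    {"item": "Pearl Stud", "units": 80},
    {"item": "Onyx Stud", "units": 150},
]

# A fixed set stores its members once and never re-checks the rule.
fixed_set = frozenset(dynamic_set(january))

print(dynamic_set(january))   # {'Gold Hoop'}
print(dynamic_set(february))  # {'Onyx Stud'}
print(set(fixed_set))         # {'Gold Hoop'}
```

After the February data arrives, the dynamic set follows the rule and now contains a different item, while the fixed set still holds the member chosen in January.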

### In/Out

When visualizing sets, Tableau defaults to visualizing the set in **In/Out** mode. One of the benefits of creating a set is easily comparing what is going on within the set and the rest of the dataset. If you want to see just the members of the set, you can right-click on the set name and choose *Show Members in Set*. If you want to go back to In/Out mode, you can right-click and choose *Show In/Out*.

## Check Your Understanding

{{% notice green Question %}}

Groups can be used for all of the following except:

1. Combine related members in a field
1. Correct errors
1. Answer "What if" questions
1. Organize data by what is "In" and what is "Out"

{{% /notice %}}

<!-- Organize data by what is In and what is Out -->

{{% notice green Question %}}

What other term is used to describe non-grouped members?

1. Out
1. Not part of the group
1. Other
1. Set

{{% /notice %}}

<!-- Other -->

{{% notice green Question %}}

Match the two types of sets:

| | |
|--|--|
| Dynamic | Sets that change when the data changes |
| Fixed | Sets that do not change, even if the data changes |

{{% /notice %}}

<!-- Dynamic sets are sets that update as data changes and fixed sets are sets that do not change even when the data changes -->

{{% notice green Question %}}

Match the members of a set:

| | |
|--|--|
| In | Members not in the set |
| Out | Members within the set |

{{% /notice %}}

<!-- In refers to members within the set and out refers to members not in the set -->