Create unpivot gem docs #449

Merged
merged 3 commits into from
Nov 26, 2024
40 changes: 40 additions & 0 deletions docs/Spark/gems/transform/unpivot.md
@@ -0,0 +1,40 @@
---
title: Unpivot
id: unpivot
description: Use the Unpivot Gem to transform your data from a wide format to a long format.
tags:
- gems
- unpivot
- wideformat
- longformat
---

Use the Unpivot Gem to transform your data from a wide format to a long format.

## Parameters

| Parameter | Description |
| ------------------------------- | ---------------------------------------------------------------------------------------------------------------------------- |
| Column(s) to use as identifiers | The column(s) that identify the group or entity that each observation corresponds to.                                        |
| Columns to unpivot | The columns (wide format) that you would like to transform into a single column (long format). |
| Variable column name | The name of the column that contains the names of the unpivoted columns. This helps describe the values in the value column. |
| Value column name | The name of the column that will contain the values from the unpivoted columns. |

## Example

Transforming your data into a long format can be beneficial when creating visualizations, comparing variables, handling dynamic data, and more.

Let's think about a time series example. If you have product sales data in a wide format, you may want to transform it into a long format before modeling the time series and analyzing the seasonal patterns in sales.

The image below shows sample input and output tables for this scenario.

![Wide and long formats of time series data](./img/unpivot-time-series.png)

This table describes how this transformation was achieved:

| Parameter | Input |
| ------------------------------- | ------------------------------------------------------------------------------------------------- |
| Column(s) to use as identifiers | The _Product_ column is the identifier because it defines which product the sales correspond to. |
| Columns to unpivot | All of the quarterly sales columns will be unpivoted. |
| Variable column name | The variable column is named _Quarter_ because it identifies the sales period. |
| Value column name               | The value column is named _UnitsSold_ because it contains the number of units sold.               |
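
To make the wide-to-long reshaping concrete, here is a minimal plain-Python sketch of the idea. It is not the code that the Gem generates. The `unpivot` helper, its argument names, and the sample sales figures are all hypothetical and chosen only to mirror the four parameters described above.

```python
def unpivot(rows, id_columns, unpivot_columns, variable_name, value_name):
    """Return one long-format row per (input row, unpivoted column) pair.

    Hypothetical helper for illustration only; not part of any library.
    """
    long_rows = []
    for row in rows:
        for column in unpivot_columns:
            # Carry the identifier columns over unchanged.
            long_row = {key: row[key] for key in id_columns}
            # The variable column stores the name of the original wide column.
            long_row[variable_name] = column
            # The value column stores the value that the wide column held.
            long_row[value_name] = row[column]
            long_rows.append(long_row)
    return long_rows


# Hypothetical wide-format data: one row per product, one column per quarter.
wide = [
    {"Product": "A", "Q1": 100, "Q2": 150},
    {"Product": "B", "Q1": 200, "Q2": 250},
]

long_format = unpivot(
    wide,
    id_columns=["Product"],
    unpivot_columns=["Q1", "Q2"],
    variable_name="Quarter",
    value_name="UnitsSold",
)

for row in long_format:
    print(row)
```

Each input row yields one output row per unpivoted column, so two products with two quarterly columns produce four long-format rows.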