---
title: "Chapter 19"
subtitle: "Internals of ggplot2"
author: "Aditya Dahiya"
date: "2024-09-26"
format:
  html:
    code-fold: true
    code-copy: hover
    code-link: true
    mermaid:
      theme: neutral
execute:
  echo: true
  warning: false
  error: false
  cache: true
filters:
  - social-share
share:
  permalink: "https://aditya-dahiya.github.io/ggplot2book3e/Chapter19.html"
  description: "Solutions Manual (and Beyond) for ggplot2: Elegant Graphics for Data Analysis (3e)"
  twitter: true
  facebook: true
  linkedin: true
  email: true
  mastodon: true
editor_options:
  chunk_output_type: console
bibliography: references.bib
---
::: {.callout-tip appearance="minimal"}
This chapter has no exercises, so I summarize some of its concepts using `mermaid` flowcharts in Quarto.
:::
```{r}
#| label: setup
library(tidyverse)
library(scales)
```
## 19.1 The ggplot2 Plot Rendering Process
@fig-1 illustrates the five key steps involved in rendering a ggplot2 object into an image. The diagram is generated using Quarto's [native support](https://quarto.org/docs/authoring/diagrams.html#overview) for [`mermaid`](https://mermaid.js.org/) diagrams. (Credit: the diagram code was written with help from [ChatGPT](https://chat.openai.com/).)
::: {#fig-1}
```{mermaid}
flowchart TD
A["1. Create ggplot object"] --> B["2. ggplot_build(): Prepare data for each layer"]
B --> C["3. ggplot_gtable(): Convert data to graphical elements (gtable)"]
C --> D["4. grid::grid.newpage(): Create new image page"]
D --> E["5. grid::grid.draw(): Draw gtable on the image"]
```
A flowchart of the five steps by which a `ggplot2` object is drawn as a graphic.
:::
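The five steps in the flowchart can be reproduced by hand. The following is a minimal sketch, assuming the `mpg` dataset that ships with ggplot2 and a scatter plot chosen only for illustration; printing a plot normally performs these steps for you.

```{r}
library(ggplot2)
library(grid)

# 1. Create the ggplot object (nothing is drawn yet)
p <- ggplot(mpg, aes(displ, hwy)) + geom_point()

# 2. Build: compute the data for every layer
built <- ggplot_build(p)

# 3. Convert the built plot into a gtable of graphical objects
gt <- ggplot_gtable(built)

# 4. Start a new page on the current graphics device
grid.newpage()

# 5. Draw the gtable on that page
grid.draw(gt)
```

Keeping the intermediate objects `built` and `gt` around is useful because each can be inspected before anything is drawn.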
## 19.2 Steps in the ggplot2 Build Process
- **Data Preparation**:
    - **ggplot_build()** starts by preparing the data for each layer of the plot.
    - Each layer can provide its own data, inherit the global data, or use a function that generates data from the global data.
    - The data is passed to the plot's layout, which manages the coordinate system and the facets (the panels of the plot).
    - A **PANEL** column is added to the data, linking each row to a specific panel.
- **Data Transformation**:
    - Transformations specified in the scales (e.g., log scaling) are applied to the data first.
    - Position aesthetics are mapped next: continuous scales deal with out-of-bounds values, discrete scales map values to integer positions, and binned scales assign values to bins.
    - Statistical transformations (e.g., smoothing or regression) are then computed by each layer's stat.
    - After this, the geom reparameterises the data where needed (e.g., converting a width into `xmin` and `xmax`), and the layer's position adjustment (e.g., jittering or dodging) is applied.
    - Finally, all non-positional aesthetics (e.g., colors, line types) are mapped, and the data is ready for rendering.
- **Final Output**:
    - The result of **ggplot_build()** is a structured list containing the final prepared data, the layout details, the trained scales, and a copy of the original plot object, ready for rendering.
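The effect of the preparation and scale-transformation steps can be seen by inspecting the built layer data. This is a sketch under assumptions: the plot, the log10 x scale, and the faceting variable are chosen purely for illustration, and `layer_data()` is used as a shortcut that runs the build step and returns one layer's data frame.

```{r}
library(ggplot2)

# Illustrative plot: a log10 x scale and one facet per drive type
p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  scale_x_log10() +
  facet_wrap(~drv)

# Run the build step and extract the data for the first layer
ld <- layer_data(p, i = 1)

# The PANEL column links each row to a facet panel
head(ld)
unique(ld$PANEL)

# The x values are stored on the transformed (log10) scale
range(ld$x)
range(log10(mpg$displ))
```

The two ranges agree because the scale transformation is applied to the data during the build, before any statistics are computed.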
::: grid
::: g-col-6
::: {#fig-2}
```{mermaid}
flowchart TD
A{{Data Preparation}} --> B{{Gather Data}}
B --> C{{Add PANEL Column}}
C --> D{{Data Transformation}}
D --> E{{Apply Scale Transformations}}
E --> F{{Map Position Aesthetics}}
F --> G{{Perform Statistical Transformations}}
G --> H{{Adjust Geometry Positions}}
H --> I{{Map Non-Positional Aesthetics}}
I --> J{{Final Output}}
```
Flowchart illustrating the step-by-step process of data preparation and transformation in ggplot2, leading to the final output of a plot.
:::
:::
::: g-col-6
::: {#fig-3}
```{mermaid}
flowchart TD
A{{Start}} --> B{{Layer Data Frame}}
B --> C{{Add PANEL Column}}
C --> D{{Coordinate Transformed Data}}
D --> E{{Faceted Data}}
E --> F{{Calculated Aesthetics}}
F --> G{{Position Mapped Data}}
G --> H{{Statistical Transformed Data}}
H --> I{{Position Adjusted Data}}
I --> J{{"Final Aesthetic Mapped Data (a list object)"}}
```
This flowchart lists each intermediate step along with the data frames that are created or transformed.
:::
:::
:::
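The statistical-transformation and geom steps in these flowcharts are easiest to see with a layer whose stat changes the shape of the data. The sketch below assumes a bar chart of `mpg$drv`, chosen only for illustration: `geom_bar()` uses `stat_count()`, so the built data has one row per category rather than one per observation.

```{r}
library(ggplot2)

# geom_bar() uses stat_count(): one output row per distinct x value
p <- ggplot(mpg, aes(drv)) + geom_bar()
ld <- layer_data(p, i = 1)

# `count` is computed by the stat; the bar extents are added by the geom
ld[, c("x", "count", "xmin", "xmax", "ymin", "ymax")]

# Compare with a direct tabulation of the raw data
table(mpg$drv)
```

The `count` column matches the tabulation, while `xmin`, `xmax`, `ymin`, and `ymax` describe the rectangles that will eventually be drawn.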
## 19.3 The `gtable` step
This section explains how `ggplot_gtable()` converts the output of the build step into a graphical table (`gtable`) for rendering. @fig-4 illustrates the process: the plot data is turned into graphical objects (grobs), panels and legends are assembled, and final elements such as titles and margins are added to produce a complete `gtable` that is ready to be drawn.
::: {#fig-4}
```{mermaid}
flowchart TD
A("ggplot_gtable()") --> B("Convert Data to Grobs")
B --> C("Split Data by PANEL and Group")
C --> D("Coordinate Transformation (Normalize Data)")
D --> E("Convert Layers to gList of Grobs")
E --> F("Facet Collects Grobs per Panel")
F --> G("Assemble Panels into gtable")
G --> H("Render Axes and Panels")
H --> I("Train and Merge Legends")
I --> J("Create Key Grobs for Legends")
J --> K("Assemble Legend gtable")
K --> L("Add Title, Subtitle, Caption, Tag")
L --> M("Add Background and Margins")
M --> N("Final gtable Object")
```
Flowchart illustrating the `gtable` step in `ggplot2`, where data is transformed into graphical objects, panels and legends are assembled, and final plot adornments are added, resulting in a complete `gtable` ready for grid-based drawing.
:::
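The result of this step can be inspected directly. The following sketch assumes an illustrative plot with a colour legend and a title, so that the panel, legend, and title grobs all appear in the table layout.

```{r}
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy, colour = drv)) +
  geom_point() +
  labs(title = "Engine size and fuel economy")

gt <- ggplot_gtable(ggplot_build(p))

# A gtable is a grid of cells; each row of `layout` places one grob
class(gt)
dim(gt)
gt$layout[, c("name", "t", "l", "b", "r")]
```

Each row of the layout names a grob (panel, axes, legend box, title, and so on) and gives the top, left, bottom, and right cells it occupies.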
## 19.4 Introducing `ggproto`
### What is a ggproto Object?
- `ggproto` is the object-oriented programming (OOP) system used internally by ggplot2. It creates R objects that bundle data (fields) with the functions that operate on that data (methods).
- It supports the familiar OOP ideas of inheritance, encapsulation, and polymorphism, which makes it easier to build complex and reusable components.
### Structure of ggproto Objects
- A `ggproto` object is created using the `ggproto()` function. It typically includes:
    - Fields: These can store data or parameters relevant to the object.
    - Methods: These define the functions that can be performed on the object or by the object.
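A minimal sketch of such an object follows. `Counter`, its `count` field, and its `increment()` method are hypothetical names invented for illustration; only `ggproto()` itself comes from ggplot2.

```{r}
library(ggplot2)

# A hypothetical object, invented for illustration only
Counter <- ggproto("Counter", NULL,
  # Field: stores state
  count = 0,
  # Method: `self` gives access to the object's fields and other methods
  increment = function(self, by = 1) {
    self$count <- self$count + by
    invisible(self)
  }
)

Counter$increment()
Counter$increment(by = 2)
Counter$count
```

Because `ggproto` objects have reference semantics, the two calls modify `Counter` in place, and `count` reflects both increments.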
### Key Characteristics
1. **Inheritance:**
    - `ggproto` allows one object to inherit fields and methods from another. For example, a specific geom (such as `GeomPoint`, the object behind `geom_point()`) inherits from the general `Geom` class, sharing common functionality while allowing for customization.
2. **Encapsulation:**
    - Each `ggproto` object holds its own fields and methods, making it self-contained. This helps manage complexity by keeping related functions and data together.
3. **Polymorphism:**
    - Different `ggproto` objects can have methods with the same name but different implementations. The same call on different `ggproto` objects therefore produces behavior specific to each object.
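These three characteristics can be sketched with a small hierarchy. `Shape`, `Circle`, and `Square` are hypothetical objects made up for this example; `ggproto()` and `ggproto_parent()` are the ggplot2 functions being demonstrated.

```{r}
library(ggplot2)

# Parent object with one field and one method
Shape <- ggproto("Shape", NULL,
  name = "shape",
  describe = function(self) paste("I am a", self$name)
)

# Inherits describe() unchanged and overrides only the `name` field
Circle <- ggproto("Circle", Shape, name = "circle")

# Overrides describe() but reuses the parent's version via ggproto_parent()
Square <- ggproto("Square", Shape,
  name = "square",
  describe = function(self) {
    base <- ggproto_parent(Shape, self)$describe()
    paste(base, "with four sides")
  }
)

# Same method name, different behavior per object
Circle$describe()
Square$describe()

# The same mechanism links ggplot2's own objects
inherits(GeomPoint, "Geom")
```

The call `describe()` is identical for both children, yet each responds according to its own fields and overrides.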
### Use in ggplot2
1. **Geoms, Stats, and Coordinates:**
    - Each component of a plot in ggplot2 (geoms, stats, coordinate systems, and others) is represented as a `ggproto` object. For instance:
        - `GeomPoint`, the object used by `geom_point()`, defines how to draw points on a plot.
        - `StatSmooth`, the object used by `stat_smooth()`, defines a statistical transformation.
2. **Customization:**
    - Users can create their own geoms or stats by defining a new `ggproto` object that extends an existing one, customizing the behavior of plots without altering ggplot2 itself.
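3. **Lightweight design:**
    - `ggproto` is a small, purpose-built system: looking up a method is a simple walk from the object up through its parents. It was written for ggplot2 after the `proto` package used in early versions proved too limiting for an official extension mechanism.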
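As an illustration of customization, the sketch below defines a convex-hull stat. It closely follows the well-known example from ggplot2's extension vignette; the names `StatChull` and `stat_chull()` are not part of ggplot2 itself, and the plot is chosen only for demonstration.

```{r}
library(ggplot2)

# A new stat that extends the base `Stat` object
StatChull <- ggproto("StatChull", Stat,
  required_aes = c("x", "y"),
  # Keep only the rows that lie on the convex hull of each group
  compute_group = function(data, scales) {
    data[chull(data$x, data$y), , drop = FALSE]
  }
)

# Constructor function that wraps the stat in a layer
stat_chull <- function(mapping = NULL, data = NULL, geom = "polygon",
                       position = "identity", na.rm = FALSE,
                       show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    stat = StatChull, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  stat_chull(fill = NA, colour = "black")
p
```

Only `compute_group()` had to be written; everything else, such as splitting the data by panel and group, is inherited from `Stat`.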
Thus, understanding `ggproto` is essential for anyone looking to customize or extend the ggplot2 system effectively.
> *Imagine `ggproto` as the quirky, mad scientist of the ggplot2 lab, mixing up wild potions of plots and graphs. Each `ggproto` object is like a unique recipe card that says, "Add a dash of data here, sprinkle some fancy methods there," allowing you to whip up custom visualizations that are both deliciously informative and visually appealing.*