-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathChapter7.qmd
524 lines (435 loc) · 14.8 KB
/
Chapter7.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
---
title: "Chapter 7"
subtitle: "Networks"
author: "Aditya Dahiya"
date: "2023-12-09"
format:
html:
code-fold: true
code-copy: hover
code-link: true
execute:
echo: true
warning: false
error: false
cache: true
filters:
- social-share
share:
permalink: "https://aditya-dahiya.github.io/ggplot2book3e/Chapter7.html"
description: "Solutions Manual (and Beyond) for ggplot2: Elegant Graphics for Data Analysis (3e)"
twitter: true
facebook: true
linkedin: true
email: true
mastodon: true
editor_options:
chunk_output_type: console
bibliography: references.bib
---
::: {.callout-note appearance="minimal"}
## Note
There are no exercises in this Chapter to generate solutions for. Instead, we create Network Graphs for Star Wars characters using [Star Wars Social Networks data](https://github.com/evelinag/star-wars-network-data).
:::
```{r}
#| label: setup
# Loading the required libraries
library(tidyverse)
library(ggraph)
library(tidygraph)
library(jsonlite)
```
# 7.1 Network Data
### 7.1.1 Tidygraph: **A tidy network manipulation API**
I demonstrate an example to import a *.json* file using the [Star Wars Social Networks data](https://github.com/evelinag/star-wars-network-data) from [here](https://github.com/evelinag/star-wars-network-data/) [@gabasova2016]. Then, I manipulate the data using `tidygraph` and `tidyverse` functions.
- Creating a `tbl_graph` object **(`tidygraph`)** [@tidygraph] from a *.json* file using `jsonlite` package. [@jsonlite].
```{r}
#| label: tbl_graph
#| echo: true
# The url for the data
url <- "https://raw.githubusercontent.com/evelinag/star-wars-network-data/master/starwars-full-interactions.json"
# Importing json data
starwars <- jsonlite::read_json(url, simplifyVector = TRUE)
# Getting the nodes (i.e., Star-Wars characters)
sw_nodes <- starwars[[1]] |>
as_tibble() |>
# An id to match nodes and links
mutate(id = row_number()) |>
relocate(id) |>
# Conventional naming
rename(
firstname = name,
name = id
)
# Getting the links (i.e. connections between characters)
sw_links <- starwars[[2]] |>
as_tibble() |>
# Network Graphs Tidygraph does not recognize 0 as a node, so adding 1.
mutate(
source = source + 1,
target = target + 1
) |>
# Conventional Naming
rename(
from = source,
to = target,
weightage = value
)
# Creating a tbl_graph object
sw_graph <- tbl_graph(
nodes = sw_nodes,
edges = sw_links,
node_key = "name",
directed = FALSE
)
sw_graph
```
### **7.1.3 Algorithms**
The real benefit of tbl_graph data is that we can do many operations on them, while preserving the underlying structure, for example: centrality calculation, ranking, grouping, etc.
Finding groups within Star Wars Characters using `group_leading_eigen()` which groups nodes based on the leading eigenvector of the modularity matrix using `igraph::cluster_leading_eigen()` . The variable `col_var` shows the group, which can be used to create groups.
```{r}
#| code-fold: false
sw_graph <- sw_graph |>
# Creating groups with group_leading_eigen()
activate(nodes) |>
mutate(col_var = tidygraph::group_leading_eigen())
```
The `group_leading_eigen()` creates groups, and as we can see below in @fig-eigen , the groups are of related characters who appear often together in the movie.
```{r}
#| label: fig-eigen
#| fig-cap: "Groups of Star Wars characters as formed by group_leading_eigen()"
sw_graph |>
activate(nodes) |>
as_tibble() |>
select(col_var, firstname, value) |>
mutate(
Group = paste0("Gp. ", col_var),
firstname = snakecase::to_title_case(firstname)
) |>
group_by(Group) |>
arrange(desc(value)) |>
summarise(Characters = paste(firstname, collapse = ", ")) |>
gt::gt() |>
gtExtras::gt_theme_espn()
```
# 7.2 Visualizing Networks
We can use the `tbl_graph` object with `ggraph` to create visualizations easily.
- In @fig-v1, we are visualizing Networks amongst most important characters (imporantance determined by \>40 appearances in scenes). As we can see, out `tidygraph` function `group_leading_eigen()` has successfully group the characters into groups, based on their interactions.
```{r}
#| label: fig-v1
#| fig-cap-location: top
#| fig-cap: "Linkages (defined by simultaneous appearance in a scene in the movie) amongst most important characters of Star Wars movies"
sw_graph |>
activate(nodes) |>
filter(value > 40) |>
ggraph(layout = "stress") +
geom_edge_link(
mapping = aes(width = weightage),
alpha = 0.5,
color = "grey"
) +
geom_node_point(
mapping = aes(size = value,
colour = as.factor(col_var))
) +
geom_node_text(
mapping = aes(label = firstname),
check_overlap = TRUE,
repel = TRUE
) +
scale_size_area(max_size = 10) +
scale_colour_brewer(palette = "Dark2") +
theme_void() +
theme(legend.position = "none")
```
- Below, we trying out different layouts in @fig-layouts using the argument `layout = ""` to the function `ggraph()`. The `ggraph` provides well over 20 different layouts[^1] to choose from, including (Credits: [`R` Documentation](https://www.rdocumentation.org/packages/ggraph/versions/2.1.0/topics/layout_tbl_graph_igraph) for `layout_tbl_graph_igraph`): ---
[^1]: The layouts have been picked up (copied) from [RDocumentation](https://www.rdocumentation.org/packages/ggraph/versions/2.1.0/topics/layout_tbl_graph_igraph) webpage: `ggraph` (version 2.1.0) `layout_tbl_graph_igraph`: using `igraph` layout algorithms for `layout_tbl_graph`
<!-- -->
- **Hierarchical layouts**
- **`tree`**: Uses the Reingold-Tilford algorithm to place the nodes below their parent with the parent centered above its children.
- **`sugiyama`**: Designed for directed acyclic graphs (that is, hierarchies where multiple parents are allowed) it minimizes the number of crossing edges.
- **Standard layouts**
- **`bipartite`**: Minimize edge-crossings in a simple two-row (or column) layout for bipartite graphs.
- **`star`**: Place one node in the center and the rest equidistantly around it.
- **`circle`**: Place nodes in a circle in the order of their index. Consider using layout_tbl_graph_linear() with circular=TRUE for more control.
- **`nicely`**: Tries to pick an appropriate layout. See igraph::nicely() for a description of the simple decision tree it uses
- **`dh`**: Uses Davidson and Harels simulated annealing algorithm to place nodes.
- **`gem`**: Place nodes on the plane using the GEM force-directed layout algorithm.
- **`graphopt`**: Uses the Graphopt algorithm based on alternating attraction and repulsion to place nodes.
- **`grid`**: Place nodes on a rectangular grid.
- **`mds`**: Perform a multidimensional scaling of nodes using either the shortest path or a user supplied distance.
- **`sphere`**: Place nodes uniformly on a sphere - less relevant for 2D visualizations of networks.
- **`randomly`**: Places nodes uniformly random.
- **`fr`**: Places nodes according to the force-directed algorithm of Fruchterman and Reingold.
- **`kk`**: Uses the spring-based algorithm by Kamada and Kawai to place nodes.
- **`drl`**: Uses the force directed algorithm from the DrL toolbox to place nodes.
- **`lgl`**: Uses the algorithm from Large Graph Layout to place nodes.
```{r}
#| label: fig-layouts
#| layout-ncol: 2
#| fig-asp: 1
#| fig-cap: "Different Layouts for the ggraph()"
#| fig-subcap:
#| - "drl"
#| - "circle"
#| - "nicely"
#| - "dh"
#| - "gem"
#| - "graphopt"
#| - "grid"
#| - "kk"
sw_graph_fun <- function(my_layout){
sw_graph |>
activate(nodes) |>
filter(value > 40) |>
ggraph(layout = my_layout) +
geom_edge_link(
mapping = aes(width = weightage),
alpha = 0.5,
color = "grey"
) +
geom_node_point(mapping = aes(size = value,
col = as.factor(col_var))) +
geom_node_text(
mapping = aes(label = firstname),
check_overlap = TRUE,
repel = TRUE,
fontface = "bold"
) +
labs(title = paste0("layout = \"", my_layout, "\"")) +
scale_size_area(max_size = 10) +
theme_void() +
theme(
legend.position = "none",
plot.title = element_text(
face = "bold",
family = "mono",
size = 20,
hjust = 0
)
)
}
sw_graph_fun("drl")
sw_graph_fun("circle")
sw_graph_fun("nicely")
sw_graph_fun("dh")
sw_graph_fun("gem")
sw_graph_fun("graphopt")
sw_graph_fun("grid")
sw_graph_fun("kk")
```
- We can further customizing Edge-widths, curvature and colours to represent strength of links between the nodes using `ggraph`, as depicted in @fig-edges.
```{r}
#| label: fig-edges
#| fig-cap: "Customizing the links between the nodes with geom_edge_link2()"
library(showtext)
library(fontawesome)
font_add_google("Poller One",
family = "font_sw")
font_add_google("PT Sans Narrow", "font_title")
font_add_google("Nova Square", "font_body")
text_col <- "black"
# Caption stuff
github <- ""
github_username <- "aditya-dahiya"
xtwitter <- ""
xtwitter_username <- "adityadahiyaias"
linkedin <- ""
linkedin_username <- "dr-aditya-dahiya-ias"
social_caption <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{github};</span> <span style='color: {text_col}'>{github_username} </span> <span style='font-family:\"Font Awesome 6 Brands\";'>{xtwitter};</span> <span style='color: {text_col}'>{xtwitter_username}</span> <span style='font-family:\"Font Awesome 6 Brands\";'>{linkedin};</span> <span style='color: {text_col}'>{linkedin_username}</span>")
showtext_auto()
set.seed(4)
sw_graph |>
activate(nodes) |>
mutate(col_var = as.character(col_var)) |>
filter(value > 40) |>
# Start plotting network graph
ggraph(layout = "nicely") +
# Edges
geom_edge_bend2(
aes(
colour = node.col_var,
width = weightage
),
lineend = "round"
) +
# Nodes
geom_node_point(
aes(
size = value,
fill = col_var
),
pch = 21,
colour = "white",
stroke = 2
) +
# Labels
geom_node_text(
aes(label = firstname),
check_overlap = TRUE,
repel = TRUE,
fontface = "bold"
) +
labs(title = "Network showing appearances of main characters in Star Wars films",
subtitle = "Size of dot indicates screen-time; width of link indicates common appearances in scenes",
caption = social_caption) +
scale_size_area(max_size = 15) +
theme_void() +
theme(
legend.position = "none",
plot.title = element_text(
family = "font_title",
face = "bold",
size = 21,
hjust = 0
),
plot.subtitle = element_text(
family = "font_body",
hjust = 0
),
plot.caption = element_text(
hjust = 1,
family = "font_title"
)
)
```
- **Faceting:** Lastly, we can also facet on the nodes or on the edges, and produce multiple network graphs, as shown in @fig-facet for the entire Star Wars cast (not just the most important characters): ---
```{r}
#| label: fig-facet
#| fig-width: 10
#| fig-asp: 1.5
#| fig-cap: "Different Network of Characters in Star Wars: A Faceted ggraph"
labels_facet = c(
"Jedi & Sith Lords",
"Prequel Trilogy",
"Original Trilogy",
"Sequel Trilogy",
"Villains",
"An outlier!"
)
labels_facet <- str_to_upper(labels_facet)
names(labels_facet) = 1:6
sw_graph |>
activate(nodes) |>
mutate(firstname = snakecase::to_title_case(firstname)) |>
ggraph() +
geom_edge_link(
mapping = aes(width = weightage),
alpha = 0.5,
color = "grey"
) +
geom_node_point(
aes(size = value,
col = as.factor(col_var))) +
geom_node_text(
aes(label = firstname,
size = value),
repel = TRUE
) +
scale_size_area(max_size = 10) +
scale_size_continuous(range = c(6, 15)) +
scale_color_brewer(palette = "Set2") +
facet_nodes(~ col_var,
scales = "free",
labeller = as_labeller(labels_facet),
ncol = 2) +
theme_void() +
theme(
legend.position = "none",
strip.text = element_text(
family = "font_sw",
hjust = 0.5,
size = 36
),
panel.border = element_rect(colour = "darkgrey",
fill = NA),
strip.background = element_rect(colour = "darkgrey",
fill = NA)
)
```
```{r}
#| label: Code-for-social-media-post
#| eval: false
#| echo: false
library(showtext)
library(fontawesome)
sysfonts::font_add(family = "Font Awesome 6 Brands",
regular = "C:/Users/dradi/Documents/Post-Harvard Academics/projects_presentations/docs/Font Awesome 6 Brands-Regular-400.otf")
font_add_google("Poller One",
family = "font_sw")
font_add_google("PT Sans Narrow", "font_title")
font_add_google("Nova Square", "font_body")
text_col <- "black"
# Caption stuff
github <- ""
github_username <- "aditya-dahiya"
xtwitter <- ""
xtwitter_username <- "adityadahiyaias"
linkedin <- ""
linkedin_username <- "dr-aditya-dahiya-ias"
social_caption <- glue::glue("<span style='font-family:\"Font Awesome 6 Brands\";'>{github};</span> <span style='color: {text_col}'>{github_username} </span> <span style='font-family:\"Font Awesome 6 Brands\";'>{xtwitter};</span> <span style='color: {text_col}'>{xtwitter_username}</span> <span style='font-family:\"Font Awesome 6 Brands\";'>{linkedin};</span> <span style='color: {text_col}'>{linkedin_username}</span>")
library(magick)
img_sw <- image_read("https://www.freepnglogos.com/uploads/star-wars-logo-31.png")
showtext_auto()
set.seed(4)
g <- sw_graph |>
activate(nodes) |>
mutate(col_var = as.character(col_var)) |>
filter(value > 40) |>
# Start plotting network graph
ggraph(layout = "nicely") +
# Edges
geom_edge_bend2(
aes(
colour = node.col_var,
width = weightage
),
lineend = "round"
) +
# Nodes
geom_node_point(
aes(
size = value,
fill = col_var
),
pch = 21,
colour = "white",
stroke = 2
) +
# Labels
geom_node_text(
aes(label = firstname),
check_overlap = TRUE,
repel = TRUE,
fontface = "bold"
) +
theme_classic() +
labs(title = "Network showing appearances of main characters in Star Wars films",
subtitle = "Size of dot indicates screen-time; width of link indicates common appearances in scenes",
caption = social_caption) +
scale_size_area(max_size = 15) +
theme_void() +
theme(
legend.position = "none",
plot.title = element_text(
family = "font_title",
face = "bold",
size = 21,
hjust = 0
),
plot.subtitle = element_text(
family = "font_body",
hjust = 0
),
plot.caption = ggtext::element_textbox(
hjust = 0.1,
family = "font_title"
)
)
g +
annotation_raster(img_sw,
xmin = -1.2,
xmax = 0,
ymin = -3,
ymax = -1.5)
```