---
title: "Statistical assignment 4"
author: "[add your name here] [add your candidate number here - mandatory]"
date: "[add date here]"
output: github_document
---
```{r setup, include=FALSE}
# Please note these options.
# This tells R Markdown that we want to show code in the output document.
knitr::opts_chunk$set(echo = TRUE)
# Switching off messages in the output document.
knitr::opts_chunk$set(message = FALSE)
knitr::opts_chunk$set(warning = FALSE)
# Switching on caching to make things faster (don't commit cache files to GitHub).
knitr::opts_chunk$set(cache = TRUE)
```
In this assignment you will need to reproduce 5 ggplot graphs. I supply graphs as images; you need to write the ggplot2 code to reproduce them and knit and submit a Markdown document with the reproduced graphs (as well as your .Rmd file).
First we will need to open and recode the data. I supply the code for this; you only need to change the file paths.
```{r}
library(tidyverse)
Data8 <- read_tsv("...h_indresp.tab")
Data8 <- Data8 %>%
  select(pidp, h_age_dv, h_payn_dv, h_gor_dv)
Stable <- read_tsv("...xwavedat.tab")
Stable <- Stable %>%
  select(pidp, sex_dv, ukborn, plbornc)
Data <- Data8 %>% left_join(Stable, "pidp")
rm(Data8, Stable)
Data <- Data %>%
  mutate(sex_dv = ifelse(sex_dv == 1, "male",
                         ifelse(sex_dv == 2, "female", NA))) %>%
  mutate(h_payn_dv = ifelse(h_payn_dv < 0, NA, h_payn_dv)) %>%
  mutate(h_gor_dv = recode(h_gor_dv,
                           `-9` = NA_character_,
                           `1` = "North East",
                           `2` = "North West",
                           `3` = "Yorkshire",
                           `4` = "East Midlands",
                           `5` = "West Midlands",
                           `6` = "East of England",
                           `7` = "London",
                           `8` = "South East",
                           `9` = "South West",
                           `10` = "Wales",
                           `11` = "Scotland",
                           `12` = "Northern Ireland")) %>%
  mutate(placeBorn = case_when(
    ukborn == -9 ~ NA_character_,
    ukborn < 5 ~ "UK",
    plbornc == 5 ~ "Ireland",
    plbornc == 18 ~ "India",
    plbornc == 19 ~ "Pakistan",
    plbornc == 20 ~ "Bangladesh",
    plbornc == 10 ~ "Poland",
    plbornc == 27 ~ "Jamaica",
    plbornc == 24 ~ "Nigeria",
    TRUE ~ "other")
  )
```
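The `placeBorn` recoding above relies on `case_when()` testing its conditions in order and using the first one that matches. The following is a minimal sketch of that behaviour on a toy data frame; the values in `toy` are made up for illustration and are not real survey records. It also shows that, as written, a negative `ukborn` code other than -9 would fall into "UK" (whether such codes occur in the data is not checked here).

```r
library(dplyr)

# Toy data with made-up values (hypothetical, not real survey records).
toy <- tibble(
  ukborn  = c(-9, 1, 5, 5, 5, -1),
  plbornc = c(-8, -8, 18, 10, 99, -8)
)

toy <- toy %>%
  mutate(placeBorn = case_when(
    ukborn == -9  ~ NA_character_,  # tested first: explicit missing code
    ukborn < 5    ~ "UK",           # tested second: also matches -1
    plbornc == 18 ~ "India",
    plbornc == 10 ~ "Poland",
    TRUE          ~ "other"         # fallback when nothing above matched
  ))

toy$placeBorn
```

Because the `ukborn == -9` condition comes first, that row becomes `NA` rather than "UK"; swapping the first two conditions would change the result.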
Reproduce the following graphs as closely as you can. For each graph, write two sentences (not more!) describing its main message.
(Note the position of the code chunks; each is preceded by four spaces. This helps display numbered lists correctly in the Markdown file with the output.)
1. Histogram (20 points)
    ```{r}
    ```
2. Scatter plot (20 points). The red line shows a linear fit; the blue line shows a quadratic fit. Note the size and position of points.
    ```{r}
    ```
3. Faceted density chart (20 points).
    ```{r}
    ```
4. Ordered bar chart of summary statistics (20 points).
    ```{r}
    ```
5. Map (20 points). This is the most difficult problem in this set. You will need to use the NUTS Level 1 shape file (available here -- https://data.gov.uk/dataset/2aa6727d-c5f0-462a-a367-904c750bbb34/nuts-level-1-january-2018-full-clipped-boundaries-in-the-united-kingdom) and a number of packages for producing maps from shape files. You will need to google additional information; there are multiple webpages with the code that produces similar maps.
    ```{r}
    ```
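As a starting point for graphs 1 to 4, here is a minimal sketch of the ggplot2 building blocks involved, run on simulated data. The data frame `sim` and its columns (`age`, `pay`, `group`) are hypothetical stand-ins, not the survey variables, and the styling choices (bin count, point size, jitter) are assumptions rather than the settings used in the target images.

```r
library(ggplot2)

set.seed(1)
# Simulated data; column names are hypothetical stand-ins for the survey variables.
n <- 300
sim <- data.frame(
  age   = sample(20:65, n, replace = TRUE),
  group = sample(c("a", "b", "c"), n, replace = TRUE)
)
sim$pay <- 500 + 80 * sim$age - 0.9 * sim$age^2 + rnorm(n, sd = 200)

# Histogram of a single numeric variable.
p_hist <- ggplot(sim, aes(x = pay)) +
  geom_histogram(bins = 30)

# Scatter plot with a linear fit (red) and a quadratic fit (blue).
p_scatter <- ggplot(sim, aes(x = age, y = pay)) +
  geom_point(size = 0.5, position = "jitter") +
  geom_smooth(method = "lm", formula = y ~ x,
              colour = "red", se = FALSE) +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2),
              colour = "blue", se = FALSE)

# Density curves, one panel per group.
p_density <- ggplot(sim, aes(x = pay)) +
  geom_density() +
  facet_wrap(~ group)

# Bar chart of a summary statistic, with bars ordered by its value.
med <- aggregate(pay ~ group, data = sim, FUN = median)
p_bar <- ggplot(med, aes(x = reorder(group, pay), y = pay)) +
  geom_col() +
  coord_flip()
```

Printing any of the objects (for example `p_scatter`) draws the plot. `reorder(group, pay)` is what sorts the bars by the summary value instead of alphabetically.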