-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
110 lines (87 loc) · 3.7 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
---
title: "Commuting in LA"
subtitle: "An analysis of my time behind the wheel"
output: github_document
---
## Summary
Anyone living in LA will know that spending time behind the wheel is part of life in this part of the world. In this document I visualize some of the data I have collected over the last few months on my commuting times in the drive between Marina Del Rey and my office in Santa Monica.
To collect the data I just created a Google Form which I then saved a link to on my phone. I (try to) record my time every day I take the car (which is not every day) and the results are then automatically saved into a Google Sheet.
## Setup
### Load Packages
First we will load in the packages we need including the excellent `googlesheets` as well as our own `redbull` package for chart formatting.
```{r install redbull, echo=F, message=F, warning=FALSE, include=F}
devtools::install_github("deathbydata/redbull")
```
```{r load packages, message = FALSE}
library(googlesheets)
library(tidyverse)
library(magrittr)
library(redbull)
library(lubridate)
```
```{r authorise googlesheets, echo = FALSE, message = FALSE, eval=FALSE}
options(httr_oob_default = TRUE)
gs_auth()
gs_ls()
```
```{r pull google data helper functions}
source("R/get_commuting_data.R")
```
Connect to our Google Sheet and extract the data we need.
```{r set googlesheet key, echo=FALSE, message=F}
commuting_sheet_key <- Sys.getenv("COMMUTING_SHEET_KEY")
```
```{r extract data, message = FALSE}
comm_data <- get_commuting_data(commuting_sheet_key)
comm_data_clean <- clean_commuting_data(comm_data)
```
Ok so let's look at monthly commute time averages.
```{r plot commute by month, echo = F}
theme_set(theme_rb())
comm_data_clean %>%
mutate(Trip_Month = format(Trip_Date, "%Y-%m")) %>%
ggplot(aes(x = Trip_Month, y = Trip_Duration_Minutes, fill = Direction)) +
geom_boxplot() +
scale_fill_redbull("bloomberg") +
labs(title = "Coming home in September sucks!",
subtitle = "Average monthly commuting time between Marina Del Rey and Santa Monica",
x = "Month",
y = "Commute Time (minutes)")
```
And a more granular view...
```{r plot all commutes, message=FALSE, echo = F}
comm_data_clean %>%
ggplot(aes(x = Trip_Date, y = Trip_Duration_Minutes, colour = Direction)) +
geom_point() +
scale_colour_redbull("bloomberg") +
labs(title = "Looks like it's getting worse...",
subtitle = "Commuting time between Marina Del Rey and Santa Monica",
x = "Month",
y = "Commute Time (minutes)") +
geom_smooth(se = F, span = 0.75)
```
What is the impact of travelling on different weekdays?
```{r plot weekdays, message=FALSE, echo = F}
comm_data_clean %>%
ggplot(aes(x = Trip_Weekday, y = Trip_Duration_Minutes, fill = Direction)) +
geom_boxplot() +
scale_fill_redbull("bloomberg") +
labs(title = "I always thought Thursdays were worst!",
subtitle = "Average commuting time per weekday between Marina Del Rey and Santa Monica",
x = "Weekday",
y = "Commute Time (minutes)")
```
Finally - how much does the time I leave affect my commute?
```{r plot commute vs departure, message=FALSE, echo = F}
comm_data_clean %>%
filter((Trip_Time < hms("11:00:00") & Direction == "To Work") | (Trip_Time > hms("16:00:00") & Direction == 'From Work')) %>%
ggplot(aes(x = Trip_Time, y = Trip_Duration_Minutes, colour = Direction)) +
geom_point() +
geom_smooth(method = "lm", formula = y ~ poly(x, 2)) +
scale_color_redbull("bloomberg") +
labs(title = "Leave early or late - no excuses!",
subtitle = "Commuting time between Marina Del Rey and Santa Monica (quadratic fit)",
x = "Departure Time",
y = "Commute Time (minutes)") +
facet_grid(Trip_Weekday ~ Direction, scales = "free")
```