To identify home locations for users, we use four built-in recipes from
homelocator
package.
library(homelocator)
## Welcome to homelocator package!
df <- read_csv(here("analysis/data/derived_data/deidentified_sg_tweets.csv")) %>% #load de-identified dataset
mutate(created_at = with_tz(created_at, tzone = "Asia/Singapore")) # the tweets were sent in Singapore, so must convert the timezone to SGT, the default timezone is UTC!
Recipe ‘APDM’ (Rein Ahas et al., 2010) calculates both the average and standard deviation timestamps in each location for each user.
#must load neighbors when using APDM recipe
df_neighbors <- readRDS(here("analysis/data/derived_data/neighbors.rds"))
hm_apdm <- identify_location(df, user = "u_id", timestamp = "created_at", location = "grid_id",
tz = "Asia/Singapore", keep_score = F, recipe = "APDM")
Recipe ‘FREQ’ simply selects the most frequently ‘visited’ location as users’ home locations.
hm_freq <- identify_location(df, user = "u_id", timestamp = "created_at", location = "grid_id",
tz = "Asia/Singapore", show_n_loc = 1, recipe = "FREQ")
Recipe ‘HMLC’ weighs data points across multiple time frames to ‘score’ potentially meaningful locations.
hm_hmlc <- identify_location(df, user = "u_id", timestamp = "created_at", location = "grid_id",
tz = "Asia/Singapore", show_n_loc = 1, keep_score = F, recipe = "HMLC")
Recipe ‘OSNA’ (Efstathiades et al., 2015), only considers data points sent on weekdays and divides a day into three time frames - ‘rest time’, ‘leisure time’ and ‘active time’. The algorithm finds the most ‘popular’ location during ‘rest’ and ‘leisure’ time as the home locations for users.
hm_osna <- identify_location(df, user = "u_id", timestamp = "created_at", location = "grid_id",
tz = "Asia/Singapore", show_n_loc = 1, recipe = "OSNA")
The identified homes are in analysis/data/derived_data
.
write_csv(hm_apdm, path = here("analysis/data/derived_data/hm_apdm.csv"))
write_csv(hm_freq, path = here("analysis/data/derived_data/hm_freq.csv"))
write_csv(hm_hmlc, path = here("analysis/data/derived_data/hm_hmlc.csv"))
write_csv(hm_osna, path = here("analysis/data/derived_data/hm_osna.csv"))