Cleaner Yardline Strings (#459)
* Cleaner Yardline Strings

* Need to update test expectation
because we fix an empty string in the expectation
mrcaseb authored Feb 29, 2024
1 parent 202800e commit 72f8961
Showing 4 changed files with 28 additions and 3 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,7 +1,7 @@
Type: Package
Package: nflfastR
Title: Functions to Efficiently Access NFL Play by Play Data
- Version: 4.6.1.9003
+ Version: 4.6.1.9004
Authors@R:
c(person(given = "Sebastian",
family = "Carl",
3 changes: 3 additions & 0 deletions NEWS.md
@@ -4,6 +4,9 @@
- The function `calculate_player_stats_def` now returns `season_type` if argument `weekly` is set to `TRUE` for consistency with the other player stats functions. (#455)
- The function `missing_raw_pbp()` now allows filtering by season. (#457)
- More robust handling of player IDs in `decode_player_ids()`. (#458)
- Fixed rare cases where the value of the `yrdln` variable didn't equal `"MID 50"` at midfield. (#459)
- Fixed rare cases where `drive_start_yard_line` missed the blank space between team name and yard line number. (#459)
- Fixed play description in some 1999 and 2000 games where the string "D.Holland" replaced the kick distance. (#459)

# nflfastR 4.6.1

26 changes: 24 additions & 2 deletions R/helper_add_nflscrapr_mutations.R
@@ -23,7 +23,12 @@ add_nflscrapr_mutations <- function(pbp) {
(.data$play_description == "END GAME" & is.na(.data$time)), "00:00", .data$time),
time = dplyr::if_else(.data$play_description == 'GAME', "15:00", .data$time),
# Create a column with the time in seconds remaining for the quarter:
- quarter_seconds_remaining = time_to_seconds(.data$time)
+ quarter_seconds_remaining = time_to_seconds(.data$time),
+ play_description = dplyr::case_when(
+ stringr::str_detect(.data$play_description, "(?<=kicks )[:alpha:]{1,}.[:alpha:]{1,}(?= yards)") ~
+ stringr::str_replace(.data$play_description, "(?<=kicks )[:alpha:]{1,}.[:alpha:]{1,}(?= yards)", as.character(.data$kick_distance)),
+ TRUE ~ .data$play_description
+ )
) %>%
#put plays in the right order
dplyr::group_by(.data$game_id) %>%
@@ -196,7 +201,7 @@ add_nflscrapr_mutations <- function(pbp) {
.data$away_team, .data$home_team
),

- yardline = dplyr::if_else(.data$yardline == "50", "MID 50", .data$yardline),
+ yardline = dplyr::if_else(stringr::str_detect(.data$yardline, "50"), "MID 50", .data$yardline),
yardline = dplyr::if_else(
nchar(.data$yardline) == 0 | is.null(.data$yardline) | .data$yardline == "NULL" | is.na(.data$yardline),
dplyr::lead(.data$yardline), .data$yardline
@@ -426,6 +431,23 @@ add_nflscrapr_mutations <- function(pbp) {
0, .data$away_timeout_used
)
) %>%
+ # replace empty strings in yard line variables
+ dplyr::mutate_at(
+ .vars = c("yardline", "drive_start_yard_line" ,"drive_end_yard_line"),
+ .funs = ~ dplyr::na_if(.x, "")
+ ) %>%
+ # fix cases where a yardline variable misses the blank space between team name
+ # and yard number. At the point of adding this, the only spot where this happened
+ # was in the variable drive_start_yard_line in the games
+ # "2000_01_CAR_WAS", "2000_02_NE_NYJ", and "2000_03_ATL_CAR"
+ dplyr::mutate_at(
+ .vars = c("yardline", "drive_start_yard_line" ,"drive_end_yard_line"),
+ .funs = ~ dplyr::case_when(
+ stringr::str_detect(.x, "[:upper:]{2,3}(?=[:digit:]{1,2})") ~
+ stringr::str_c(stringr::str_extract(.x, "[:upper:]{2,3}"), stringr::str_extract(.x, "[:digit:]{1,2}"), sep = " "),
+ TRUE ~ .x
+ )
+ ) %>%
# Group by the game_half to then create cumulative timeouts used for both
# the home and away teams:
dplyr::group_by(.data$game_id, .data$game_half) %>%
Binary file modified tests/testthat/expected_pbp.rds
Binary file not shown.
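
The string clean-ups in this commit can be tried in isolation. Below is a minimal base R sketch of the same ideas, not the package code: the helper names `clean_yardline()` and `fix_kick_description()` are hypothetical, the input strings are invented for illustration, and the patterns are tightened relative to the diff (the yard line pattern is anchored to the whole string, and the period in the name pattern is escaped).

```r
# Illustrative base R sketch. Both helpers are hypothetical and are not
# part of nflfastR; the package itself uses dplyr and stringr.

# Normalise yard line strings such as "WAS20", "" or "50".
clean_yardline <- function(x) {
  # Empty strings carry no information, so treat them as missing.
  x[!is.na(x) & x == ""] <- NA_character_
  # Insert the missing blank between a 2-3 letter team abbreviation and
  # the yard number. Assumption: the whole string is abbreviation + number.
  x <- sub("^([A-Z]{2,3})([0-9]{1,2})$", "\\1 \\2", x)
  # Any value that mentions 50 is midfield.
  x[!is.na(x) & grepl("50", x, fixed = TRUE)] <- "MID 50"
  x
}

# Replace a name-like token between "kicks " and " yards" with the
# numeric kick distance of the same play.
fix_kick_description <- function(desc, kick_distance) {
  pattern <- "(?<=kicks )[[:alpha:]]+\\.[[:alpha:]]+(?= yards)"
  hits <- which(!is.na(desc) & grepl(pattern, desc, perl = TRUE))
  for (i in hits) {
    desc[i] <- sub(pattern, as.character(kick_distance[i]), desc[i], perl = TRUE)
  }
  desc
}

# Invented example inputs, not real play-by-play rows.
yardlines <- clean_yardline(c("WAS20", "NE 35", "", "50", NA))
descriptions <- fix_kick_description(
  c("K.Example kicks D.Holland yards from WAS 30 to CAR 5.",
    "K.Example kicks 65 yards from WAS 30 to CAR 5."),
  kick_distance = c(65, 65)
)
```

In this sketch, values that are already well formed (for example `"NE 35"` or a description with a numeric distance) pass through unchanged, which mirrors the `TRUE ~ .x` fallbacks in the diff.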
