Polishing str_sub() (#571)
* Better documentation for `start` and `end`. Fixes #547
* Add test for empty strings
* Check `value` length and add test
hadley authored Aug 20, 2024
1 parent 0b92cfb commit 9304301
Showing 5 changed files with 49 additions and 10 deletions.
1 change: 1 addition & 0 deletions NEWS.md
@@ -1,5 +1,6 @@
# stringr (development version)

* `str_sub<-` now gives a more informative error if `value` is not the correct length.
* Add `sep` argument to `str_dup()` so that it is possible to repeat a string and
add a separator between every repeated value (@edward-burn, #564).
* `str_*` now errors if `pattern` includes any `NA`s (@nash-delcamp-slp, #546).
19 changes: 13 additions & 6 deletions R/sub.R
@@ -6,18 +6,25 @@
#'
#' @inheritParams str_detect
#' @param start,end A pair of integer vectors defining the range of characters
-#' to extract (inclusive).
+#' to extract (inclusive). Positive values count from the left of the string,
+#' and negative values count from the right. In other words, if `string` is
+#' `"abcdef"` then 1 refers to `"a"` and -1 refers to `"f"`.
#'
#' Alternatively, instead of a pair of vectors, you can pass a matrix to
#' `start`. The matrix should have two columns, either labelled `start`
-#' and `end`, or `start` and `length`.
+#' and `end`, or `start` and `length`. This makes `str_sub()` work directly
+#' with the output from [str_locate()] and friends.
#'
#' @param omit_na Single logical value. If `TRUE`, missing values in any of the
#' arguments provided will result in an unchanged input.
-#' @param value replacement string
+#' @param value Replacement string.
#' @return
#' * `str_sub()`: A character vector the same length as `string`/`start`/`end`.
#' * `str_sub_all()`: A list the same length as `string`. Each element is
#' a character vector the same length as `start`/`end`.
+#'
+#' If `end` comes before `start` or `start` is outside the range of `string`
+#' then the corresponding output will be the empty string.
#' @seealso The underlying implementation in [stringi::stri_sub()]
#' @export
#' @examples
@@ -28,7 +35,7 @@
#' str_sub(hw, 8, 14)
#' str_sub(hw, 8)
#'
-#' # Negative indices index from end of string
+#' # Negative values index from end of string
#' str_sub(hw, -1)
#' str_sub(hw, -7)
#' str_sub(hw, end = -7)
@@ -67,8 +74,8 @@ str_sub <- function(string, start = 1L, end = -1L) {

#' @export
#' @rdname str_sub
-"str_sub<-" <- function(string, start = 1L, end = -1L, omit_na = FALSE, value) {
-  vctrs::vec_size_common(string = string, start = start, end = end)
+"str_sub<-" <- function(string, start = 1L, end = -1L, omit_na = FALSE, value) {
+  vctrs::vec_size_common(string = string, start = start, end = end, value = value)

  if (is.matrix(start)) {
    stri_sub(string, from = start, omit_na = omit_na) <- value
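
Reviewer note: the documentation added in this file describes three behaviours that are easy to check interactively. The snippet below is a minimal sketch, assuming stringr is installed; the input `"abcdef"` comes from the new docs, while the `"cd"` pattern and the variable names `x` and `loc` are made up for illustration.

```r
library(stringr)

x <- "abcdef"

# Positive positions count from the left, negative positions from the right.
first <- str_sub(x, 1, 1)    # "a"
last  <- str_sub(x, -1, -1)  # "f"
inner <- str_sub(x, 2, -2)   # "bcde"

# A two-column start/end matrix, such as the one returned by str_locate(),
# can be passed directly as `start`.
loc <- str_locate(x, "cd")
hit <- str_sub(x, loc)       # "cd"

# `end` before `start`, or `start` past the end of the string, gives "".
reversed <- str_sub("abc", 2, 1)  # ""
outside  <- str_sub("abc", 4, 5)  # ""
```

The last two calls mirror the expectations added to `tests/testthat/test-sub.R` in this commit.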
14 changes: 10 additions & 4 deletions man/str_sub.Rd

13 changes: 13 additions & 0 deletions tests/testthat/_snaps/sub.md
@@ -0,0 +1,13 @@
# bad vectorisation gives informative error

    Code
      str_sub(x, 1:2, 1:3)
    Condition
      Error in `str_sub()`:
      ! Can't recycle `string` (size 2) to match `end` (size 3).
    Code
      str_sub(x, 1:2, 1:2) <- 1:3
    Condition
      Error in `str_sub<-`:
      ! Can't recycle `string` (size 2) to match `value` (size 3).

12 changes: 12 additions & 0 deletions tests/testthat/test-sub.R
@@ -73,7 +73,11 @@ test_that("missing arguments give missing results", {

  expect_equal(str_sub("test", NA, NA), NA_character_)
  expect_equal(str_sub(c(NA, "test"), NA, NA), rep(NA_character_, 2))
})

test_that("negative length or out of range gives empty string", {
  expect_equal(str_sub("abc", 2, 1), "")
  expect_equal(str_sub("abc", 4, 5), "")
})

test_that("replacement works", {
@@ -101,3 +105,11 @@ test_that("replacement with NA works", {
  str_sub(x, 1, 1, omit_na = TRUE) <- NA
  expect_equal(x, "BBCDEF")
})

test_that("bad vectorisation gives informative error", {
  x <- "a"
  expect_snapshot(error = TRUE, {
    str_sub(x, 1:2, 1:3)
    str_sub(x, 1:2, 1:2) <- 1:3
  })
})
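
Reviewer note: the new `value = value` argument to `vctrs::vec_size_common()` means the replacement form checks the length of `value` before calling stringi. The sketch below shows this from the caller's side. It assumes a stringr build that includes this commit; on earlier releases the last call may behave differently, so no output is claimed for it. The fruit vectors and the names `a`, `b` and `res` are hypothetical.

```r
library(stringr)

# A length-1 `value` is recycled across every element of `string`.
a <- c("apple", "banana")
str_sub(a, 1, 1) <- "X"
a  # "Xpple" "Xanana"

# A `value` with the same length as `string` is applied element-wise.
b <- c("apple", "banana")
str_sub(b, -1, -1) <- c("1", "2")
b  # "appl1" "banan2"

# A `value` of incompatible length (3 vs 2) is expected to fail with a
# recycling error like the one recorded in tests/testthat/_snaps/sub.md.
res <- tryCatch(
  {
    str_sub(b, 1, 1) <- c("x", "y", "z")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
res
```

Capturing the message with `tryCatch()` keeps the script running either way, which makes it easy to compare the behaviour across stringr versions.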
