Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply vctrs principles to map() and modify() #894

Merged
merged 27 commits into from
Sep 17, 2022
Merged
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
9a82a56
Implement map_vec()
hadley Aug 27, 2022
194828f
Implementation feedback
hadley Aug 29, 2022
a717b6c
Merged origin/main into map-vec
hadley Aug 29, 2022
7733dc1
Add tests to clarify behaviour
hadley Aug 29, 2022
bdde69f
Merge commit '3b5add2db99a35ec1392ad23dc021b7ccadbbbbb'
hadley Sep 14, 2022
63119fd
Use new simplify tooling
hadley Sep 14, 2022
0aff22d
Remove unneeded import
hadley Sep 14, 2022
b852864
Merged origin/main into map-vec
hadley Sep 15, 2022
5671d9d
Update snapshot
hadley Sep 15, 2022
4d5b102
Update modify and modify_at
hadley Sep 15, 2022
ed04346
Add map2_vec() and pmap_vec()
hadley Sep 15, 2022
e093621
Move map_chr to correct position
hadley Sep 15, 2022
5167acc
Update modify2
hadley Sep 15, 2022
08fe6e6
And modify_if
hadley Sep 15, 2022
b2c09b3
Add news bullets
hadley Sep 15, 2022
903bdd3
Minimise map_vec() tests given implementation
hadley Sep 15, 2022
e203d7e
Update map2 tests
hadley Sep 15, 2022
3da5be6
Update pmap tests
hadley Sep 15, 2022
8f82675
At test for non-vector lists
hadley Sep 15, 2022
4983571
Test fallbacks
hadley Sep 15, 2022
961eeea
Tweak modify's handling of data frames
hadley Sep 16, 2022
2e7af8c
modify functions are no longer generics
hadley Sep 16, 2022
0b7ce2e
Polish modify tests
hadley Sep 16, 2022
ecae803
Tweak error message
hadley Sep 16, 2022
524e484
Add test for ptype
hadley Sep 16, 2022
9d45521
Tweak return type description some more
hadley Sep 16, 2022
87cd328
Merge commit '426acdd50424b8cd6029d237c4d4e81d94ec42a6'
hadley Sep 16, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
20 changes: 3 additions & 17 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -4,27 +4,10 @@ S3method(as_mapper,character)
S3method(as_mapper,default)
S3method(as_mapper,list)
S3method(as_mapper,numeric)
S3method(modify,character)
S3method(modify,default)
S3method(modify,double)
S3method(modify,integer)
S3method(modify,logical)
S3method(modify,pairlist)
S3method(modify2,character)
S3method(modify2,default)
S3method(modify2,double)
S3method(modify2,integer)
S3method(modify2,logical)
S3method(modify_at,character)
S3method(modify_at,default)
S3method(modify_at,double)
S3method(modify_at,integer)
S3method(modify_at,logical)
S3method(modify_if,character)
S3method(modify_if,default)
S3method(modify_if,double)
S3method(modify_if,integer)
S3method(modify_if,logical)
S3method(print,purrr_function_compose)
S3method(print,purrr_function_partial)
S3method(print,purrr_rate_backoff)
Expand Down Expand Up @@ -152,6 +135,7 @@ export(map2_dfr)
export(map2_int)
export(map2_lgl)
export(map2_raw)
export(map2_vec)
export(map_at)
export(map_chr)
export(map_dbl)
Expand All @@ -163,6 +147,7 @@ export(map_if)
export(map_int)
export(map_lgl)
export(map_raw)
export(map_vec)
export(modify)
export(modify2)
export(modify_at)
Expand All @@ -184,6 +169,7 @@ export(pmap_dfr)
export(pmap_int)
export(pmap_lgl)
export(pmap_raw)
export(pmap_vec)
export(possibly)
export(prepend)
export(pwalk)
Expand Down
10 changes: 10 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,10 @@
* `*_at()` can now take a function (or formula) that's passed the vector of
element names and returns the elements to select.

* New `map_vec()`, `map2_vec()`, and `pmap_vec()` work on all types of vectors,
extending `map_lgl()`, `map_int()`, and friends so that you can easily work
with dates, factors, date-times and more (#435).

* New `keep_at()` and `discard_at()` that work like `keep()` and `discard()`
but operation on element names rather than element contents (#817).

Expand Down Expand Up @@ -117,6 +121,10 @@
* `map2()` and `pmap()` now recycle names of their first input if
needed (#783).

* `modify()`, `modify_if()`, and `modify_at()` have been reimplemented using
vctrs principles. This shouldn't have an user facing impact, but it does
make the implementation much simpler.

### Plucking

* `vec_depth()` is now `pluck_depth()` and works with more types of input
Expand Down Expand Up @@ -155,6 +163,8 @@

## Minor improvements and bug fixes

* `modify()` no longer supports modifying calls or pairlists.

* `modify_depth()` is no longer a generic. This makes it more consistent
with `map_depth()`.

Expand Down
30 changes: 23 additions & 7 deletions R/map.R
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,9 @@
#' atomic vector of the indicated type (or die trying). For these functions,
#' `.f` must return a length-1 vector of the appropriate type.
#'
#' * `map_vec()` simplifies to the common type of the output. It works with
#' most types of simple vectors like Date, POSIXct, factors, etc.
#'
#' * `walk()` calls `.f` for its side-effect and returns
#' the input `.x`.
#'
Expand Down Expand Up @@ -40,9 +43,12 @@
#'
#' * No suffix: a list.
#'
#' * `_lgl`, `_int`, `_dbl`, `_chr` return a logical, integer, double,
#' * `_lgl()`, `_int()`, `_dbl()`, `_chr()` return a logical, integer, double,
#' or character vector respectively. It will be named if the input was named.
#'
#' * `_vec()` return an atomic or S3 vector, that is guaranteed to be
#' simpler than list.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is guaranteed to be simpler than list.

Well, yes, but also: 😬

> map_vec(list(list(1:3), list(5)), identity)
[[1]]
[1] 1 2 3

[[2]]
[1] 5

#'
#' * `walk()` returns the input `.x` (invisibly). This makes it easy to
#' use in a pipe.
#' @export
Expand Down Expand Up @@ -120,23 +126,33 @@ map_lgl <- function(.x, .f, ..., .progress = FALSE) {

#' @rdname map
#' @export
map_chr <- function(.x, .f, ..., .progress = FALSE) {
map_int <- function(.x, .f, ..., .progress = FALSE) {
.f <- as_mapper(.f, ...)
.Call(map_impl, environment(), ".x", ".f", "character", .progress)
.Call(map_impl, environment(), ".x", ".f", "integer", .progress)
}

#' @rdname map
#' @export
map_int <- function(.x, .f, ..., .progress = FALSE) {
map_dbl <- function(.x, .f, ..., .progress = FALSE) {
.f <- as_mapper(.f, ...)
.Call(map_impl, environment(), ".x", ".f", "integer", .progress)
.Call(map_impl, environment(), ".x", ".f", "double", .progress)
}

#' @rdname map
#' @export
map_dbl <- function(.x, .f, ..., .progress = FALSE) {
map_chr <- function(.x, .f, ..., .progress = FALSE) {
.f <- as_mapper(.f, ...)
.Call(map_impl, environment(), ".x", ".f", "double", .progress)
.Call(map_impl, environment(), ".x", ".f", "character", .progress)
}

#' @rdname map
#' @param .ptype If `NULL`, the default, the output type is the common type
#' of the elements of the result. Otherwise, supply a "prototype" giving
#' the desired type of output.
#' @export
map_vec <- function(.x, .f, ..., .ptype = NULL, .progress = FALSE) {
out <- map(.x, .f, ..., .progress = .progress)
simplify_impl(out, ptype = .ptype)
}

#' @rdname map
Expand Down
7 changes: 6 additions & 1 deletion R/map2.R
Original file line number Diff line number Diff line change
Expand Up @@ -64,11 +64,16 @@ map2_chr <- function(.x, .y, .f, ..., .progress = NULL) {
.Call(map2_impl, environment(), ".x", ".y", ".f", "character", .progress)
}

#' @rdname map2
#' @export
map2_vec <- function(.x, .y, .f, ..., .ptype = NULL, .progress = NULL) {
hadley marked this conversation as resolved.
Show resolved Hide resolved
out <- map2(.x, .y, .f, ..., .progress = .progress)
simplify_impl(out, ptype = .ptype)
}

#' @export
#' @rdname map2
walk2 <- function(.x, .y, .f, ...) {
map2(.x, .y, .f, ...)
invisible(.x)
}

176 changes: 50 additions & 126 deletions R/modify.R
Original file line number Diff line number Diff line change
Expand Up @@ -93,41 +93,18 @@ modify <- function(.x, .f, ...) {
modify.default <- function(.x, .f, ...) {
.f <- as_mapper(.f, ...)

for (i in seq_along(.x)) {
list_slice2(.x, i) <- .f(.x[[i]], ...)
if (vec_is_list(.x) || is.data.frame(.x)) {
out <- map(vec_proxy(.x), .f, ...)
vec_restore(out, .x)
hadley marked this conversation as resolved.
Show resolved Hide resolved
} else if (vec_is(.x)) {
map_vec(.x, .f, ..., .ptype = .x)
} else if (is.null(.x) || is.list(.x)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit strange to go through that path for .x = NULL. Maybe branch into return(NULL)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this originally, but it also felt weird — it is nice that NULL just falls out as a special case of a zero length list.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I also prefer the explicit is.null(x) branch

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to leave it with list because (a) it's not that import and (b) it needs the same recycling behaviour as a zero-length list in modify2().

.x[] <- map(.x, .f, ...)
.x
} else {
cli::cli_abort("Don't know how to modify {.obj_type_friendly {.x}}.")
Copy link
Member

@lionel- lionel- Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a "can't" form here?

! Can't modify a function.

Actually a "must" form works well I think:

! `.x` must be a vector, list, or data frame, not a function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think I should just ignore the fact it's a generic? It is unlikely that people have provided methods. Do you think I verify that there aren't methods on CRAN then remove the genericity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm yes I think that is a good idea.

Copy link
Member Author

@hadley hadley Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are five packages with S3method(modify, in their namespace (https://cs.github.com/?scopeName=All+repos&scope=&q=org%3Acran+path%3A%2F%5ENAMESPACE%2F+%2FS3method%5C%28modify%2C%2F)

  • heemod — own modify generic
  • timbr — purrr generic
  • purrr — duh
  • tfarima — own generic
  • yamlet — own generic

timbr's modify.forest looks considerably more complex than our implementations, so it's probably ok for it to be its own function: https://github.com/UchidaMizuki/timbr/blob/main/R/purrr.R. Issue at UchidaMizuki/timbr#1

}

.x
}
# TODO: Replace all the following methods with a generic strategy that
# implements sane coercion rules for base vectors
#' @export
modify.integer <- function (.x, .f, ...) {
.x[] <- map_int(.x, .f, ...)
.x
}
#' @export
modify.double <- function (.x, .f, ...) {
.x[] <- map_dbl(.x, .f, ...)
.x
}
#' @export
modify.character <- function (.x, .f, ...) {
.x[] <- map_chr(.x, .f, ...)
.x
}
#' @export
modify.logical <- function (.x, .f, ...) {
.x[] <- map_lgl(.x, .f, ...)
.x
}
#' @export
modify.pairlist <- function(.x, .f, ...) {
as.pairlist(map(.x, .f, ...))
}


# modify_if ---------------------------------------------------------------

#' @rdname modify
#' @inheritParams map_if
Expand All @@ -139,52 +116,16 @@ modify_if <- function(.x, .p, .f, ..., .else = NULL) {
#' @export
modify_if.default <- function(.x, .p, .f, ..., .else = NULL) {
where <- where_if(.x, .p)
index <- seq_along(.x)
.x <- modify_where(.x, where, .f, ...)

.f <- as_mapper(.f, ...)
for (i in index[where]) {
list_slice2(.x, i) <- .f(.x[[i]], ...)
}

if (!is_null(.else)) {
if (!is.null(.else)) {
.else <- as_mapper(.else, ...)
for (i in index[!where]) {
list_slice2(.x, i) <- .else(.x[[i]], ...)
}
}

.x
}
#' @export
modify_if.integer <- function(.x, .p, .f, ..., .else = NULL) {
modify_if_base(map_int, .x, .p, .true = .f, .false = .else, ...)
}
#' @export
modify_if.double <- function(.x, .p, .f, ..., .else = NULL) {
modify_if_base(map_dbl, .x, .p, .true = .f, .false = .else, ...)
}
#' @export
modify_if.character <- function(.x, .p, .f, ..., .else = NULL) {
modify_if_base(map_chr, .x, .p, .true = .f, .false = .else, ...)
}
#' @export
modify_if.logical <- function(.x, .p, .f, ..., .else = NULL) {
modify_if_base(map_lgl, .x, .p, .true = .f, .false = .else, ...)
}

modify_if_base <- function(.fmap, .x, .p, .true, .false = NULL, ..., .error_call = caller_env()) {
where <- where_if(.x, .p, .error_call = .error_call)
.x[where] <- .fmap(.x[where], .true, ...)

if (!is.null(.false)) {
.x[!where] <- .fmap(.x[!where], .false, ...)
.x <- modify_where(.x, !where, .else, ...)
}

.x
}

# modify_at ---------------------------------------------------------------

#' @rdname modify
#' @inheritParams map_at
#' @export
Expand All @@ -195,76 +136,59 @@ modify_at <- function(.x, .at, .f, ...) {
#' @export
modify_at.default <- function(.x, .at, .f, ...) {
where <- where_at(.x, .at)
modify_if(.x, where, .f, ...)
}
#' @export
modify_at.integer <- function(.x, .at, .f, ...) {
where <- where_at(.x, .at)
.x[where] <- map_int(.x[where], .f, ...)
.x
}
#' @export
modify_at.double <- function(.x, .at, .f, ...) {
where <- where_at(.x, .at)
.x[where] <- map_dbl(.x[where], .f, ...)
.x
}
#' @export
modify_at.character <- function(.x, .at, .f, ...) {
where <- where_at(.x, .at)
.x[where] <- map_chr(.x[where], .f, ...)
.x
}
#' @export
modify_at.logical <- function(.x, .at, .f, ...) {
where <- where_at(.x, .at)
.x[where] <- map_lgl(.x[where], .f, ...)
.x
modify_where(.x, where, .f, ...)
}

# modify2 -----------------------------------------------------------------

#' @rdname modify
#' @export
modify2 <- function(.x, .y, .f, ...) {
UseMethod("modify2")
}
#' @export
modify2.default <- function(.x, .y, .f, ...) {
modify2_base(map2, .x, .y, .f, ...)
}
# TODO: Improve genericity (see above)
#' @export
modify2.integer <- function(.x, .y, .f, ...) {
modify2_base(map2_int, .x, .y, .f, ...)
}
#' @export
modify2.double <- function(.x, .y, .f, ...) {
modify2_base(map2_dbl, .x, .y, .f, ...)
}
#' @export
modify2.character <- function(.x, .y, .f, ...) {
modify2_base(map2_chr, .x, .y, .f, ...)
}
#' @export
modify2.logical <- function(.x, .y, .f, ...) {
modify2_base(map2_lgl, .x, .y, .f, ...)
}

modify2_base <- function(mapper, .x, .y, .f, ...) {
.f <- as_mapper(.f, ...)
out <- mapper(.x, .y, .f, ...)

# if .x got recycled by map2
if (length(out) > length(.x)) {
.x <- .x[rep(1L, length(out))]
if (vec_is_list(.x) || is.data.frame(.x)) {
out <- map2(vec_proxy(.x), .y, .f, ...)
vec_restore(out, .x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here regarding df restoration.

} else if (vec_is(.x)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simultaneously technically correct and also makes me sad that we generate the duplicated column names

modify2(data.frame(x = 1), 1:3, \(x, y) x)
#>   x x x
#> 1 1 1 1

map2_vec(.x, .y, .f, ..., .ptype = .x)
} else if (is.null(.x) || is.list(.x)) {
out <- map2(.x, .y, .f, ...)
if (length(out) > length(.x)) {
.x <- .x[rep(1L, length(out))]
}
.x[] <- out
.x
} else {
cli::cli_abort("Don't know how to modify {.obj_type_friendly {.x}}.")
}
.x[] <- out
.x
}

#' @rdname modify
#' @export
imodify <- function(.x, .f, ...) {
modify2(.x, vec_index(.x), .f, ...)
}


# helpers -----------------------------------------------------------------

modify_where <- function(.x, .where, .f, ..., .error_call = caller_env()) {
if (vec_is_list(.x) || is.data.frame(.x)) {
out <- vec_proxy(.x)
out[.where] <- map(out[.where], .f, ...)
vec_restore(out, .x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

df-restoration

} else if (vec_is(.x)) {
.x[.where] <- map_vec(.x[.where], .f, ..., .ptype = .x)
.x
} else if (is.null(.x) || is.list(.x)) {
.x[.where] <- map(.x[.where], .f, ...)
.x
} else {
cli::cli_abort(
"Don't know how to modify {.obj_type_friendly {.x}}.",
call = .error_call
)
}
}
Loading