Feature request: ability to keep/discard list elements by name #817
stray observation: given the correspondence between lists and data-frames, could tidyselect be useful here? |
@ijlyttle You can experiment with this unexported function which implements tidyselect over all vector inputs:
list(a = 1, b = 2, aa = 3) %>%
  tidyselect:::select(starts_with("a"))
#> $a
#> [1] 1
#>
#> $aa
#> [1] 3
c(a = 1, b = 2, aa = 3) %>%
tidyselect:::select(starts_with("a"))
#> a aa
#> 1 3 |
This is probably more a |
Honestly if tidyselect:::select could become an exported function (and perhaps renamed to avoid confusion with dplyr::select) I think that would do exactly what I was looking for, right? |
Right. This is a big design decision though. In the meantime you can add it to your set of helper functions if you'd like to use it right away:
vec_select <- function(.x, ..., .strict = TRUE) {
  pos <- tidyselect::eval_select(quote(c(...)), .x, strict = .strict)
  rlang::set_names(.x[pos], names(pos))
}
It might be slow with long vectors. Feel free to post any feedback in an issue on the tidyselect repo. |
And in more complex cases, when the predicate function needs to operate both on the name and the value at the same time? |
I strongly agree with deeenes here. IMHO, one could expect a rather homogenous design throughout the tidyverse, and purrr::keep(example, c("a", starts_with("c"), where(~str_detect(.x, "\\d+")))) It would be pretty awesome if |
For those looking for a simple, pipeable solution: the functions below only cover simple cases, but they can be modified to cover more general ones. The easiest approach is to use indexing rather than assignment. The function naming can be improved (not my strongest point :) ), but I think the general gist of how to create these functions is there.
#' @param l a named list
#' @param kn a character vector containing the names to keep
keep_names <- function(l, kn) {
  l[names(l) %in% kn]
}
x <- list(a = 1, b = 2, c = 3)
keep_names(x, "a")
# $a
# [1] 1
keep_names(x, c("a", "b"))
# $a
# [1] 1
#
# $b
# [1] 2
#' @param l a named list
#' @param fn a function that receives the vector of names. Must be vectorized and return a logical (TRUE/FALSE) vector.
keep_names_func <- function(l, fn) {
  l[fn(names(l))]
}
x <- list(ka = 1, kb = 2, c = 3)
keep_names_func(x, function(n){startsWith(n, "k")})
# $ka
# [1] 1
#
# $kb
# [1] 2
# Or even one with both names and value
#' @param l a named list
#' @param fn a function that receives the vector of names and the list of values. Must be vectorized and return a logical (TRUE/FALSE) vector.
keep_names_func_both <- function(l, fn) {
  l[fn(names(l), l)]
}
x <- list(ka = 1, kb = 2, c = 3)
keep_names_func_both(x, function(n, v){startsWith(n, "k") & v>=2})
# $kb
# [1] 2
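The helpers above require `fn` to be vectorized. When a predicate only handles one name/value pair at a time, it can be applied element by element instead. A minimal base R sketch, assuming a fully named list; `keep_names_each` is a hypothetical name that does not appear elsewhere in this thread:

```r
# Hypothetical helper: apply a non-vectorized predicate to each
# (name, value) pair and keep the elements where it returns TRUE.
keep_names_each <- function(l, fn) {
  nms <- names(l)
  keep <- vapply(
    seq_along(l),
    function(i) isTRUE(fn(nms[[i]], l[[i]])),
    logical(1)
  )
  l[keep]
}

x <- list(ka = 1, kb = 2, c = 3)
# `&&` only works on scalars, so this predicate is not vectorized
res <- keep_names_each(x, function(n, v) startsWith(n, "k") && v >= 2)
names(res)
```

The trade-off is one predicate call per element, which may matter on long lists.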
|
I came across purrr's Two problems, though:
I figured I'd put it in front of the group, using @jnolis' example:
library("purrr")
example <- as.list(1:5)
names(example) <- list("a", "b", "c", "rstudioconf_2022", "cat")
# this seems like it should work, but it doesn't
modify_at(example, rlang::quos(any_of(letters)), ~NULL)
#> $b
#> [1] 2
#>
#> $rstudioconf_2022
#> [1] 4
# same thing - the sets should be complementary, but they aren't
modify_at(example, rlang::quos(!any_of(letters)), ~NULL)
#> $a
#> [1] 1
#>
#> $b
#> [1] 2
#>
#> $c
#> [1] 3
#>
#> $cat
#> [1] 5
Created on 2022-01-05 by the reprex package (v2.0.1)
Of course, I could be doing something wrong™️. |
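For comparison, the two complementary sets the reprex appears to be after can be written with plain indexing. This is only a base R sketch of the expected outcome, not an explanation of the `modify_at()` behaviour shown above:

```r
example <- as.list(1:5)
names(example) <- c("a", "b", "c", "rstudioconf_2022", "cat")

# Drop the elements whose name is a single lowercase letter
dropped <- example[!names(example) %in% letters]
# Keep only those elements (the complementary set)
kept <- example[names(example) %in% letters]

names(dropped)
names(kept)
```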
I think these are interesting ideas but I don't quite see how they fit into purrr. A straightforward implementation of
discard_names <- function(.x, .p, ...) {
  sel <- .p(names(.x))
  .x[!is.na(sel) & !sel]
}
keep_names <- function(.x, .p, ...) {
  sel <- .p(names(.x))
  .x[!is.na(sel) & sel]
}
And we're currently moving away from tidyselect usage in purrr, because NSE just doesn't feel very "purrr-like". But maybe we could make something a bit more flexible?
keep_names <- function(.x, .names, ...) {
  if (is.character(.names)) {
    idx <- intersect(names(.x), .names)
  } else if (is.function(.names) || rlang::is_formula(.names)) {
    # Convert formulas/lambdas to a function, then apply it to the names
    .names <- rlang::as_function(.names)
    idx <- .names(names(.x))
    if (is.logical(idx)) {
      idx[is.na(idx)] <- FALSE
    } else if (is.character(idx)) {
      idx <- intersect(names(.x), idx)
    } else if (!is.integer(idx)) {
      rlang::abort("If `.names` is a function, it must return a logical, integer, or character vector")
    }
  }
  .x[idx]
} Then you could write |
That seems like a reasonable compromise to me! I'd also have the negation for |
Just realised that these should probably be |
That makes sense to me! a |
Some progress:
keep_at <- function(.x, .names, ...) {
  if (!rlang::is_named(.x)) {
    cli::cli_abort("{.arg .x} must be named")
  }
  x_names <- names(.x)
  if (is.character(.names)) {
    idx <- intersect(.names, x_names)
  } else if (is.function(.names) || rlang::is_formula(.names)) {
    # Convert formulas/lambdas to a function, then apply it to the names
    .names <- rlang::as_function(.names)
    idx <- .names(x_names, ...)
    if (is.logical(idx)) {
      if (length(idx) != length(x_names)) {
        cli::cli_abort("Result of {.fn .names} must be length {length(x_names)}, not {length(idx)}.")
      }
      idx[is.na(idx)] <- FALSE
    } else if (is.character(idx)) {
      idx <- intersect(names(.x), idx)
    } else {
      cli::cli_abort("If {.arg .names} is a function, it must return a logical or character vector, not {.obj_type_friendly {idx}}.")
    }
  } else {
    names <- .names
    cli::cli_abort("{.arg .names} must be a function or a character vector, not {.obj_type_friendly {names}}.")
  }
  .x[idx]
} @jnolis to be clear, |
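The negation mentioned earlier in the thread could share the same name-resolution step. A base R sketch under that assumption; `resolve_at()`, `keep_at_sketch()` and `discard_at_sketch()` are hypothetical names, and this is not purrr's implementation:

```r
# Hypothetical helper: turn `.names` (a character vector, or a function
# of the names) into a logical mask over the elements of `.x`.
resolve_at <- function(.x, .names, ...) {
  x_names <- names(.x)
  if (is.null(x_names)) stop("`.x` must be named")
  if (is.function(.names)) .names <- .names(x_names, ...)
  if (is.character(.names)) {
    x_names %in% .names
  } else if (is.logical(.names) && length(.names) == length(x_names)) {
    !is.na(.names) & .names  # treat NA as "not selected"
  } else {
    stop("`.names` must resolve to a character vector or a logical vector matching `.x`")
  }
}

keep_at_sketch <- function(.x, .names, ...) .x[resolve_at(.x, .names, ...)]
discard_at_sketch <- function(.x, .names, ...) .x[!resolve_at(.x, .names, ...)]

x <- list(a = 1, b = 2, aa = 3)
kept <- keep_at_sketch(x, c("a", "zzz"))
dropped <- discard_at_sketch(x, function(n) startsWith(n, "a"))
names(kept)
names(dropped)
```

Because both functions go through one logical mask, keep and discard are complementary by construction. Unlike the `intersect(.names, x_names)` call in the draft above, the mask always preserves the order of `.x` rather than the order of `.names`.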
From the discussion in this twitter thread it seems there is a need to remove elements from lists by name. The current "best" solution is to assign an element NULL using base R commands, which does not have an elegant tidy piping implementation. Since this is a fairly common task, it would be helpful to create a purrr function that can easily be put within a sequence of piped purrr calls:
One approach I have been thinking of after writing that last tweet would be to create purrr::keep_names() and purrr::discard_names(). The point of these functions would be to closely mimic the existing purrr::keep() and purrr::discard(), but to have the functions be applied to the names of the list rather than the values. They could also accept a vector of names as the input rather than a function, for the common case when you just want to keep/remove specific elements. So something like this:
And then in the case of the Twitter thread, Elaine could have just added discard_names("b") into her code.
Things I like about adding these functions:
x <- setnames(x, x) a lot at the start of my purrr piping sequences and that could have a convenience function.
Things I dislike about adding these functions:
discard_names example above, the anonymous function would have been called twice, but since %in% is vectorized here a single function call across the whole vector of names would have been fine to get the boolean results of the function. On a list thousands of elements long this could be a problem.
map_names()) is so big I do fall into some "slippery slope" fears of this going too far.
These functions seem simple enough that I would think I could personally make a PR to add them. I would love some feedback on whether other people would want them included or if they should be changed somehow. Thank you!!
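The concern about repeated predicate calls can be made concrete by counting invocations. A base R sketch with a hypothetical counter (`calls`) and predicate (`in_ab`); it assumes the predicate is vectorized, as `%in%` is:

```r
x <- as.list(seq_along(letters))
names(x) <- letters

calls <- 0
in_ab <- function(n) {
  calls <<- calls + 1  # count how often the predicate runs
  n %in% c("a", "b")
}

# Element-wise: one predicate call per name
per_name <- vapply(names(x), in_ab, logical(1))
calls_per_name <- calls

# Vectorized: a single call over the whole vector of names
calls <- 0
all_at_once <- in_ab(names(x))
calls_vectorized <- calls

c(calls_per_name, calls_vectorized)
names(x[!all_at_once])[1:3]
```

Both approaches select the same elements; only the number of predicate calls differs, which is where the cost on long lists would come from.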