Use vctrs for flattening into atomic vectors #785
It might be worth revisiting this if
Hmm, maybe not faster in some odd cases. This was from that example:
library(vctrs)
index_flatten <- function(x) {
  unlist(x, recursive = FALSE, use.names = FALSE)
}
x <- rep_len(list(1L), 1e6)
bench::mark(
  vec_unchop(x, ptype = integer()),
  index_flatten(x),
  iterations = 10
)
#> # A tibble: 2 x 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 vec_unchop(x, ptype = integer())  138.3ms 149.3ms      6.62   15.27MB    15.5
#> 2 index_flatten(x)                   10.9ms  11.5ms     74.1     3.81MB     8.23
Created on 2020-08-07 by the reprex package (v0.3.0) |
Oh, so the crucial test case is long lists rather than heavy lists. |
Which makes total sense, since the likely bottleneck is the common type determination? Though in this case we pass a prototype, so that can't be it. Maybe it's the casting (which should essentially be a no-op here). |
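To make the "long vs heavy" distinction concrete, here is a minimal base R sketch. The names `long_list`, `heavy_list` and `flatten_base` are made up for illustration, and the sizes are arbitrary. Both lists hold the same one million integers, so any timing difference between them comes from the number of list elements rather than the amount of data.

```r
# Two lists holding the same one million integers in total.
n <- 1e6
long_list  <- rep_len(list(1L), n)                 # 1e6 elements of length 1
heavy_list <- rep_len(list(rep(1L, n / 10)), 10)   # 10 elements of length 1e5

# Same base R flattening as index_flatten() in the reprex above.
flatten_base <- function(x) {
  unlist(x, recursive = FALSE, use.names = FALSE)
}

long_flat  <- flatten_base(long_list)
heavy_flat <- flatten_base(heavy_list)

# Both shapes flatten to the same integer vector.
identical(long_flat, heavy_flat)

# Timings depend on the machine, so no expected output is shown here.
system.time(flatten_base(long_list))
system.time(flatten_base(heavy_list))
```

Swapping `flatten_base()` for the vctrs-based flattening in the same script would presumably show whether the per-element overhead is what dominates in the long-list case.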
I added a
# With CRAN purrr
flatten_int(list(x = c(foo = 1L, bar = 2L), baz = 3L))
#> foo bar
#>   1   2   3
# With this PR
flatten_int(list(x = c(foo = 1L, bar = 2L), baz = 3L))
#> foo bar baz
#>   1   2   3 |
I now wonder if
|
Actually, I fixed it, but there is currently no way to zap names from a name-spec, so this is a behaviour change:
# CRAN
names(purrr::flatten_int(list(x = c(1L, 2L), 3L)))
#> NULL
# This PR
names(flatten_int(list(x = c(1L, 2L), 3L)))
#> [1] "" "" ""
Feature request at r-lib/vctrs#1215. |
We should consider renaming to |
Superseded by #912. |
I added a bit of theory in ?flatten to explain the current flattening semantics.
The new implementation supports vctrs coercions:
The historical numeric-to-character coercion is still supported (when vctrs coercion fails) but deprecated:
The historical behaviour with data frames is preserved:
But I wonder if this should be deprecated? In a way this is consistent with how map() supports data frames though.
Improves performance. Interestingly, vctrs has gotten faster than unlist()?
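As a rough illustration of what "vctrs coercions" means here, the sketch below uses `vctrs::vec_c()` directly rather than the flatten functions from this PR. It assumes flattening follows the same common-type rules as `vec_c()`, which may not match the actual implementation. It only shows the difference between vctrs coercion and base R coercion, and why a numeric-to-character fallback has to be handled separately.

```r
library(vctrs)

# Integer and double have a common type (double), so combining them works.
mixed_numeric <- vec_c(1L, 2.5)
typeof(mixed_numeric)

# Double and character have no common type in vctrs, so this fails
# instead of silently converting 1 to "1".
strict_result <- tryCatch(
  vec_c(1, "a"),
  error = function(cnd) "incompatible types"
)
strict_result

# Base R, by contrast, coerces everything to character.
c(1, "a")
```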