vec_get() #141
This would be useful in |
Perhaps related to tidyverse/dplyr#3789 |
To get this included in vctrs, you need to answer the question: What are the invariants that this function would satisfy? |
A word I would like to insert into this discussion is "atom". As in, atomic vectors have atoms that are of the same type. I think maybe |
I've been staring at this for some time now, and yes this comes from iterating on list columns (from purrr, or my toy zap). Not sure about this, here's a draft:
|
There was a bit of a misunderstanding between @romainfrancois and me regarding list column behavior. I've marked out the relevant info above. At this point, I think the implementation is just:

vec_rip <- function(x, i) {
  stopifnot(length(i) == 1L)
  out <- vec_slice(x, i)
  if (rlang::is_bare_list(x)) {
    out <- out[[1]]
  }
  out
}

I think this all makes sense in the context of atoms if you think of them as "a basic building block." Notice that for "a list of elements", the building block is a single element and not a list of length 1 (which aligns with what
I'm a little unsure of what the "atom" of a matrix should be. 1 row would make sense, but every element in the matrix is of the same type so 1 single element might also make sense. Same for arrays. The only other caveat is that a single |
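To make the list versus atomic distinction concrete, here is a rough sketch of how the `vec_rip()` draft above might behave. So that it runs without vctrs or rlang, `slice1()` is a hypothetical base-R stand-in for `vec_slice()` and `is_bare_list_base()` is a hypothetical stand-in for `rlang::is_bare_list()`; neither exists in any package, and the real functions may differ in edge cases.

```r
# Hypothetical stand-in for vctrs::vec_slice(): slice rows of a data frame,
# elements of anything else.
slice1 <- function(x, i) {
  if (is.data.frame(x)) x[i, , drop = FALSE] else x[i]
}

# Hypothetical stand-in for rlang::is_bare_list(): a list with no class.
is_bare_list_base <- function(x) is.list(x) && !is.object(x)

vec_rip_sketch <- function(x, i) {
  stopifnot(length(i) == 1L)
  out <- slice1(x, i)
  if (is_bare_list_base(x)) {
    out <- out[[1]]
  }
  out
}

vec_rip_sketch(list(1:3, "a"), 1)                         # the element itself
vec_rip_sketch(c(x = 1, y = 2), 2)                        # a named double of length 1
vec_rip_sketch(data.frame(a = 1:2, b = c("u", "v")), 2)   # a one-row data frame
```

Under these assumptions only the bare list is unwrapped; atomic vectors and data frames come back exactly as the slice returned them, which is the behavior the matrix question above is probing.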
Could be vec_atom(). The way I think about this is that this function should return something sensible for when you iterate on "observations" of the vctr type. |
Not quite because extracting an observation from an atomic vector should zap the names and all vector-related attributes. I think instead of atom, we should think about observations. Then this function would better be called |
Hmm... Is that true? What would an observation from a factor be in that case? A level?

factor(c("a", "b"))[[1]]
#> [1] a
#> Levels: a b
# Does that make sense?
vec_obs(factor(c("a", "b")), 1)
#> [1] "a" |
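For reference, this is how base R itself treats `[` versus `[[` on a named atomic vector and on a factor. It only shows existing base behavior and takes no position on what the proposed function should do.

```r
x <- c(a = 1, b = 2)
x[1]      # keeps the name: a named double of length 1
x[[1]]    # drops the name: a bare 1

f <- factor(c("a", "b"))
f[[1]]                  # still a factor, carrying both levels
levels(f[[1]])
as.character(f[[1]])    # one way to reach the bare value
```

So base R zaps names under `[[` but keeps the factor's class and levels, which is the inconsistency being questioned here.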
Taking an observation from a data frame is a type-preserving operation. Taking an observation from a list or list_of is not? This problem is harder than I thought. Maybe, somehow, it is because data frames are atomic (scalars) when taken as a whole? There is a slight connection here to why @jennybc's intuition that the notion of atomicity is important was spot on. Still, it's not clear what parts of the type should be dropped or preserved when taking an observation of an atomic object. Should we trust the designers of S that vector names are dropped under |
If a data frame can be seen as an atomic vector, when subsetting observations, this would imply that there should be a notion of missing row. Is a missing row a row full of missings? Do we need a predicate
Interestingly:

mtcars[1, ][NA, ]
#>    mpg cyl disp hp drat wt qsec vs am gear carb
#> NA  NA  NA   NA NA   NA NA   NA NA NA   NA   NA |
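If a missing row is taken to mean "a row full of missings", a predicate for it could be sketched as below. The name `is_missing_row()` is hypothetical and the definition is only one possible reading of the question above; it assumes at least one column and no list columns.

```r
# Hypothetical predicate: TRUE for each row in which every column is NA.
is_missing_row <- function(df) {
  stopifnot(is.data.frame(df), ncol(df) > 0L)
  Reduce(`&`, lapply(df, is.na))
}

df <- data.frame(x = c(1, NA, NA), y = c("a", NA, "c"))
is_missing_row(df)                   # only the second row is entirely missing
is_missing_row(df[NA_integer_, ])    # subsetting with NA yields one missing row
```

Under this definition the row produced by subsetting with `NA` counts as missing, while a row with a single `NA` does not.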
(@lionel- that's just |
Also worth thinking about this case:

x <- package_version("1.1.1")
identical(x, x[[1]])
#> [1] TRUE

Created on 2019-01-28 by the reprex package (v0.2.1.9000) |
The inverse operation seems interesting too. Would it be too bad if

qr <- qr(lm(y ~ x, data.frame(x = 1, y = 1)))
class(qr)
#> [1] "qr"
typeof(qr)
#> [1] "list"
vctrs::vec_c(qr)
#> $qr
#> (Intercept) x
#> 1 1 1
#> attr(,"assign")
#> [1] 0 1
#>
#> $qraux
#> [1] 1 1
#>
#> $pivot
#> [1] 1 2
#>
#> $tol
#> [1] 1e-07
#>
#> $rank
#> [1] 1

Created on 2019-01-28 by the reprex package (v0.2.1.9000) |
Another thought: maybe we need to more explicitly talk about "container" types, and then |
I think this operation should error if you're attempting to index into a vector that is not recursive, e.g.:

vec_recursive <- function(x) UseMethod("vec_recursive")
vec_recursive.list_of <- function(x) TRUE
vec_recursive.data.frame <- function(x) TRUE
vec_recursive.default <- function(x) {
  if (is_bare_list(x)) {
    TRUE
  } else if (is_vector(x)) {
    FALSE
  } else {
    stop("Non-vector")
  }
}
vec_extract <- function(x, i) {
  stopifnot(length(i) == 1L)
  stopifnot(vec_recursive(x))
  vec_slice(x, i)[[1]]
}

This clearly distinguishes it from
Note that |
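A small base-R sketch of how the draft above would behave on a few inputs. The names `slice1()`, `is_recursive_sketch()` and `extract_sketch()` are hypothetical stand-ins for `vec_slice()`, `vec_recursive()` and `vec_extract()`, with the rlang predicates approximated by base checks, so details may differ from the real functions.

```r
# Hypothetical stand-in for vctrs::vec_slice().
slice1 <- function(x, i) if (is.data.frame(x)) x[i, , drop = FALSE] else x[i]

is_recursive_sketch <- function(x) UseMethod("is_recursive_sketch")
is_recursive_sketch.data.frame <- function(x) TRUE
is_recursive_sketch.default <- function(x) {
  if (is.list(x) && !is.object(x)) {
    TRUE                       # bare list
  } else if (is.atomic(x) || is.list(x)) {
    FALSE                      # any other vector
  } else {
    stop("Non-vector")
  }
}

extract_sketch <- function(x, i) {
  stopifnot(length(i) == 1L)
  stopifnot(is_recursive_sketch(x))
  slice1(x, i)[[1]]
}

extract_sketch(list(1:3, "a"), 2)                         # the second element
extract_sketch(data.frame(a = 1:2, b = c("u", "v")), 2)   # first column of row 2
res <- tryCatch(extract_sketch(1:3, 1), error = function(e) "error")
res                                                       # atomic input is rejected
```

One thing the sketch makes visible: for a data frame, `[[1]]` applied to the one-row slice picks out the first column rather than the row, which may or may not be the intended result.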
I think
If We might need a more general
|
Why does that distinction matter for indexing? |
I don't know if it matters for extraction (though |
I'm running into the case where it would be nice to have the assignment form of this,
library(vctrs)
out <- vec_init(list(), 3)
elt <- 2:3
# can't use vec_assign, of course
vec_assign(out, 1, elt)
#> No common type for `value` <integer> and `x` <list>.
# this is what i want
out[[1]] <- elt
out
#> [[1]]
#> [1] 2 3
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> NULL
# but i also might "init" a dbl
out <- vec_init(double(), 3)
elt <- 2
# can use vec_assign
vec_assign(out, 1, 2)
#> [1] 2 NA NA
# and i can use this
out[[1]] <- 2
out
#> [1] 2 NA NA

(Update) For data frames I think this needs to be able to replace a column, not a row. |
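One way the assignment form could look, as a base-R sketch only. `set1_sketch()` is a hypothetical name, `vector("list", 3)` and `rep(NA_real_, 3)` stand in for the `vec_init()` calls above, and the data frame branch follows the update about replacing a column.

```r
set1_sketch <- function(x, i, value) {
  stopifnot(length(i) == 1L)
  if (is.data.frame(x)) {
    x[[i]] <- value            # replaces column i, not row i
  } else if (is.list(x)) {
    x[i] <- list(value)        # wrapping keeps a NULL value from deleting the slot
  } else {
    stopifnot(length(value) == 1L)
    x[i] <- value
  }
  x
}

out_list <- set1_sketch(vector("list", 3), 1, 2:3)
out_dbl  <- set1_sketch(rep(NA_real_, 3), 1, 2)
out_df   <- set1_sketch(data.frame(a = 1:2, b = 3:4), "b", c(9L, 8L))
```

The list branch uses `x[i] <- list(value)` rather than `x[[i]] <- value` because assigning `NULL` with `[[<-` removes the element from a list.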
Looking into |
It'll probably be called
@DavisVaughan suggested it would return rows as lists in the case of a data frame, or as a vector of dimensionality n - 1 in the case of an array (for instance, a vector without dimensions in the case of a matrix). In that case, rowwise extraction in data frames becomes a recursive operation in the sense that you can dig deep into them by calling recursively
Despite being recursive, this operation doesn't really connect to algorithms ordinarily used with recursive data structures such as |
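The semantics described above (rows as lists for data frames, n - 1 dimensions for arrays) can be sketched in base R as follows. `obs_sketch()` is a hypothetical name, dimnames are not preserved, and data frames with matrix or data frame columns are not handled.

```r
obs_sketch <- function(x, i) {
  stopifnot(length(i) == 1L)
  d <- dim(x)
  if (is.data.frame(x)) {
    # One row, returned as a named list with one element per column.
    lapply(x, function(col) col[[i]])
  } else if (length(d) > 1L) {
    # Slice along the first dimension and drop it: n dims become n - 1.
    out <- x[slice.index(x, 1) == i]
    if (length(d) > 2L) dim(out) <- d[-1]
    out
  } else {
    x[[i]]
  }
}

df <- data.frame(a = 1:2, b = c("u", "v"))
m  <- matrix(1:6, nrow = 2)
a  <- array(1:24, dim = c(2, 3, 4))

obs_sketch(df, 2)                  # a named list holding row 2
obs_sketch(m, 2)                   # a plain vector, no dim attribute
dim(obs_sketch(a, 1))              # a 3 x 4 matrix
obs_sketch(obs_sketch(a, 1), 2)    # calling it again digs one level deeper
```

Repeated calls peel off one dimension at a time, which is the sense in which the operation is recursive.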
@romainfrancois, so |
Probably it's the whole |
This is what I was getting at way, way above:
|
What @romainfrancois does here has essentially the same implementation of what I used when attempting to reimplement The Essentially you (This is different than what Lionel mentioned above about the other semantics I proposed) |
I think that it would also be nice to have the equivalent of So:
It would be useful for |
As a complement to vec_slice(), @romainfrancois and I have been discussing the utility of a function that would extract 1 observation, but would be more of an analog to [[ than [.

Possible implementation:

It would function somewhat like:

We are currently undecided on what it "should" do for data frames and matrices. A couple ideas:

1) Return the 1 row observation as is (so the same as vec_slice(), this is shown above). This doesn't feel right.
2) Extract the 1 row observation, then coerce it to some lower level type. For data.frames, a list and for matrices, a vector.

If 2) is chosen, one question that came up is "what should it do for list columns"? Two possibilities: