fill for DT[i, :=] #2676

Closed
mattdowle opened this issue Mar 15, 2018 · 5 comments

@mattdowle
Member

mattdowle commented Mar 15, 2018

> DT = data.table(A=LETTERS[1:5])
> DT[ A %in% c("C","E"), val:=42][]
        A   val
   <char> <num>
1:      A    NA
2:      B    NA
3:      C    42
4:      D    NA
5:      E    42

> DT[ A %in% c("C","E"), val:=42, fill=0][]       # current behaviour
Error in `[.data.table`(DT, A %in% c("C", "E"), `:=`(val, 42), fill = 0) : 
  unused argument (fill = 0)

> DT[ A %in% c("C","E"), val:=42, fill=0][]       # desired behaviour
        A   val
   <char> <num>
1:      A     0
2:      B     0
3:      C    42
4:      D     0
5:      E    42
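
For reference, a minimal sketch of two ways to get the desired output with existing syntax, which is what a `fill=` argument would shorten. Both are workarounds, not the proposed feature; the second assumes a data.table version that ships `setnafill()` (1.12.4 or later, as far as I know), and `DT2` is just an illustrative name.

```r
library(data.table)

# Workaround 1: assign the default to every row, then overwrite the subset.
DT = data.table(A = LETTERS[1:5])
DT[, val := 0][A %in% c("C", "E"), val := 42]

# Workaround 2: assign on the subset first, then replace the resulting NAs.
# setnafill() updates by reference and only handles numeric columns.
DT2 = data.table(A = LETTERS[1:5])
DT2[A %in% c("C", "E"), val := 42]
setnafill(DT2, type = "const", fill = 0, cols = "val")
```

Workaround 1 touches every row twice, which is the cost a built-in `fill=` could avoid.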

@franknarf1
Contributor

franknarf1 commented Mar 15, 2018

I guess you will want to generalize to accepting a named list, similar to what dplyr/tidyr has in its many functions' fill= args (none of their names comes to mind...).

Example from SO with i and multiple columns in :=: https://stackoverflow.com/a/49283432/

I'm thinking...

library(data.table)
DT = data.table(id = 1:3)
mDT = data.table(id = 1L, v = 2, x = 3)
defaults = list(v = 0, x = 0)

# current syntax
DT[, names(defaults) := defaults ]
DT[mDT, on=.(id), `:=`(v = i.v, x = i.x)]

# desired
DT[mDT, on=.(id), `:=`(v = i.v, x = i.x), fill = defaults]

If you do go that way, there's the similar case of shift as well: shift(data.table(a = 1:2, b = 3:4), fill = list(a = 11, b = 12)).
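
A rough sketch of how the per-column `shift` fill can be emulated today, assuming one fill value per column. The named-list `fill=` above is the proposal; the `Map` wrapper and the names `d`, `fills` and `res` here are only illustrative. The fills are written as integers to match the integer columns.

```r
library(data.table)

d = data.table(a = 1:2, b = 3:4)
fills = list(a = 11L, b = 12L)

# Shift each column by one row, padding with that column's own fill value.
# Map() keeps the column names of `d`, so the result can be rebuilt directly.
res = as.data.table(
  Map(function(col, f) shift(col, n = 1L, fill = f), d, fills)
)
```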


New example from SO: https://stackoverflow.com/q/51673607/

library(data.table)
a <- data.table(Test=1:4, TestA=5:6)
b <- data.table(TEST=1:10, TestB=11:20)

# current syntax
defaults = list(Test = 0L, TestA = 0L)
new_cols = names(defaults)
b[, (new_cols) := defaults]
b[a, on=.(TEST = Test), (new_cols) := mget(sprintf("i.%s", new_cols))]

# desired syntax
defaults = list(Test = 0L, TestA = 0L)
new_cols = names(defaults)
b[a, on=.(TEST = Test), (new_cols) := mget(sprintf("i.%s", new_cols)), fill = defaults]
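
Until something like `fill=` exists, the two-step "current syntax" can be wrapped in a small helper. This is only a sketch: `update_with_defaults` and its argument names are hypothetical, not part of data.table, and it assumes the join columns are passed as a character vector and that every name in `defaults` is also a column of the table being joined in.

```r
library(data.table)

# Hypothetical helper (not a data.table function): set default values on
# `target`, then overwrite them from `src` for the rows that match on `on`.
update_with_defaults = function(target, src, on, defaults) {
  cols = names(defaults)
  target[, (cols) := defaults]                           # step 1: defaults everywhere
  target[src, on = on, (cols) := mget(paste0("i.", cols))]  # step 2: matched rows
  invisible(target)
}

DT = data.table(id = 1:3)
mDT = data.table(id = 1L, v = 2, x = 3)
update_with_defaults(DT, mDT, on = "id", defaults = list(v = 0, x = 0))
```

`DT` is modified by reference, as with the plain `:=` calls it wraps.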

@arunsrinivasan
Member

Couldn't we use the already existing nomatch argument here? It would also make nomatch no longer specific to joins. Just a thought ...
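
For context, a small sketch of what `nomatch` does in a join today: it only chooses between keeping unmatched rows as `NA` and dropping them. The tables `X` and `Y` are made up for illustration, and `nomatch = NULL` assumes a reasonably recent data.table (older versions spell it `nomatch = 0L`).

```r
library(data.table)

X = data.table(id = 1:3, v = c(10, 20, 30))
Y = data.table(id = c(2L, 4L))

# Default (nomatch = NA): id 4 has no match in X, its row is kept with v = NA.
kept = X[Y, on = .(id)]

# nomatch = NULL: rows of Y without a match in X are dropped.
dropped = X[Y, on = .(id), nomatch = NULL]
```

The suggestion amounts to letting `nomatch` take a value other than `NA` or `NULL` and having it apply to `:=` as well.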

@jangorecki
Member

jangorecki commented Mar 20, 2018

exactly, #857

@MarkusBonsch
Contributor

Concerning @arunsrinivasan's proposal: this might interfere with the new optimized subsetting implementation, where subsets in i are redirected to joins and nomatch is set to 0L, assuming that nomatch has no significance outside joins and can be altered automatically. While this can be changed, it should be considered when implementing nomatch for non-join operations.

@jangorecki
Member

Closing as duplicate. nomatch arg is exactly about that.
