Improved handling of list columns with NULL entries (#4250)
* Updated documentation for rbindlist(fill=TRUE)

* Print NULL entries of list as NULL

* Added news item

* edit NEWS, use '[NULL]' not 'NULL'

* fix test

* split NEWS item

* add example

---------

Co-authored-by: Michael Chirico <chiricom@google.com>
Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>
Co-authored-by: Benjamin Schwendinger <benjamin.schwendinger@tuwien.ac.at>
4 people authored Jan 7, 2024
1 parent 15da978 commit f2547b6
Showing 4 changed files with 17 additions and 3 deletions.
14 changes: 14 additions & 0 deletions NEWS.md
@@ -2,10 +2,24 @@

# data.table [v1.15.99]() (in development)

## NEW FEATURES

1. `print.data.table()` shows empty (`NULL`) list column entries as `[NULL]` for emphasis. Previously they would just print nothing (same as for empty string). Part of [#4198](https://github.com/Rdatatable/data.table/issues/4198). Thanks @sritchie73 for the proposal and fix.

```R
data.table(a=list(NULL, ""))
# a
# <list>
# 1: [NULL]
# 2:
```

## NOTES

1. `transform` method for data.table sped up substantially when creating new columns on large tables. Thanks to @OfekShilon for the report and PR. The implemented solution was proposed by @ColeMiller1.

2. The documentation for the `fill` argument in `rbind()` and `rbindlist()` now notes the expected behaviour for missing `list` columns when `fill=TRUE`, namely to use `NULL` (not `NA`), [#4198](https://github.com/Rdatatable/data.table/pull/4198). Thanks @sritchie73 for the proposal and fix.
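
A minimal sketch of the behaviour described in note 2, assuming data.table is installed. The tables `x` and `y` and their contents are made up for illustration and do not come from the commit.

```R
library(data.table)

# Two inputs: only the first one has the list column 'b'
x = data.table(a = 1L, b = list(1:2))
y = data.table(a = 2L)

# fill=TRUE adds the missing column 'b' to the rows coming from 'y'
res = rbindlist(list(x, y), fill = TRUE)

# For a list column the filled entry is NULL, not NA
sapply(res$b, is.null)
```

Checking with `is.null()` rather than `is.na()` is the point here: downstream code that tests list entries for `NA` would miss these filled rows.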

# data.table [v1.14.99](https://github.com/Rdatatable/data.table/milestone/29) (in development)

## BREAKING CHANGE
2 changes: 1 addition & 1 deletion R/print.data.table.R
@@ -215,7 +215,7 @@ format_col.expression = function(x, ...) format(char.trunc(as.character(x)), ...

format_list_item.default = function(x, ...) {
  if (is.null(x)) # NULL item in a list column
-   ""
+   "[NULL]" # not '' or 'NULL' to distinguish from those "common" string values in data
  else if (is.atomic(x) || inherits(x, "formula")) # FR #2591 - format.data.table issue with columns of class "formula"
    paste(c(format(head(x, 6L), ...), if (length(x) > 6L) "..."), collapse=",") # fix for #5435 and #37 - format has to be added here...
  else if (has_format_method(x) && length(formatted<-format(x, ...))==1L) {
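
The effect of the one-line change in this hunk can be tried out with a standalone sketch in base R. `format_item_sketch` is a hypothetical name, not a data.table function. It reproduces only the `NULL` and atomic branches visible in the hunk, and the final fallback branch is an assumption added so the sketch is complete.

```R
# Hypothetical, simplified stand-in for format_list_item.default();
# only the NULL and atomic branches shown in the hunk are mirrored.
format_item_sketch = function(x, ...) {
  if (is.null(x))
    "[NULL]"                        # explicit marker for an empty list entry
  else if (is.atomic(x))
    paste(c(format(head(x, 6L), ...), if (length(x) > 6L) "..."), collapse=",")
  else
    paste0("<", class(x)[1L], ">")  # assumption: placeholder for other types
}

# A NULL entry and an empty string are now distinguishable
vapply(list(NULL, "", 1:8), format_item_sketch, character(1L))
# [1] "[NULL]"          ""                "1,2,3,4,5,6,..."
```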
2 changes: 1 addition & 1 deletion inst/tests/tests.Rraw
@@ -320,7 +320,7 @@ test(69.4, names(tables(silent=TRUE, mb=FALSE, index=TRUE)),

xenv = new.env() # to control testing tables()
xenv$DT = data.table(a = 1)
-test(69.5, nrow(tables(env=xenv)), 1L, output="NAME NROW NCOL MB COLS KEY\n1: DT 1 1 0 a.*Total: 0MB")
+test(69.5, nrow(tables(env=xenv)), 1L, output="NAME NROW NCOL MB COLS KEY\n1: DT 1 1 0 a [NULL]\nTotal: 0MB")
xenv$DT = data.table(A=1:2, B=3:4, C=5:6, D=7:8, E=9:10, F=11:12, G=13:14, H=15:16, key="A,D,F,G")
test(69.6, nrow(tables(env=xenv)), 1L, output="NAME NROW NCOL MB COLS KEY\n1: DT 2 8 0 A,B,C,D,E,F,... A,D,F,G.*Total: 0MB")
rm(xenv)
2 changes: 1 addition & 1 deletion man/rbindlist.Rd
@@ -13,7 +13,7 @@ rbindlist(l, use.names="check", fill=FALSE, idcol=NULL)
\arguments{
\item{l}{ A list containing \code{data.table}, \code{data.frame} or \code{list} objects. \code{\dots} is the same but you pass the objects by name separately. }
\item{use.names}{\code{TRUE} binds by matching column name, \code{FALSE} by position. `check` (default) warns if all items don't have the same names in the same order and then currently proceeds as if `use.names=FALSE` for backwards compatibility (\code{TRUE} in future); see news for v1.12.2.}
-\item{fill}{\code{TRUE} fills missing columns with NAs. By default \code{FALSE}.}
+\item{fill}{\code{TRUE} fills missing columns with NAs, or NULL for missing list columns. By default \code{FALSE}.}
\item{idcol}{Creates a column in the result showing which list item those rows came from. \code{TRUE} names this column \code{".id"}. \code{idcol="file"} names this column \code{"file"}. If the input list has names, those names are the values placed in this id column, otherwise the values are an integer vector \code{1:length(l)}. See \code{examples}.}
}
\details{