Improved handling of list columns with NULL entries #4250
@mattdowle RE: printing NULL entries in list columns as "NULL" instead of "". This has a potentially undesirable side effect on the output of the E.g.:
Instead of:
|
Same as #4196 -- just checking if this is still needed? If so we can merge to master and resume the review process (nearly 4 years later 😅). Also if you'd like someone else to take over the PR, do feel free. cc @ben-schwen since I know you've been working a lot on rbindlist lately. |
AFAIU this PR is only about documentation. Printing something might be better, so that a NULL entry is not confused with the empty string. But then we should perhaps also start printing the empty string as "".
data.table(a="", b=list(""), c=list(NULL))
# a b c
# <char> <list> <list>
# 1:
tibble(a="a", b=list(""), c=list(NULL))
# # A tibble: 1 × 3
# a b c
# <chr> <list> <list>
# 1 a <chr [1]> <NULL>
|
I like the idea of adding a further delimiter to distinguish This is unavoidable, but using a delimiter at least makes it less likely that string is present in the data. (PS, ideally this PR would be split into two separate PRs, one for the rbindlist documentation change, the other for the change in print behavior, but given the age of the PR / extended lack of review from us / small overall PR size, I'll allow the two-in-one PR in this case) |
@ben-schwen I'll let you give final approval since there's a "mild" breaking change we should agree on (namely, anyone relying on the existing behavior not to print anything for |
No hard feelings on these things, but it is definitely an improvement. I added an example to make it even easier to grasp. Together with displaying column types, I guess we now give users enough information. |
* Updated documentation for rbindlist(fill=TRUE)
* Print NULL entries of list as NULL
* Added news item
* edit NEWS, use '[NULL]' not 'NULL'
* fix test
* split NEWS item
* add example

Co-authored-by: Michael Chirico <chiricom@google.com>
Co-authored-by: Michael Chirico <michaelchirico4@gmail.com>
Co-authored-by: Benjamin Schwendinger <benjamin.schwendinger@tuwien.ac.at>
Updates the documentation for rbindlist(fill=TRUE) and the print method associated with NULL entries in list columns as per discussion in #4198
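For readers who want to see the situation this PR addresses, here is a minimal sketch. The table and column names (`a`, `b`, `id`, `payload`) are made up for illustration and do not come from the PR. It assumes that `rbindlist(fill=TRUE)` fills a missing list column with NULL entries rather than NA, which is the behavior the documentation update describes.

```r
library(data.table)

# Hypothetical inputs: only `a` has the list column `payload`.
a <- data.table(id = 1L, payload = list(1:3))
b <- data.table(id = 2L)

# fill = TRUE adds the missing column to the rows that come from `b`.
combined <- rbindlist(list(a, b), fill = TRUE)

# For a list column the filled entry is NULL, not NA, so test with is.null().
filled_with_null <- vapply(combined$payload, is.null, logical(1L))
print(filled_with_null)

# How the NULL entry is displayed depends on the installed data.table version.
print(combined)
```

Going by the commit notes ("use '[NULL]' not 'NULL'"), versions that include this change should display the filled entry as `[NULL]`, while older versions leave the cell blank. Checking with `is.null()` as above does not depend on the print method.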