format_col/format_list_item printing generics for customization #3414

MichaelChirico · 2019-02-17T15:26:11Z

Successor to @mllg's original PR #3338 (I don't have push access there, so I'm continuing here).

I've built upon @mllg's initial idea to allow two dimensions of print output customization -- one at the column level and one at the row level (for list columns). The latter was covered in #3338.

Currently test 1293 fails:

# Bug #5435 - print.data.table and digits option:
DT <- structure(list(
    fisyr = 1995:1996,
    er = list(c(1, 3), c(1, 3)),
    eg = c(0.0197315833926059, 0.0197315833926059),
    esal = list(c(2329.89763779528, 2423.6811023622),
                c(2263.07456978967, 2354.16826003824)),
    fr = list(c(4, 4), c(4, 4)),
    fg = c(0.039310363070415, 0.039310363070415),
    fsal = list(c(2520.85433070866, 2520.85433070866),
                c(2448.55449330784, 2448.55449330784)),
    mr = list(c(5, 30), c(5, 30)),
    mg = c(0.0197779376457164, 0.0197779376457164),
    msal = list(c(2571.70078740157, 4215.73622047244),
                c(2497.94263862333, 4094.82600382409))),
  .Names = c("fisyr", "er", "eg", "esal", "fr", "fg", "fsal", "mr", "mg", "msal"),
  class = c("data.table", "data.frame"),
  row.names = c(NA, -2L))

ans1 = capture.output(print(DT, digits=4, row.names=FALSE))
ans2 = c(" fisyr  er      eg      esal  fr      fg      fsal    mr      mg      msal",
         "  1995 1,3 0.01973 2330,2424 4,4 0.03931 2521,2521  5,30 0.01978 2572,4216",
         "  1996 1,3 0.01973 2263,2354 4,4 0.03931 2449,2449  5,30 0.01978 2498,4095")
test(1293, ans1, ans2)

IIUC the problem is that digits is passed on through ... to format_list_item, and this changes the output because numerics inside list columns now have digits applied as well.

In a sense this is a more consistent interpretation, so I didn't bother keeping the status quo behavior, but don't want to overwrite the old test behavior without consulting here first.
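For readers following along, here is a minimal sketch of how an argument forwarded through `...` ends up affecting the numerics inside a list item. `fmt_item` and `fmt_list_col` are hypothetical stand-ins written for this illustration; they are not the functions in this PR, and the sketch makes no claim about which output test 1293 should expect.

```r
# Hypothetical stand-ins for illustration only; not data.table's implementation.
fmt_item = function(x, ...) paste(format(x, ...), collapse = ",")
fmt_list_col = function(col, ...) vapply(col, fmt_item, character(1L), ...)

esal = list(c(2329.89763779528, 2423.6811023622))

# No digits supplied: format() falls back to getOption("digits").
no_digits = fmt_list_col(esal)

# digits travels through `...` down to format(), so the list item is rounded too.
with_digits = fmt_list_col(esal, digits = 4)
print(no_digits)
print(with_digits)  # "2330,2424"
```

The point of the sketch is only that anything accepted by `...` at the top level silently reaches every item formatter unless it is filtered out on the way down.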

mattdowle · 2019-02-20T00:58:35Z

Can you explain why not toString please? I don't follow the comments that followed #3338 (comment). R 3.1.0 has toString so no backporting is needed.

MichaelChirico · 2019-02-20T01:47:34Z

It's just shy of being flexible enough; see your comment here:

#2562 (comment)

mattdowle · 2019-02-20T22:45:12Z

I sort of see. Just shy meaning the space in collapse=", " in toString.default? This PR is defining our own methods though. Can't they be toString methods rather than our own new class? We will rarely be falling back to toString.default and if we do, we could always mask toString.default with our own that uses collapse=",".
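The separator difference under discussion can be seen with base R alone. The class name `myclass` below is hypothetical and only shows what a class-specific toString method might look like; it is a sketch, not what the PR implements.

```r
x = c(1, 3)
toString(x)               # "1, 3": toString.default collapses with ", "
paste(x, collapse = ",")  # "1,3": the compact form without the space

# A hypothetical class-specific method dispatched by the existing toString generic.
toString.myclass = function(x, ...) paste(unclass(x), collapse = ",")
obj = structure(c(5, 30), class = "myclass")
toString(obj)             # "5,30"
```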

MichaelChirico · 2019-02-21T00:49:42Z

OK. For some reason I thought toString was somehow recent but apparently it's quite old:

https://github.com/wch/r-source/blame/5a156a0865362bb8381dcd69ac335f5174a4f60c/src/library/base/R/toString.R

I'm still not quite convinced though, as I'm still hung up on the following:

This PR introduces two new generics: format_col and format_list_item. Are we aiming to replace both of these by toString? Seems like it reduces flexibility. Otherwise, we keep just one and then have e.g. format_col and toString for column/list item formatting?
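As a rough picture of the two levels of customization being debated, the sketch below layers a column-level generic over an item-level one. The names `fmt_col` and `fmt_list_item` are hypothetical and chosen so as not to be confused with the generics in this PR, whose actual signatures and defaults may differ.

```r
# Hypothetical generics; a sketch of the two-level idea only.
fmt_col = function(x, ...) UseMethod("fmt_col")
fmt_list_item = function(x, ...) UseMethod("fmt_list_item")

# Column level: atomic columns go through format(); list columns delegate per item.
fmt_col.default = function(x, ...) format(x, ...)
fmt_col.list = function(x, ...) vapply(x, fmt_list_item, character(1L), ...)

# Item level: atomic items are collapsed, anything else shows its class.
fmt_list_item.default = function(x, ...) {
  if (is.atomic(x)) paste(format(x, ...), collapse = ",")
  else paste0("<", class(x)[1L], ">")
}

# A user-supplied item method, here for fitted linear models.
fmt_list_item.lm = function(x, ...) paste0("<lm:", deparse(formula(x)), ">")

col = list(1:3, lm(mpg ~ wt, data = mtcars))
out = fmt_col(col)
print(out)  # "1,2,3" "<lm:mpg ~ wt>"
```

With a single toString-style generic, the column-level hook in this sketch would have no separate place to live, which is the flexibility concern raised above.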

MichaelChirico · 2019-05-01T16:59:13Z

@mattdowle merged this to master and overwrote changes built in #3500 with what was originally proposed here. I think the approach here is cleaner. Also subsumed the NEWS note there into the item for this PR.

codecov · 2019-05-03T08:30:07Z

Codecov Report

Merging #3414 (c57336b) into master (2791043) will decrease coverage by 2.29%.
The diff coverage is 100.00%.

❗ Current head c57336b differs from pull request most recent head 51389d4. Consider uploading reports for the commit 51389d4 to get more accurate results

@@            Coverage Diff             @@
##           master    #3414      +/-   ##
==========================================
- Coverage   99.47%   97.17%   -2.30%     
==========================================
  Files          75       66       -9     
  Lines       14808    12632    -2176     
==========================================
- Hits        14730    12275    -2455     
- Misses         78      357     +279

| Impacted Files | Coverage Δ | |
|---|---|---|
| R/print.data.table.R | `92.30% <100.00%> (-7.70%)` | ⬇️ |
| R/last.R | `61.11% <0.00%> (-38.89%)` | ⬇️ |
| src/fcast.c | `72.88% <0.00%> (-27.12%)` | ⬇️ |
| R/cedta.R | `87.50% <0.00%> (-12.50%)` | ⬇️ |
| src/fmelt.c | `88.63% <0.00%> (-11.37%)` | ⬇️ |
| R/fcast.R | `90.42% <0.00%> (-9.58%)` | ⬇️ |
| src/frank.c | `91.15% <0.00%> (-8.85%)` | ⬇️ |
| R/utils.R | `91.66% <0.00%> (-8.34%)` | ⬇️ |
| R/bmerge.R | `93.22% <0.00%> (-6.78%)` | ⬇️ |
| src/dogroups.c | `93.03% <0.00%> (-6.64%)` | ⬇️ |

... and 62 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0820aab...51389d4.

mllg · 2019-06-27T10:00:50Z

Just checking back ... is this PR dead?

FWIW, here is the approach of custom printers for tibbles: https://pillar.r-lib.org/reference/pillar_shaft.html

randomgambit · 2019-07-02T11:53:22Z

Happy to do some testing when ready. This is a must-have.

MichaelChirico · 2019-08-18T15:11:50Z

Related: #605

MichaelChirico · 2020-01-02T09:43:31Z

OK finally merged & updated this PR against current master. We fail one test:

DT2 = data.table(
  Dcol = as.Date('2016-01-01') + 0:2,
  Pcol = as.POSIXct('2016-01-01 01:00:00', tz = 'UTC') + 86400L*(0:2),
  gcol = TRUE, Icol = as.IDate(16801) + 0:2,
  ucol = `class<-`(1:3, 'asdf')
)
test(1610.2, capture.output(print(DT2, class=TRUE)),
     c("         Dcol                Pcol   gcol       Icol   ucol",
       "       <Date>              <POSc> <lgcl>     <IDat> <asdf>",
       "1: 2016-01-01 2016-01-01 01:00:00   TRUE 2016-01-01      1",
       "2: 2016-01-02 2016-01-02 01:00:00   TRUE 2016-01-02      2",
       "3: 2016-01-03 2016-01-03 01:00:00   TRUE 2016-01-03      3"))

That's because in this PR I removed the timezone argument... I originally did so before a release.

Now I guess we'd have to do a deprecation cycle. Or should we continue to support timezone argument?

Default behavior in this PR is to always print the time zone if it's there.
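For reference, base R's format method for POSIXct already exposes the choice between showing and hiding the zone through its usetz argument. The snippet below only illustrates that base behavior; how this PR wires it into print.data.table is not shown here.

```r
p = as.POSIXct("2016-01-01 01:00:00", tz = "UTC")

without_tz = format(p)                # "2016-01-01 01:00:00"
with_tz = format(p, usetz = TRUE)     # "2016-01-01 01:00:00 UTC"
print(without_tz)
print(with_tz)
```

The expected output in test 1610.2 corresponds to the form without the zone suffix.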

MichaelChirico · 2020-05-19T10:14:13Z

~~TODO: address expression column printing:~~

~~#4196 (comment)~~

nvm, it's already done

mattdowle · 2021-08-05T23:03:46Z

@MichaelChirico This looks good to me to merge now. You?

MichaelChirico · 2021-08-06T00:13:38Z

R/print.data.table.R

+char.trunc <- function(x, trunc.char = getOption("datatable.prettyprint.char")) {
+  trunc.char = max(0L, suppressWarnings(as.integer(trunc.char[1L])), na.rm=TRUE)
+  if (!is.character(x) || trunc.char <= 0L) return(x)
+  idx = which(nchar(x) > trunc.char)


not crucial for this PR in particular, but wouldn't nchar(x, 'width') be more appropriate?

mattdowle · 2021-08-06

Good idea. Would need a test too in a new PR: your Chinese multi-byte data?
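The distinction behind the nchar(x, 'width') suggestion can be sketched with base R. The default type counts characters, while type = "width" reports display columns, which is usually larger for wide characters such as CJK; the exact width value can depend on the R version and locale, so it is not asserted here.

```r
x = "\u4e2d\u6587"                 # two CJK characters, written as escapes

n_chars = nchar(x)                 # type = "chars" (the default): 2
n_width = nchar(x, type = "width") # display columns the string occupies
print(c(chars = n_chars, width = n_width))
```

A truncation rule based on the character count can therefore let wide strings overrun the intended column width.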

MichaelChirico · 2021-08-06T00:15:46Z

Made a few tweaks, ready for merge now

MichaelChirico mentioned this pull request Feb 17, 2019

Allow to customize format of objects in 'print.data.table' #3338

Closed

MichaelChirico added this to the 1.12.4 milestone Feb 17, 2019

MichaelChirico force-pushed the dt_col_format branch from 071abde to 6135767 Compare May 3, 2019 09:11

MichaelChirico mentioned this pull request May 4, 2019

Printing of an expression type column does not work when the expression wraps to a new line #3011

Closed

MichaelChirico force-pushed the dt_col_format branch from 6135767 to bf591d6 Compare May 4, 2019 14:49

format_col/format_list_item generics for custom printing

3e40e44

MichaelChirico force-pushed the dt_col_format branch from ee8c86b to 3e40e44 Compare May 10, 2019 03:31

franknarf1 mentioned this pull request Jun 30, 2019

show dimensions of list columns with DT #3671

Closed

mattdowle modified the milestones: 1.12.4, 1.13.0 Sep 19, 2019

mattdowle modified the milestones: 1.12.7, 1.12.9 Dec 8, 2019

MichaelChirico mentioned this pull request Jan 2, 2020

Print list-column dims #4154

Merged

Michael Chirico added 2 commits January 2, 2020 17:23

Merge branch 'master' into dt_col_format

7b8e399

complete merge to master

e69cac7

jangorecki reviewed Jan 2, 2020

.ci/ci.R (outdated)

MichaelChirico mentioned this pull request May 19, 2020

All non-atomic types are now converted to lists in rbindlist #4196

Open

mattdowle modified the milestones: 1.13.1, 1.13.3 Oct 17, 2020

mattdowle added 2 commits August 4, 2021 23:11

Merge branch 'master' into dt_col_format

37a6613

eof \n to reduce diff

790e878

mattdowle added a commit that referenced this pull request Aug 5, 2021

end of line whitespace to reduce diff in #3414

2bbd07d

mattdowle added 7 commits August 5, 2021 00:39

Merge branch 'master' into dt_col_format

0402b3a

Merge branch 'master' into dt_col_format

c07cad5

merge timezone argument of print.data.table

868168b

format.data.table back to method dispatch; passes tests

bcb1512

subsume justify and timezone into ... where possible

d43d4ef

add ... to format_list_item.lm in example to pass 'unused argument' error; and remove registerS3method as it isn't needed

3d79602

make test.data.table() pass a 2nd run; the method format for complex remained registered afterwards

8002114

MichaelChirico added 2 commits August 5, 2021 17:07

fix issue reference

0795dee

use is.na<- to set NA

51389d4

MichaelChirico commented Aug 6, 2021

mattdowle merged commit 831013a into master Aug 6, 2021

mattdowle deleted the dt_col_format branch August 6, 2021 01:12

MichaelChirico mentioned this pull request Aug 6, 2021

truncate.char misbehaves for multibyte characters #5096

Closed

mllg mentioned this pull request Aug 6, 2021

Use format_list_item() to improve data table printing mlr-org/mlr3#674

Merged

grantmcdermott mentioned this pull request Aug 31, 2021

compatibility with sf library #2273

Closed

jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
