Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fwrite should always quote empty strings #2216

Merged
merged 1 commit into from
Jun 29, 2017
Merged

Fwrite should always quote empty strings #2216

merged 1 commit into from
Jun 29, 2017

Conversation

st-pasha
Copy link
Contributor

(under quote='auto' at least), so that NA strings and empty strings can be distinguished in the output. Closes #2215

…), so that NA strings and empty strings can be distinguished in the output. Closes #2215
@st-pasha
Copy link
Contributor Author

Seems like something broke in package nanotime, which causes test 1751 to fail

@jangorecki
Copy link
Member

jangorecki commented Jun 25, 2017

@st-pasha it seems to be related to nanotime upgrade on CRAN on 22/06. fwrite does not output expected format anymore. @eddelbuettel - your example in nanotime package readme doesn't work anymore.

@eddelbuettel
Copy link
Contributor

Yes we make substantial changes to nanotime since the very, very beginning when this first example was set up. I am not entirely sure why fwrite() now fails and returns 0 though. nanotime itself works fine from / to character:

R> v
[1] "2017-06-25T11:56:43.074612001+00:00"
[2] "2017-06-25T11:56:43.074612002+00:00"
[3] "2017-06-25T11:56:43.074612003+00:00"
[4] "2017-06-25T11:56:43.074612004+00:00"
[5] "2017-06-25T11:56:43.074612005+00:00"
R> class(v)
[1] "nanotime"
attr(,"package")
[1] "nanotime"
R> vf <- format(v)
R> vf
[1] "2017-06-25T11:56:43.074612001+00:00"
[2] "2017-06-25T11:56:43.074612002+00:00"
[3] "2017-06-25T11:56:43.074612003+00:00"
[4] "2017-06-25T11:56:43.074612004+00:00"
[5] "2017-06-25T11:56:43.074612005+00:00"
R> class(vf)
[1] "character"
R> nanotime(vf)
[1] "2017-06-25T11:56:43.074612001+00:00"
[2] "2017-06-25T11:56:43.074612002+00:00"
[3] "2017-06-25T11:56:43.074612003+00:00"
[4] "2017-06-25T11:56:43.074612004+00:00"
[5] "2017-06-25T11:56:43.074612005+00:00"
R> class(nanotime(vf))
[1] "nanotime"
attr(,"package")
[1] "nanotime"
R> 

I think we may have been going through int64 before but don't think we need that anymore. Now, what nanotime itself does relies on RcppCCTZ where data.table has its own in writeNanotime. I suspect it may need a closer look. @mattdowle can you take a look there?

@eddelbuettel
Copy link
Contributor

eddelbuettel commented Jun 25, 2017

I have an idea. We moved the representation from S3 to S4, so the test for 'is it nanotime?' in src/fwrite.c and its whichWriter() now fails. I just put a quick REprintf() in, and that confirms it:

R> library(data.table)
data.table 1.10.5 IN DEVELOPMENT built 2017-06-25 12:14:47.993013 UTC
  The fastest way to learn (by data.table authors): https://www.datacamp.com/courses/data-analysis-the-data-table-way
  Documentation: ?data.table, example(data.table) and browseVignettes("data.table")
  Release notes, videos and slides: http://r-datatable.com
R> library(nanotime)
R> v <- nanotime(Sys.time())+1:5
R> dt <- data.table(cbind(a=1:5, b=5:1), v)
R> dt
   a b                                   v
1: 1 5 2017-06-25T12:19:30.015459001+00:00
2: 2 4 2017-06-25T12:19:30.015459002+00:00
3: 3 3 2017-06-25T12:19:30.015459003+00:00
4: 4 2 2017-06-25T12:19:30.015459004+00:00
5: 5 1 2017-06-25T12:19:30.015459005+00:00
R> fwrite(dt, "/tmp/newdt.csv")
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
Write integer
R> 

@joshuaulrich
Copy link

joshuaulrich commented Jun 25, 2017

Here's a small patch that works for @eddelbuettel's example. I didn't want to remove code, so I left the return for INHERITS(column, char_integer64) as-is. The patch is similar to what we do for timeDate objects in xts/src/diff.c.

diff --git a/src/fwrite.c b/src/fwrite.c
index f49a24a..55727f7 100644
--- a/src/fwrite.c
+++ b/src/fwrite.c
@@ -496,6 +496,10 @@ static writer_fun_t whichWriter(SEXP column) {
   case REALSXP:
     if (INHERITS(column, char_integer64))
       return (INHERITS(column, char_nanotime) && dateTimeAs!=DATETIMEAS_EPOCH) ? writeNanotime : writeInteger;
+    if (IS_S4_OBJECT(column)) {
+      SEXP class = STRING_ELT(getAttrib(column, R_ClassSymbol), 0);
+      return (class == char_nanotime) ? writeNanotime : writeInteger;
+    }
     if (dateTimeAs==DATETIMEAS_EPOCH)    return writeNumeric;
     if (INHERITS(column, char_Date))     return writeDateReal;
     if (INHERITS(column, char_POSIXct))  return writePOSIXct;
R> library(data.table)
R> library(nanotime)
R> v <- nanotime(Sys.time())+1:5
R> dt <- data.table(cbind(a=1:5, b=5:1), v)
R> dt
   a b                                   v
1: 1 5 2017-06-25T15:23:08.303934001+00:00
2: 2 4 2017-06-25T15:23:08.303934002+00:00
3: 3 3 2017-06-25T15:23:08.303934003+00:00
4: 4 2 2017-06-25T15:23:08.303934004+00:00
5: 5 1 2017-06-25T15:23:08.303934005+00:00
R> fwrite(dt, "/tmp/newdt.csv")
R> readLines("/tmp/newdt.csv")
[1] "a,b,v"                              "1,5,2017-06-25T15:23:08.303934001Z"
[3] "2,4,2017-06-25T15:23:08.303934002Z" "3,3,2017-06-25T15:23:08.303934003Z"
[5] "4,2,2017-06-25T15:23:08.303934004Z" "5,1,2017-06-25T15:23:08.303934005Z"

@mattdowle
Copy link
Member

@eddelbuettel Thanks for the investigation. I can make the change. But it is failing on CRAN. Did you check reverse dependencies before releasing nanotime to CRAN?

mattdowle added a commit that referenced this pull request Jun 28, 2017
@eddelbuettel
Copy link
Contributor

@mattdowle Sorry, I did not. We changed from S3 to S4, invalidating your existing test.

But the suggested change by @joshuaulrich should address this.

@codecov-io
Copy link

Codecov Report

Merging #2216 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2216      +/-   ##
==========================================
+ Coverage   90.69%   90.69%   +<.01%     
==========================================
  Files          59       59              
  Lines       11476    11481       +5     
==========================================
+ Hits        10408    10413       +5     
  Misses       1068     1068
Impacted Files Coverage Δ
src/fwrite.c 91.21% <100%> (+0.08%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 455c945...57f4a6d. Read the comment docs.

@mattdowle mattdowle merged commit 0b5a1cb into Rdatatable:master Jun 29, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants