fwrite doesn't save dates #1772

Closed
skanskan opened this issue Jul 14, 2016 · 6 comments

@skanskan

Hello.

I have a dataset that I can load into R from either a Stata file or an SPSS file.
In both cases it loads properly with the haven package,
and the dates are recognized properly.

But when I save it to disk with data.table's fwrite:
fwrite(ppp, "ppp.csv", sep=",", col.names = TRUE)
the dates disappear and are converted to meaningless numbers.

For example, the date 1967-08-06 is now -879.

I've also tried playing with fwrite options, such as quote=FALSE, with no success.

I've uploaded a small sample of the files: the SPSS file, the Stata file and the saved CSV.

library(haven)
library(data.table)
ppp <- read_sav("pspss.sav") # choose one of these two.
ppp <- read_dta("pstata.dta")  # choose one of these two.
fwrite(ppp, "ppp.csv",  sep=",", col.names = TRUE) 

http://www73.zippyshare.com/v/OwzwbyQq/file.html

How can I solve it?

Regards

@arunsrinivasan
Member

arunsrinivasan commented Jul 14, 2016

Duplicate of #1664. It's not yet implemented.

What you're seeing is the internal representation of the dates (the number of days since 1970-01-01). Setting the class attribute to Date should let you see the dates again.
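
To illustrate the point about the internal representation, here is a minimal base R sketch. The variable names `d`, `n` and `x` are made up for the illustration; only the date and the number come from the report above.

```r
# A Date is stored as a number of days relative to 1970-01-01.
d <- as.Date("1967-08-06")
n <- unclass(d)        # drop the class to see the stored number
print(n)               # -879, the value that ended up in the csv

# Going the other way: put the class back on a plain number.
x <- -879
class(x) <- "Date"
print(x)               # prints as 1967-08-06 again
```

The number itself is untouched in both directions; only the class attribute decides whether it prints as a date.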

@skanskan
Author

skanskan commented Jul 14, 2016

When should I set the class attribute to Date? When fwriting or when reading back? And how?

When I read the csv file back with fread, the 1967-08-06 date still appears as -879.
Regards

@arunsrinivasan
Member

arunsrinivasan commented Jul 14, 2016

After reading it back: attr(dt$col, 'class') = "Date", or class(dt$col) = "Date", or setattr(dt$col, 'class', "Date").

@skanskan
Author

skanskan commented Jul 14, 2016

OK, thank you.
Is there an easy way to do it? The real file has 1200 variables, and many of them are dates.
I mean some way to save the original column classes and change the new...

@skanskan
Author

Since it seems there is no simple solution, I'm trying to store the column classes and restore them afterwards.

I've taken the original dataset ppp,

areDates <- (sapply(ppp, class) == "Date")

I save it to a file so I can read it back next time.

ppp <- fread("ppp.csv", encoding="UTF-8")

And now I change the classes of the newly read dataset back to the original ones.

ppp[,names(ppp)[areDates] := lapply(.SD,as.Date) , .SDcols = areDates ]

The problem is that it works with this toy example, but with the real, larger dataset I get the error:

Error in as.Date.numeric(X[[i]], ...) : 'origin' must be supplied

I've also tried with
lapply(.SD,function(xx) class(xx) <- "Date")
but it fills the variables with NA
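
A possible way around both symptoms, sketched under assumptions: the error message suggests as.Date was handed plain numbers without an origin, and the anonymous function above returns the value of the assignment (the string "Date") instead of the modified column. The tables `orig` and `back` below are made-up stand-ins for the original dataset and the one read back from the csv, and using `vapply` with `inherits` to find the date columns is a substitution for the `sapply(ppp, class) == "Date"` test, not something taken from this thread.

```r
library(data.table)

# Hypothetical stand-in for the original dataset (dates still have class Date).
orig <- data.table(id = 1:3,
                   birth = as.Date(c("1967-08-06", "1970-01-01", "1971-01-01")))

# Record which columns are dates before writing; inherits() also copes
# with columns that carry more than one class.
areDates <- vapply(orig, inherits, logical(1), what = "Date")
dateCols <- names(orig)[areDates]

# Hypothetical stand-in for the table read back from the csv:
# the dates have come back as plain numbers.
back <- data.table(id = 1:3, birth = c(-879, 0, 365))

# Pass the origin explicitly so as.Date knows what the numbers count from.
back[, (dateCols) := lapply(.SD, as.Date, origin = "1970-01-01"),
     .SDcols = dateCols]

# Equivalent alternative: set the class, then return the column itself.
# lapply(.SD, function(xx) { class(xx) <- "Date"; xx })

print(back)
```

With 1200 variables the same two lines apply unchanged, since `dateCols` is built from the original table rather than typed by hand.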

@jangorecki
Copy link
Member

jangorecki commented Jul 15, 2016

@skanskan I would try lapply(., setattr, "class", "Date"), but I'm just guessing. Questions about changing the class of elements in a list are a better fit for Stack Overflow. The Date class is just a numeric with its own methods; you "convert" it simply by adding or dropping the class attribute (unless you use the origin attribute). There is a separate issue that discusses this fact and its potential for I/O performance (fread-fwrite) once writing Date in its string representation is implemented; see #1656.
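
One way the setattr idea might look in practice, as a sketch only: the table `dt` and the vector `dateCols` are invented for the example, and a for loop over column names is used here in place of the lapply call guessed at above.

```r
library(data.table)

# Hypothetical table as it might look after fread: dates are plain numbers.
dt <- data.table(id = 1:2, birth = c(-879, 0), visit = c(16996, 16997))

# Names of the columns known to hold dates (assumed to be recorded earlier).
dateCols <- c("birth", "visit")

# setattr() changes the attribute by reference, so no column is copied
# and the stored numbers stay exactly as they were.
for (col in dateCols) setattr(dt[[col]], "class", "Date")

print(dt)
```

Because only an attribute is added, this should stay cheap even when many columns need the treatment.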

@mattdowle mattdowle added this to the v1.9.8 milestone Nov 11, 2016
mattdowle added a commit that referenced this issue Nov 11, 2016
…ite.csv'. Closes #1664. Closes #1772. Closes the fwrite part of #1656.