fwrite(): final items #1664
Comments
Do people actually like having |
I find |
@eantonya Agree. I prefer quote=FALSE too. The base R thinking I believe has numbers/ids with leading 0's stored as character format ... the default ensures they get read by Excel as character and the leading 0's not lost. But fwrite could detect that situation and quote just that situation by default. Where character columns contain letters and no embedded quotes, I really don't see why quotes are needed. Plus we save a bit on file size by saving the 2 extra quotes per field. |
@MichaelChirico Agree with you too. fwrite can detect that and put the quotes in those situations. fwrite already does a first-pass through all strings to calculate maximum line length before allocating buffer sizes. It could test if there are any |
@mattdowle great, good point. Should only marginally affect speed then. PS IIRC Excel converts "001" to 1 anyway :| |
@MichaelChirico Now you mention it I do seem to remember Excel doing that. I haven't used Excel for many years now thankfully. |
I would assume Excel behaves inconsistently on that matter (across OS versions, Office versions, OS locales, Office locales, 365, etc.). |
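The "quote only when needed" idea discussed above can be sketched in base R. This is a rough illustration, not data.table's actual C implementation: the helper names `needs_quote` and `quote_auto` are hypothetical, and the rule assumed here (quote a field when it contains the separator, a double quote, or a line break) is only one reasonable reading of the proposal. It does not attempt the leading-zero detection mentioned in the thread.

```r
# Hypothetical helper (not part of data.table): TRUE where a field
# contains the separator, a double quote, or a newline.
needs_quote <- function(x, sep = ",") {
  # fixed = TRUE matches the patterns as literal strings, not regexes
  grepl(sep, x, fixed = TRUE) |
    grepl("\"", x, fixed = TRUE) |
    grepl("\n", x, fixed = TRUE)
}

# Hypothetical helper: wrap only the fields that need it, doubling any
# embedded quotes first. NA values are left untouched.
quote_auto <- function(x, sep = ",") {
  x <- as.character(x)
  q <- !is.na(x) & needs_quote(x, sep)
  x[q] <- paste0("\"", gsub("\"", "\"\"", x[q], fixed = TRUE), "\"")
  x
}

fields <- c("abc", "a,b", "say \"hi\"", "001")
writeLines(quote_auto(fields))
```

Under these assumptions only the fields containing a comma or an embedded quote get wrapped, while plain strings such as `abc` and `001` are written bare, which is where the file-size saving mentioned above would come from.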
Are you planning to include the |
@rafapereirabr? |
@MichaelChirico, I didn't know it was already implemented! I couldn't try it yet, as I was planning to test it tonight. Just ignore my comment then. P.S. I don't think this should be the default. |
[ Update : quote='auto' now fully implemented ] Can we please set I'm being royally screwed right now by having written a data file I needed to carry remotely with
Until then, the marginal cost to |
[ Update: now fixed and fwrite is consistent with write.csv ] I find it a bit odd that
Has output:
Not sure what the ideal approach is, as floating points are always going to cause headaches... Also note that |
integer64 implemented: 6d55d2f |
…stimate based on sample for efficiency and to prep for sep2 now we can realloc the buffers if needed. #1664
Can you clarify what you are looking for in It's also tremendously fast: less than a minute ( |
@HughParsonage Perfect - that's a pass then. Thanks! Windows has different C functions for reading from files bigger than 4GB so it was feasible that something extra was required for writing too. |
…umns are present. Changed default sep2 from ; to | to distinguish it more from sep=, default. #1664
excellent stuff Matt, thanks so much!! |
Do we need to use "library(bit64)" with fwrite and fread when we have long numbers or not anymore? |
Thank you very much for your work, this feature is really useful. I caught 2 strange issues:
|
@stanislav-a It would be useful if you could provide your |
@jangorecki I reinstalled the package from the latest commit and now it works fine, thank you. |
I can confirm the first issue @stanislav-a has mentioned, I'm on the 1.9.8 release. I use the following generated file, it's a simple csv file: https://gist.github.com/thvasilo/6edffdccda87f09572cbc4184662af47
surv_1k <- fread("surv_1k.csv")
fwrite(surv_1k, "copy.csv")
# Error: isLOGICAL(showProgress) is not TRUE
Session info:
I haven't tried the latest master. |
please update, Matt just fixed this
…On Dec 2, 2016 6:51 AM, "Theodore Vasiloudis" ***@***.***> wrote:
I can confirm the first issue @stanislav-a has mentioned, I'm on the 1.9.8 release.
I use the following generated file, it's a simple csv file:
https://gist.github.com/thvasilo/6edffdccda87f09572cbc4184662af47
surv_1k <- fread("surv_1k.csv")
fwrite(surv_1k, "copy.csv")
# Error: isLOGICAL(showProgress) is not TRUE
Session info:
> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.1 LTS
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] purrr_0.2.2 caret_6.0-73 ggplot2_2.2.0 lattice_0.20-34 data.table_1.9.8
loaded via a namespace (and not attached):
[1] Rcpp_0.12.7 magrittr_1.5 splines_3.3.2 MASS_7.3-45
[5] munsell_0.4.3 colorspace_1.2-6 foreach_1.4.3 minqa_1.2.4
[9] stringr_1.1.0 car_2.1-4 plyr_1.8.4 tools_3.3.2
[13] parallel_3.3.2 nnet_7.3-12 pbkrtest_0.4-6 grid_3.3.2
[17] gtable_0.2.0 nlme_3.1-128 mgcv_1.8-16 quantreg_5.29
[21] MatrixModels_0.4-1 iterators_1.0.8 lme4_1.1-12 lazyeval_0.2.0
[25] assertthat_0.1 tibble_1.2 Matrix_1.2-7.1 nloptr_1.0.4
[29] reshape2_1.4.2 ModelMetrics_1.1.0 codetools_0.2-15 stringi_1.1.1
[33] scales_0.4.1 stats4_3.3.2 SparseM_1.74
I haven't tried the latest master.
|
I just upgraded to 1.9.8 today and still have the same issue.
R version 3.3.2 (2016-10-31)
locale:
attached base packages:
other attached packages:
loaded via a namespace (and not attached): |
@MichaelChirico David is already on 1.10 according to session info.
|
Include R_CheckUserInterrupt and test closes worker team ok 1a4263f
Add "NEW:" item to startup banner
setthreads to setDTthreads so as not to affect other packages using OpenMP, add reference to its manual in fwrite man