fread crashes when reading large file with binary data #1895
To clarify, the gist contains two files:
- https://gist.github.com/tdhock/67f8507fee522343cc813a2affcb9d37/raw/8465f80bbee4fd26f01069a81fd19bc71013c71a/big_fread_crashes.txt.xz is the file that crashes fread.
- https://gist.github.com/tdhock/67f8507fee522343cc813a2affcb9d37/raw/8465f80bbee4fd26f01069a81fd19bc71013c71a/small_fread_ok.txt.xz is a control file. It is slightly smaller than the other one, and it does not crash fread (even on my laptop). |
This probably has nothing to do with the crash on binary data in a CSV file, but we recommend installing data.table from our package repository at https://rdatatable.github.io/data.table, because master is not guaranteed to pass unit tests. The package is published to the repository only if all tests pass. |
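For anyone following along, installing from that repository might look roughly like the sketch below. The repository URL is the one quoted above; the wrapper name `install_dt_devel` is hypothetical and only there so the call can be defined without running it immediately.

```r
# Repository URL quoted in the comment above.
dt_repo <- "https://rdatatable.github.io/data.table"

# Hypothetical wrapper (not part of data.table): install the package
# from the project's own repository instead of from GitHub master.
install_dt_devel <- function(repo = dt_repo) {
  install.packages("data.table", repos = repo)
}

# Run interactively when you actually want to install:
# install_dt_devel()
# packageVersion("data.table")
```

Depending on the platform, a source install may be needed, in which case a working compiler toolchain is assumed.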
another computer where the MRE does not result in a crash:
|
On my laptop, which exhibits the crash, the only differences between the verbose output on the bad and control files are:
|
By the way, if you guys are having a hard time replicating this on one of your computers, I would be more than happy to help by testing on my laptop, where the crash is known to occur. |
I checked my old example with R-3.4.1 and the old version of data.table mentioned in that gist, and it still gives the same segfault. I then checked with the newest data.table version from GitHub, and it now works (no crash)! So congratulations, something you guys did fixed the crash I was having. However, I was a bit surprised that fread did not give any warning about an unfinished line or binary data, since there is binary data on the last line of those files. For example I run |
Here is a screenshot that shows the binary data at the end of the file: https://gist.github.com/tdhock/67f8507fee522343cc813a2affcb9d37#file-big_binary_data_at_end-png |
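As an alternative to a screenshot, the trailing bytes can be inspected directly from R. This is only a minimal sketch, assuming the file has already been decompressed from .xz. The helpers `tail_bytes` and `has_binary_bytes` are hypothetical names, not part of data.table, and "binary" here is an assumed definition: a NUL or a control byte other than tab, LF, or CR.

```r
# Hypothetical helper: return the last `n` raw bytes of a file.
tail_bytes <- function(path, n = 64L) {
  size <- file.size(path)
  con <- file(path, open = "rb")
  on.exit(close(con))
  # Jump to the start of the final n bytes (or the file start if it is shorter).
  seek(con, where = max(0, size - n), origin = "start")
  readBin(con, what = "raw", n = n)
}

# Hypothetical helper: TRUE if any byte is a NUL or a control character
# other than tab (9), LF (10), or CR (13).
has_binary_bytes <- function(bytes) {
  codes <- as.integer(bytes)
  any(codes < 32L & !(codes %in% c(9L, 10L, 13L)))
}

# Self-contained demonstration on a temporary file: two tab-separated
# lines followed by three NUL bytes.
tmp <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("a\tb\n1\t2\n"), as.raw(c(0, 0, 0))), tmp)
has_binary_bytes(tail_bytes(tmp))
```

To try it on one of the gist files, point `tail_bytes()` at the decompressed path; printing the raw vector shows the bytes as hex codes.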
Checking with the latest version of As for the "binary data" at the end of the file(s) -- they are |
great thanks |
Hey @arunsrinivasan, I saw you were investigating one of the related issues (#1464, #1183, #1119) where fread crashes R. This issue is happening on my laptop, which compiles data.table like this
The MRE which crashes R on my laptop is from this gist https://gist.github.com/tdhock/67f8507fee522343cc813a2affcb9d37#file-crash-r
which gives me the following output.
I tried my best to create an MRE, but when I tried it on a different computer (with the same version of data.table), it did not crash R. Any ideas? Or is this a bug in my compiler?