Race condition causes fread to read the data incorrectly occasionally #2260

Closed
st-pasha opened this issue Jul 7, 2017 · 0 comments · Fixed by #2264

st-pasha commented Jul 7, 2017

This example uses a dataset with 4209 rows and 378 columns.

library(data.table)

path <- "~/github/h2oai/tests/data/mercedesbenz.csv"

# Reference checksums taken from a single baseline read.
f <- fread(path)
s385 <- sum(f$X385)
s380 <- sum(f$X380)
s200 <- sum(f$X200)

# Re-read the same file repeatedly; a checksum mismatch means fread returned different data.
for (b in 1:10000) {
  f <- fread(path)
  if (sum(f$X385) != s385) stop("Checksum on column X385 failed at b=", b)
  if (sum(f$X380) != s380) stop("Checksum on column X380 failed at b=", b)
  if (sum(f$X200) != s200) stop("Checksum on column X200 failed at b=", b)
}

In my latest run, the first failure occurred at b=3498. Here is an excerpt of column X385 from that read (it should contain only 0s and 1s):

[2737]           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
[2753]           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
[2769]           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
[2785]           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0           0
[2801]           0           0           0           0           0           0 -1165986440       32747 -1131141416       32747 -1131141416       32747 -1172970760       32747 -1172189800       32747
[2817] -1131141416       32747 -1130042776       32747 -1172970760       32747 -1165986440       32747 -1172189800       32747 -1130042776       32747 -1165986440       32747 -1172189800       32747
[2833] -1165986440       32747 -1172970760       32747 -1130042776       32747 -1165986440       32747 -1130042776       32747 -1165986440       32747 -1130042776       32747 -1165986440       32747
[2849] -1172189800       32747 -1172970760       32747 -1130042776       32747 -1172189800       32747 -1165986440       32747 -1172970760       32747 -1131141416       32747 -1172970760       32747
[2865] -1172189800       32747 -1172970760       32747 -1165986440       32747 -1130042776       32747 -1165986440       32747 -1131141416       32747 -1165986440       32747 -1165986440       32747
[2881] -1172189800       32747 -1130042776       32747 -1165986440       32747 -1130042776       32747 -1172189800       32747 -1131141416       32747 -1172970760       32747 -1172970760       32747
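
Since the column should contain only 0s and 1s, a range check can pinpoint which rows were corrupted, which a sum comparison alone does not reveal. The following is a minimal sketch in base R; `find_non_binary` is a hypothetical helper (not part of data.table), and the vector `x` is a synthetic stand-in that reuses two values from the excerpt above rather than a real read.

```r
# Hypothetical helper (not part of data.table): return the positions of
# values in x that are neither 0 nor 1. NA values are reported as invalid too.
find_non_binary <- function(x) {
  which(is.na(x) | !(x %in% c(0L, 1L)))
}

# Synthetic stand-in for a corrupted read, reusing values from the excerpt.
x <- c(0L, 1L, 0L, -1165986440L, 32747L, 0L)
bad <- find_non_binary(x)
print(bad)  # [1] 4 5
```

In the reproduction loop, calling something like `find_non_binary(f$X385)` when a checksum fails would presumably show where the bad region starts and ends, which may help relate it to how the file is split into chunks during reading.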

st-pasha changed the title from "Race condition causes fread to read the data incorrectlyb" to "Race condition causes fread to read the data incorrectly occasionally" Jul 7, 2017
st-pasha added a commit that referenced this issue Jul 8, 2017
mattdowle added this to the v1.10.6 milestone Jul 8, 2017