fread bug with colClasses #2922

Closed
arunsrinivasan opened this issue Jun 2, 2018 · 5 comments

@arunsrinivasan
Member

arunsrinivasan commented Jun 2, 2018

require(data.table)
# Loading required package: data.table
# data.table 1.11.5 IN DEVELOPMENT built 2018-06-02 00:09:06 UTC; travis  Latest news: http://r-datatable.com
str <- "x1,x2,x3,x4,x5\n1,2,1.5,T,cc\n3,4,2.5,F,ff"

# Read correctly
fread(str)
#    x1 x2  x3 x4 x5
# 1:  1  2 1.5  T cc
# 2:  3  4 2.5  F ff

# col x4 is wrong: it shows the values of x5
fread(str, colClasses=c("integer", "numeric", "numeric", "integer", "character"))
#    x1 x2  x3 x4 x5
# 1:  1  2 1.5 cc cc
# 2:  3  4 2.5 ff ff

# col x4 is wrong here as well
fread(str, colClasses=c("integer", "numeric", "numeric", "logical", "character"))
#    x1 x2  x3 x4 x5
# 1:  1  2 1.5 cc cc
# 2:  3  4 2.5 ff ff

# correct if colClasses for x4 is 'character' type
fread(str, colClasses=c("integer", "numeric", "numeric", "character", "character"))
#    x1 x2  x3 x4 x5
# 1:  1  2 1.5  T cc
# 2:  3  4 2.5  F ff
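
A possible workaround until this is fixed, building on the last call above (the one that reads correctly): read x4 as character and convert it afterwards. This is only a sketch and assumes the column holds nothing but T/F values; the name `DT` is illustrative.

```r
library(data.table)

str <- "x1,x2,x3,x4,x5\n1,2,1.5,T,cc\n3,4,2.5,F,ff"

# Read x4 as character, which is the case reported above as working
DT <- fread(str, colClasses = c("integer", "numeric", "numeric", "character", "character"))

# Convert after reading; base R's as.logical() accepts "T" and "F"
DT[, x4 := as.logical(x4)]
```

This sidesteps the wrong values in x4 at the cost of a second pass over that column.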
@renkun-ken
Member

renkun-ken commented Jun 2, 2018

Looks a little bit similar to #2863 but with numeric and integers.

@jangorecki jangorecki added this to the 1.11.6 milestone Jun 3, 2018
@mattdowle
Member

Yep, that's a bad one. It's in 1.11.4 and probably 1.11.0+.

@mattdowle mattdowle removed the dev label Jun 3, 2018
@jangorecki jangorecki modified the milestones: 1.12.0, 1.11.6 Jun 6, 2018
@hstahl

hstahl commented Aug 21, 2018

The "NULL" value in colClasses no longer drops the column:

fread(str, colClasses=c("integer", "NULL", "numeric", "character", "character"))
#    x1 x2  x3 x4 x5
# 1:  1  2 1.5  T cc
# 2:  3  4 2.5  F ff

@MichaelChirico
Member

@hstahl can you file this as a separate issue?

TBH I had no idea fread ever worked like this -- the drop argument is designed for this purpose. Not sure how important consistency with read.csv is for this...

@jesskfullwood

jesskfullwood commented Aug 22, 2018

I'm running 1.11.2 and just came across the same bug as @hstahl

I have often found it's easier just to specify a list of c("character", "integer", "NULL", "integer" ...) rather than using drop, FWIW.
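
For comparison, the `drop` route mentioned earlier in the thread might look like the sketch below. It reuses the example string from the original report; `kept` is an illustrative name, and column types are left to fread's own detection.

```r
library(data.table)

str <- "x1,x2,x3,x4,x5\n1,2,1.5,T,cc\n3,4,2.5,F,ff"

# Drop x2 by name instead of passing "NULL" in colClasses
kept <- fread(str, drop = "x2")

# drop also accepts column positions
kept_by_index <- fread(str, drop = 2)
```

The trade-off is the one described above: `drop` needs the names or positions of the unwanted columns, while a colClasses vector with "NULL" entries keeps types and exclusions in one place.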
