Closes #563. fread gains encoding arg, also fixes windows enc issue.
arunsrinivasan committed Aug 25, 2015
1 parent 88dffe5 commit f089fbf
Showing 4 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion R/fread.R
@@ -1,7 +1,7 @@

fread <- function(input="",sep="auto",sep2="auto",nrows=-1L,header="auto",na.strings="NA",stringsAsFactors=FALSE,verbose=getOption("datatable.verbose"),autostart=1L,skip=0L,select=NULL,drop=NULL,colClasses=NULL,integer64=getOption("datatable.integer64"),dec=if (sep!=".") "." else ",", check.names=FALSE, encoding="unknown", showProgress=getOption("datatable.showProgress"),data.table=getOption("datatable.fread.datatable")) {
if (!is.character(dec) || length(dec)!=1L || nchar(dec)!=1) stop("dec must be a single character e.g. '.' or ','")
-# handle encoding, #568
+# handle encoding, #563
if (missing(encoding)) {
encoding = NULL
} else if (!encoding %in% c("unknown", "UTF-8", "Latin-1")) {
2 changes: 1 addition & 1 deletion README.md
@@ -74,7 +74,7 @@

26. `merge.data.table` gains arguments `by.x` and `by.y`. Closes [#637](https://github.com/Rdatatable/data.table/issues/637) and [#1130](https://github.com/Rdatatable/data.table/issues/1130). No copies are made even when the specified columns aren't key columns in data.tables, and therefore much more fast and memory efficient. Thanks to @blasern for the initial PRs.

-27. `fread()` gains `eocnding` argument. Acceptable values are "unknown", "UTF-8" and "Latin-1" with default value of "unknown". Closes [#568](https://github.com/Rdatatable/data.table/issues/568). Thanks to @BenMarwick for the original report and to the many requests from others, and Q on SO.
+27. `fread()` gains `eocnding` argument. Acceptable values are "unknown", "UTF-8" and "Latin-1" with default value of "unknown". Closes [#563](https://github.com/Rdatatable/data.table/issues/563). Thanks to @BenMarwick for the original report and to the many requests from others, and Q on SO.
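
Editorial note, not part of this commit: the `encoding` argument described in item 27 could be exercised roughly as below. This is a minimal sketch, assuming an installed data.table whose `fread()` accepts `encoding`. The sample text and the variable names `txt` and `dt` are made up for illustration, and the non-ASCII letters are written as `\u` escapes so the script does not depend on the file's own encoding.

```r
library(data.table)

# Hypothetical sample input: a header row plus two rows of non-ASCII letters.
# The \u escapes make R store the string as UTF-8 regardless of locale.
txt <- "A,B\n\u0105,\u017e\n\u016b,\u012f\n"

# Ask fread() to mark the strings it reads as UTF-8.
# Per the README entry, the accepted values are "unknown", "UTF-8" and "Latin-1".
dt <- fread(txt, sep = ",", header = TRUE, encoding = "UTF-8")

# Inspect the encoding marks that ended up on each column.
print(unique(unlist(lapply(dt, Encoding))))
```

Leaving `encoding` at its default of "unknown" keeps the strings unmarked, which is the case test 1548.1 in this commit checks.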

#### BUG FIXES

2 changes: 1 addition & 1 deletion inst/tests/tests.Rraw
@@ -6734,7 +6734,7 @@ test(1546, set(df1, grep("^[ ]*$", df1$cats), 1L, NA_integer_), df2)
foo <- function(x, y, ...) { getdots() }
test(1547, foo(1L, 5L, a=2L, "c"), c("2", "c"))

-# Fix for encoding issues in windows, #568
+# Fix for encoding issues in windows, #563
# perhaps a better way to check exact output in addition to testing encoding?
text="A,B\ną,ž\nū,į\nų,ė\nš,ę\n"
test(1548.1, unique(unlist(lapply(fread(text, sep=",", header=TRUE), Encoding))), "unknown")
2 changes: 1 addition & 1 deletion src/fread.c
@@ -425,7 +425,7 @@ SEXP readfile(SEXP input, SEXP separg, SEXP nrowsarg, SEXP headerarg, SEXP nastr
clock_t t0 = clock();
ERANGEwarning = FALSE; // just while detecting types, then TRUE before the read data loop

-// Encoding, #568: Borrowed from do_setencoding from base R
+// Encoding, #563: Borrowed from do_setencoding from base R
// https://github.com/wch/r-source/blob/ca5348f0b5e3f3c2b24851d7aff02de5217465eb/src/main/util.c#L1115
// Check for mkCharLenCE function to locate as to where where this is implemented.
cetype_t ienc;