robust encoding in fread (like 'fread("iconv -f ISO-8859-1 -t UTF-8 mytextfile.txt")') #1748

Closed
andreasio opened this issue Jun 22, 2016 · 5 comments

Comments

@andreasio

Since #563 is closed, I am adding this as a feature request.

fread still causes trouble with most of the files I work with (Danish Windows files). For example, I have an ISO-8859-1 Windows file that I can't read correctly with fread on Linux. This workaround (#563 (comment)) works, and read.csv works (i.e. it produces the correct øæå letters instead of e.g. \xe6), but fread(..., encoding = "Latin-1") does not.

Sincerely

@arunsrinivasan
Member

arunsrinivasan commented Jun 22, 2016

Please provide a reproducible example.

Possible duplicate of #1726

@andreasio
Author

path <- "http://is.gd/0UMAfv"
testdata <- fread(path)

Now, when you type testdata$, RStudio responds with this message the moment you enter the $ sign:

Error in tolower(completions) : invalid multibyte string 1

names(testdata) returns this:

names(testdata)

[1] "F\xe6rdig"           "Forventet udl\xe6rt" "Afslut.\xe5rsag"     "Virk. navn"         
[5] "Lsted_id"  
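
For readers following along, here is a minimal base-R sketch of how one might confirm that these names hold raw Latin-1 bytes rather than valid UTF-8. It assumes a UTF-8 session (as on a typical Linux box) and recreates the names by hand instead of downloading the file; the variable name `nms` is just for illustration.

```r
# Column names as printed above: Latin-1 bytes (0xE6 = æ, 0xE5 = å) with no
# encoding mark. They are typed in by hand here, not read from the file.
nms <- c("F\xe6rdig", "Forventet udl\xe6rt", "Afslut.\xe5rsag",
         "Virk. navn", "Lsted_id")

# validUTF8() flags which strings are well-formed UTF-8. The three names
# with a lone 0xE6/0xE5 byte are not; the two pure-ASCII names are.
is_valid <- validUTF8(nms)
print(is_valid)

# In a UTF-8 locale, tolower() is expected to fail on the invalid strings,
# which is presumably what RStudio's completion code runs into. The call is
# wrapped in tryCatch() so the script keeps running either way.
res <- tryCatch(tolower(nms), error = function(e) conditionMessage(e))
print(res)
```

If the first three entries come back `FALSE`, the bytes were read as-is and never re-encoded, which matches the `\xe6` escapes in the output above.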

None of these issues happen if I run the same code on Windows, or if I save the file and do

fread("iconv -f ISO-8859-1 -t UTF-8 filename")
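
As a rough in-R counterpart to the shell `iconv` workaround, the already-read names can be re-encoded with base R's `iconv()`. This is only a sketch under the assumption that the bytes really are ISO-8859-1; `nms` and `fixed` are illustrative names, and the strings are typed in by hand rather than read from the file.

```r
# Names as returned in the report: Latin-1 bytes with no encoding mark.
nms <- c("F\xe6rdig", "Forventet udl\xe6rt", "Afslut.\xe5rsag",
         "Virk. navn", "Lsted_id")

# Re-encode from ISO-8859-1 to UTF-8. Pure-ASCII entries pass through
# unchanged; the others become valid UTF-8 strings.
fixed <- iconv(nms, from = "ISO-8859-1", to = "UTF-8")

print(validUTF8(fixed))  # every entry should now be well-formed UTF-8
print(fixed)
```

With data.table loaded, something like `setnames(testdata, fixed)` should then apply the repaired names. Character columns with the same problem would need the same conversion, so converting the file up front (as in the command above) is likely simpler.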

Hope this helps

Sincerely

@jangorecki
Member

jangorecki commented Jun 24, 2016

@andreasio is it what you expect?

path <- "http://is.gd/0UMAfv"
testdata <- fread(path, encoding="Latin-1")
names(testdata)
#[1] "Færdig"           "Forventet udlært" "Afslut.årsag"     "Virk. navn"       "Lsted_id"   

If so, please re-run on the latest devel version; it may have been fixed since you installed it. If that doesn't help, please come back with the output of sessionInfo().
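
A small sketch of the details that are probably worth posting alongside sessionInfo() for an encoding problem like this one. The locale checks are an addition of this sketch, not something requested above, and the data.table lookup is guarded in case the package is not installed.

```r
# Installed data.table version, if the package is available.
dt_version <- if (requireNamespace("data.table", quietly = TRUE)) {
  as.character(utils::packageVersion("data.table"))
} else {
  NA_character_
}
print(dt_version)

# Locale details: whether the session is UTF-8 or Latin-1 affects how
# unmarked bytes are interpreted and printed.
loc <- l10n_info()
print(loc[c("MBCS", "UTF-8", "Latin-1")])
print(Sys.getlocale("LC_CTYPE"))

# Full session report, as requested.
print(sessionInfo())
```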

@andreasio
Author

Yes, that's what I expect, but that's not what happens on my Linux box.
@andreasio
Author

In 1.9.7 it works as expected. Thanks.