evaluate losing character encoding information of arguments #74
Comments
This allows one to specify the encoding with which the code is parsed; the value is passed on to `base::parse()`. Previously the encoding was implicitly "unknown", so the code was assumed to be ASCII only. Fixes r-lib#74
On macOS with the C locale:
$ LANG="" R -e 'x <- c("寿司"); cl <- call("Encoding", x); eval(cl)'
> x <- c("寿司"); cl <- call("Encoding", x); eval(cl)
[1] "unknown"
On Windows with the Chinese locale:
> x <- c("寿司")
> cl <- call('Encoding', x)
>
> eval(cl)
[1] "unknown"
> Sys.getlocale()
[1] "LC_COLLATE=Chinese (Simplified)_People's Republic of China.936;LC_CTYPE=Chinese (Simplified)_People's Republic of China.936;LC_MONETARY=Chinese (Simplified)_People's Republic of China.936;LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic of China.936"
The problem with an English locale on Windows is known (#59, #66), and I don't think we can fix it in evaluate without a proper fix in base R. I recommend that you post your original problem so we can see whether it can be fixed.
@yihui it doesn't have to be UTF-8 but I the result of
The root problem here is that
That is exactly what I was thinking. Let me think a bit longer about it (edge cases, Windows, etc).
We have been experiencing problems due to `evaluate()` losing character encoding information. The problem goes unnoticed on platforms that treat `unknown` as `UTF-8`, but it leads to serious interoperability problems when serializing data. It is important that the encoding bit is retained.
A minimal example that returns the correct answer for `eval()`, but for which `evaluate()` returns `unknown`.
On Windows (English) the strings even get garbled:
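To make the "encoding bit" in this report concrete, here is a minimal base-R sketch. It is not the reporter's original example and it does not call `evaluate()`. The helpers `strip_mark()` and `preserves_encoding()` are hypothetical names introduced only for illustration. The string is written with `\u` escapes so that R marks it as UTF-8 regardless of the session locale.

```r
# "寿司" written with \u escapes, so the parser marks the string as UTF-8
# regardless of the session locale.
x <- "\u5bff\u53f8"
Encoding(x)                             # "UTF-8"

# Evaluating a call built with the string as an argument keeps the mark.
cl <- call("Encoding", x)
eval(cl)                                # "UTF-8"

# Hypothetical helper: simulate a step that drops the mark but keeps the bytes.
strip_mark <- function(s) {
  Encoding(s) <- "unknown"
  s
}
y <- strip_mark(x)
Encoding(y)                             # "unknown"
identical(charToRaw(x), charToRaw(y))   # TRUE: same bytes, different declared encoding

# Hypothetical helper: does a transformation f keep the declared encoding?
preserves_encoding <- function(f, s) identical(Encoding(f(s)), Encoding(s))
preserves_encoding(identity, x)         # TRUE
preserves_encoding(strip_mark, x)       # FALSE
```

The same check could presumably be pointed at any evaluation path by passing a wrapper function as `f`. Once the mark is gone, a consumer that assumes the native encoding will misread the unchanged bytes on platforms where the native encoding is not UTF-8, which is the interoperability concern described above.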