File encoding doesn't work for some computers #5
Comments
@gracilis Thanks for sharing. I see that the file is encoded in UTF-8 because it has some non-ASCII characters. Perhaps in the future it would be simpler to remove (or replace) the non-ASCII characters from this file so that it is less likely to cause issues. Can you please send us output from running sessionInfo()?
Hi Peter,
Below is my sessionInfo when running the code, but the issue only occurs
on a subset of computers whose defaults are set to ASCII to easily type in
Mandarin (I think). Should I have one of those students send the
sessionInfo for comparison?
sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] cowplot_0.9.3 readr_1.1.1 ggplot2_3.0.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.18 digest_0.6.15 withr_2.1.2 dplyr_0.7.4 assertthat_0.2.0 grid_3.4.4 plyr_1.8.4
[8] R6_2.2.2 gtable_0.2.0 magrittr_1.5 scales_0.5.0 pillar_1.2.2 rlang_0.2.0 lazyeval_0.2.1
[15] bindrcpp_0.2.2 labeling_0.3 tools_3.4.4 glue_1.2.0 munsell_0.4.3 hms_0.4.2 compiler_3.4.4
[22] pkgconfig_2.0.1 colorspace_1.3-2 bindr_0.1.1 knitr_1.20 tibble_1.4.2
On Tue, Sep 4, 2018 at 9:29 PM, Peter Carbonetto wrote:
@gracilis Thanks for sharing. I see that
the file is encoded in UTF-8 because it has some non-ASCII characters.
Perhaps in the future it would be simpler to remove (or replace) the
non-ASCII characters from this file so that it is less likely to cause
issues.
Can you please send us output from running sessionInfo()?
--
Grace Hansen
MD/PhD Candidate | University of Chicago
gthansen@uchicago.edu
@gracilis I will need the |
Describe the bug
On computers whose keyboard is set to a non-English setting, e.g. for Mandarin-speaking users, files appear to be treated as ASCII-encoded, and some R commands throw errors unless the file encoding is stated explicitly.
To Reproduce
papers <- read.csv("~/BSD-QBio4/tutorials/basic_computing_2/data/citations/nature_neuroscience.csv", stringsAsFactors = FALSE)
papers$TitleLength <- nchar(papers$Title)
Error at [something]:
invalid multibyte string at [something]
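Since the error shows up only on some machines, it may help to compare locale settings between an affected and an unaffected computer. The lines below are a minimal sketch using base R only; whether the locale is the actual cause on the students' machines is an assumption, not something confirmed in this thread.

```r
# Report the character-type locale and whether R considers it a UTF-8 locale.
# Run this on both a working and a failing machine and compare the output.
ctype <- Sys.getlocale("LC_CTYPE")
is_utf8 <- l10n_info()[["UTF-8"]]
cat("LC_CTYPE:", ctype, "\n")
cat("UTF-8 locale:", is_utf8, "\n")
```

If the second line prints FALSE on the failing machines only, that would point to the locale rather than the file itself.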
FIX:
Read the CSV with an explicit file encoding, e.g.
papers <- read.csv("~/BSD-QBio4/tutorials/basic_computing_2/data/citations/nature_neuroscience.csv", stringsAsFactors = FALSE, fileEncoding = 'ASCII')
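The comments on this issue note that the file itself is UTF-8 with a few non-ASCII characters, so another option worth trying is to declare UTF-8 when reading. The sketch below is illustrative only: it uses a small temporary file as a hypothetical stand-in for nature_neuroscience.csv, and it has not been checked on the affected machines. It relies on the `encoding` argument of read.csv, which marks the strings as UTF-8 rather than converting them, and on iconv returning NA for strings that cannot be converted to ASCII.

```r
# Hypothetical stand-in for the tutorial CSV: one title with a non-ASCII
# character, written as raw UTF-8 bytes to a temporary file.
tmp <- tempfile(fileext = ".csv")
titles <- c("Title", "Na\u00efve models of cortex", "Plain ASCII title")
writeLines(enc2utf8(titles), tmp, useBytes = TRUE)

# Mark the strings that are read as UTF-8 so nchar() counts characters.
papers <- read.csv(tmp, stringsAsFactors = FALSE, encoding = "UTF-8")
papers$TitleLength <- nchar(papers$Title)

# Flag rows with non-ASCII characters: iconv() gives NA when a string
# cannot be represented in ASCII.
papers$HasNonASCII <- is.na(iconv(papers$Title, from = "UTF-8", to = "ASCII"))
print(papers)

unlink(tmp)
```

The HasNonASCII column gives a quick way to locate the titles that would need editing if the non-ASCII characters were to be removed or replaced in the data file.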