Character encoding must be UTF-8, Latin-1 or bytes #6

Closed
blset opened this issue Mar 25, 2019 · 5 comments

blset commented Mar 25, 2019

Hi

I get this error when running the following program, which is saved with valid UTF-8 encoding:

Error in (function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", :
Character encoding must be UTF-8, Latin-1 or bytes

library(cdata)
options(stringsAsFactors = FALSE)
data = data.frame(
  privée = 1:10,
  publique = -(1:10)
)
ct = data.frame(
  variable = c("privée", "publique"),
  value = c("privée", "publique")
)


rowrecs_to_blocks(data, ct)

Removing the accent is a workaround.

This is on macOS.
thanks


> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-apple-darwin18.2.0 (64-bit)
Running under: macOS Mojave 10.14.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /usr/local/Cellar/openblas/0.3.5/lib/libopenblasp-r0.3.5.dylib

locale:
[1] fr_FR.UTF-8/fr_FR.UTF-8/fr_FR.UTF-8/C/fr_FR.UTF-8/fr_FR.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] cdata_1.0.7

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       wrapr_1.8.4      dplyr_0.7.6      assertthat_0.2.0
 [5] grid_3.5.2       plyr_1.8.4       R6_2.3.0         gtable_0.2.0    
 [9] magrittr_1.5     scales_0.5.0     ggplot2_3.1.0    pillar_1.2.3    
[13] rlang_0.3.1      lazyeval_0.2.1   bindrcpp_0.2.2   tools_3.5.2     
[17] glue_1.3.0       purrr_0.3.0      munsell_0.4.3    yaml_2.2.0      
[21] compiler_3.5.2   pkgconfig_2.0.1  colorspace_1.3-2 tidyselect_0.2.4
[25] bindr_0.1.1      tibble_1.4.2  
@JohnMount
Member

JohnMount commented Mar 25, 2019

Sorry you ran into trouble. And thank you for taking the time to file an issue.

My theory is that this is a bug with knitr, reprex, or RStudio, so you may want to file it with them.

If I run your code in an RStudio console it works.

   variable value
1    privée     1
2  publique    -1
3    privée     2
4  publique    -2
5    privée     3
6  publique    -3
7    privée     4
8  publique    -4
9    privée     5
10 publique    -5
11   privée     6
12 publique    -6
13   privée     7
14 publique    -7
15   privée     8
16 publique    -8
17   privée     9
18 publique    -9
19   privée    10
20 publique   -10

However, if I run the exact same code as a reprex it fails, exactly as you saw. The signature in the error message looks like that of base::order().

reprex::reprex({
library("cdata")
options(stringsAsFactors = FALSE)
data = data.frame(
  privée = 1:10,
  publique = -(1:10)
)
ct = data.frame(
  variable = c("privée", "publique"),
  value = c("privée", "publique")
)


rowrecs_to_blocks(data, ct)

packageVersion("cdata")
sessionInfo()
})
library("cdata")
options(stringsAsFactors = FALSE)
data = data.frame(
  privée = 1:10,
  publique = -(1:10)
)
ct = data.frame(
  variable = c("privée", "publique"),
  value = c("privée", "publique")
)


rowrecs_to_blocks(data, ct)
#> Error in (function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", : Character encoding must be UTF-8, Latin-1 or bytes

packageVersion("cdata")
#> [1] '1.0.8'
sessionInfo()
#> R version 3.5.0 (2018-04-23)
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS High Sierra 10.13.6
#> 
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] cdata_1.0.8
#> 
#> loaded via a namespace (and not attached):
#>  [1] compiler_3.5.0  wrapr_1.8.5     magrittr_1.5    tools_3.5.0    
#>  [5] htmltools_0.3.6 yaml_2.2.0      Rcpp_1.0.0      stringi_1.3.1  
#>  [9] rmarkdown_1.11  highr_0.7       knitr_1.21      stringr_1.4.0  
#> [13] xfun_0.4        digest_0.6.18   evaluate_0.13

Created on 2019-03-25 by the reprex package (v0.2.1)
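
The resemblance between the error text and base::order() can be checked directly. The following is a minimal sketch, not part of the original report: it compares the argument names of base::order() with the names visible in the truncated error message. The variable names order_args and error_args are illustrative only.

```r
# Argument names of base::order(), for comparison with the error text.
order_args <- names(formals(base::order))
print(order_args)

# Names visible in the (truncated) error message:
#   function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", :
error_args <- c("...", "na.last", "decreasing", "method")

# TRUE here supports the guess that the failing call is base::order().
print(identical(order_args, error_args))
```

A match only suggests which function raised the error. It does not explain why the call fails under reprex or knitr but not in the console.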

@JohnMount
Member

JohnMount commented Mar 25, 2019

I am moving this issue over to wrapr for more research: WinVector/wrapr#9.

I have a smaller example here:

ct = data.frame(
  variable = c("privée", "publique"),
  value = c("privée", "publique"),
  stringsAsFactors = FALSE
)
wrapr::has_no_dup_rows(ct)
#> Error in (function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", : Character encoding must be UTF-8, Latin-1 or bytes

Created on 2019-03-25 by the reprex package (v0.2.1)

@JohnMount
Member

I have filed the issue with knitr as yihui/knitr#1690, and I am researching a work-around in wrapr.

---
title: "Runs in console, but won't knit"
output: html_document
---

```{r, error=TRUE}
ct = data.frame(
  variable = c("privée", "publique"),
  value = c("privée", "publique"),
  stringsAsFactors = FALSE
)

do.call(order, as.list(ct))

do.call(order, c(as.list(ct), list(method = "radix")))
```

@JohnMount
Member

Fixed our issue with WinVector/wrapr@53ed02e and filed the underlying issue as yihui/knitr#1690. So the work-around for now is to use wrapr 1.8.6, which can be installed with devtools::install_github("WinVector/wrapr").
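
To confirm that the installed wrapr is at least the version named above, a check along the following lines may help. This is only a sketch: min_version and has_fix are illustrative names, and the requireNamespace() guard is an added assumption so that the snippet also runs where wrapr is not installed.

```r
# Version named in the comment above as containing the work-around.
min_version <- "1.8.6"

if (requireNamespace("wrapr", quietly = TRUE)) {
  # packageVersion() returns a package_version object, which can be
  # compared directly against a version string.
  has_fix <- utils::packageVersion("wrapr") >= min_version
  message("wrapr >= ", min_version, ": ", has_fix)
} else {
  message("wrapr is not installed")
}
```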

@blset
Author

blset commented Mar 25, 2019

thank you very much
