Encoding issue during knitting on Windows #1944

cderv · 2021-01-14T13:33:17Z

Initial Context

This was reported by @thomasp85 while working on fonts support for graphic design.

Here is the R Markdown document that shows the initial issues:

---
title: "Untitled"
author: "C. Dervieux"
date: "13/01/2021"
output: html_document
---

```{r setup}
library(ggplot2)
preview_devices <- function(p, width = 2, height = 1) {
  quartz_file <- fs::path(knitr::fig_path(),  "windows.png")
  if (!dir.exists(dirname(quartz_file))) dir.create(dirname(quartz_file), recursive = TRUE)
  png(quartz_file, width, height, units = 'in', res = 300, type = "windows")
  plot(
    p + 
      ggtitle("  Windows device") + 
      theme(plot.title = element_text(size = 10, hjust = 0.5), plot.title.position = 'plot')
  )
  dev.off()
  cairo_file <- fs::path(knitr::fig_path(),  "cairo.png")
  png(cairo_file, width, height, units = 'in', res = 300, type = "cairo")
  plot(
    p + 
      ggtitle("  Cairo device") + 
      theme(plot.title = element_text(size = 10, hjust = 0.5), plot.title.position = 'plot')
  )
  dev.off()
  ragg_file <- fs::path(knitr::fig_path(),  "ragg.png")
  ragg::agg_png(ragg_file, width, height, units = 'in', res = 300)
  plot(
    p + 
      ggtitle("  Ragg device") + 
      theme(plot.title = element_text(size = 10, hjust = 0.5), plot.title.position = 'plot')
  )
  dev.off()
  list(quartz = quartz_file, cairo = cairo_file, ragg = ragg_file)
}
```

* * *

## Support of non-latin scripts

A device should recognise and properly handle scripts that flow in a
direction other than left-to-right

- The graphic engine in R does not permit devices to handle vertical text 😞

```{r, fig.show='hold'}
hebrew_text <- "זהו טקסט בעברית"
arabic_text <- "هذا نص باللغة العربية"
Encoding(arabic_text)
p <- ggplot() + 
  geom_text(aes(x = 0, y = 1:2, label = c(arabic_text, hebrew_text)), family = "Arial") + 
  expand_limits(y = c(0, 3)) +
  theme_void() + 
  theme(panel.background = element_rect('gray90', 'white', 3))
files <- preview_devices(p)
knitr::include_graphics(files$quartz)
knitr::include_graphics(files$cairo)
knitr::include_graphics(files$ragg)
```

This leads to these graphs when the chunk is executed from the IDE in the R console

but to this in the knitted document

Issues with encoding in knitr

Using a test.Rmd file encoded in UTF-8 with this content

```{r text}
hebrew_text <- "זהו טקסט בעברית"
Encoding(hebrew_text)
arabic_text <- "هذا نص باللغة العربية"
Encoding(arabic_text)
```

```{r}
hebrew_text
arabic_text
```

the result is different when executing the chunks in the IDE (which runs the code in the R console)

than when knitting with knitr::knit("test.Rmd"), which results in

```r
hebrew_text <- "זהו טקסט בעברית"
Encoding(hebrew_text)
```

```
## [1] "unknown"
```

```r
arabic_text <- "هذا نص باللغة العربية"
Encoding(arabic_text)
```

```
## [1] "unknown"
```


```r
hebrew_text
```

```
## [1] "<U+05D6><U+05D4><U+05D5> <U+05D8><U+05E7><U+05E1><U+05D8> <U+05D1><U+05E2><U+05D1><U+05E8><U+05D9><U+05EA>"
```

```r
arabic_text
```

```
## [1] "<U+0647><U+0630><U+0627> <U+0646><U+0635> <U+0628><U+0627><U+0644><U+0644><U+063A><U+0629> <U+0627><U+0644><U+0639><U+0631><U+0628><U+064A><U+0629>"
```

It seems that some conversions happen during the evaluation process and lead to incorrect handling of those UTF-8 strings.
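To illustrate the kind of conversion suspected here, the following is a minimal sketch, not the actual code path in knitr or evaluate. It assumes a recent R version in which iconv() supports sub = "Unicode", and it uses latin1 as a stand-in for a Windows native encoding that cannot represent Hebrew. On platforms where iconv() reports these characters as non-convertible, the result should use the same <U+xxxx> form as the knitted output above.

```r
# Sketch only: emulate a lossy conversion from UTF-8 to a single-byte encoding.
# "latin1" is an assumption standing in for a non-UTF-8 native encoding.

# First word of the Hebrew sample, written with \u escapes so that this
# script does not depend on the encoding of the source file.
hebrew_word <- "\u05d6\u05d4\u05d5"
Encoding(hebrew_word)

# sub = "Unicode" asks iconv() to write each non-convertible character
# as <U+xxxx> instead of returning NA.
lossy <- iconv(hebrew_word, from = "UTF-8", to = "latin1", sub = "Unicode")
lossy

# After the conversion only ASCII characters remain, so no encoding
# mark is kept on the string.
Encoding(lossy)
```

Once a string has gone through such a conversion, the original characters cannot be recovered from it, which would explain why the plots in the knitted document differ from those produced in the console.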

@yihui you may already know about these limitations regarding encoding. Are we missing something?


cderv · 2021-01-14T13:40:52Z

And I think this is known and related to old encoding issues: r-lib/evaluate#59 and r-lib/evaluate#66

There was a very similar one that had a fix, but it seems it did not fix this completely: r-lib/evaluate#74

yihui · 2021-01-24T02:52:33Z

Yes, it's a known issue: r-lib/evaluate#59. Unfortunately, there's nothing we can do about it except wait until the UTF-8 build of R is officially available: https://developer.r-project.org/Blog/public/2020/07/30/windows/utf-8-build-of-r-and-cran-packages/index.html

Actually there could be a workaround, but it depends on whether your Windows supports the locale. That is, you can call Sys.setlocale(, "LANGUAGE") in .Rprofile. In this case, I don't know what the language name is since I know nothing about Hebrew. I've tried other languages like Chinese, German, and French, etc.
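A minimal sketch of what this workaround could look like in .Rprofile is given below. The locale name "Hebrew" is an assumption and was not confirmed in this thread; it has to be replaced by whatever name your Windows installation accepts. The variable names are hypothetical. The sketch relies on Sys.setlocale() returning an empty string (with a warning) when the requested locale cannot be honored.

```r
# Hypothetical .Rprofile snippet. "Hebrew" is an assumed locale name;
# adjust it to a name that your system actually supports.
target_locale <- "Hebrew"

# Try to switch the locale for all categories, as in the call suggested
# above. The warning is suppressed because failure is checked explicitly.
locale_result <- suppressWarnings(Sys.setlocale("LC_ALL", target_locale))

if (identical(locale_result, "")) {
  # The request could not be honored and the previous locale stays in use.
  message("Locale '", target_locale, "' is not supported on this system")
} else {
  message("Locale is now: ", locale_result)
}
```

Checking the return value makes it visible at startup whether the workaround took effect, instead of silently keeping the previous locale.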

yihui · 2022-03-24T05:10:10Z

R 4.2.0 is coming in about a month: https://developer.r-project.org

I guess the current R-devel already works: https://cloud.r-project.org/bin/windows/base/rdevel.html

If not, we can reopen this issue and investigate further.
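To check whether a given R session is likely to be affected, one option is to look at the R version and at whether the native encoding is UTF-8. The sketch below uses only base R; the variable names are hypothetical, and treating "Windows with R older than 4.2.0" as the affected setup is an assumption based on this thread.

```r
# Sketch: report whether the current session uses UTF-8 as native encoding.
is_utf8_session <- isTRUE(l10n_info()[["UTF-8"]])
r_version <- getRversion()

message("R ", format(r_version), "; native encoding is UTF-8: ", is_utf8_session)

# Assumption from this thread: the problem concerns Windows builds of R
# that do not use UTF-8 as their native encoding (before R 4.2.0).
possibly_affected <- .Platform$OS.type == "windows" &&
  r_version < "4.2.0" &&
  !is_utf8_session

if (possibly_affected) {
  message("Non-ASCII text may be converted to <U+xxxx> escapes when knitting")
}
```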

github-actions · 2022-09-21T05:42:14Z

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue by following the issue guide (https://yihui.org/issue/), and link to this old issue if necessary.

cderv added the bug label Jan 14, 2021

cderv mentioned this issue Dec 8, 2021: Unicode symbols in ggplot fail to render, but only in markdown rstudio/rmarkdown#2256 (open)

cderv mentioned this issue Jan 5, 2022: different results between chunk and knitted markdown / Cyrillic letters #2091 (closed)

yihui closed this as completed Mar 24, 2022

github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2022
