Encoding issues in Hebrew. Lost in translation #234
Probably the error comes from here: |
Thanks, @yutannihilation. Does this also happen in your locale? I wonder if there's a string that should have been tagged as UTF-8 but isn't. |
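For readers following along, here is a minimal base R sketch of what "tagged as UTF-8" means. It makes no claim about where reprex or knitr might lose the tag; the variable names are illustrative only.

```r
# Hebrew alef written as a Unicode escape, so this script stays pure ASCII.
alef <- "\u05d0"
Encoding(alef)               # "UTF-8": strings built from \u escapes are marked

# Round-tripping through raw bytes keeps the bytes but drops the mark.
alef_bytes <- charToRaw(alef)
untagged <- rawToChar(alef_bytes)
Encoding(untagged)           # "unknown": R will assume the native encoding

# Re-declare the encoding; the bytes themselves are not changed.
retagged <- untagged
Encoding(retagged) <- "UTF-8"
identical(charToRaw(retagged), alef_bytes)
```

An untagged string like `untagged` is interpreted in the session's native encoding, which is harmless in a UTF-8 locale but can go wrong on a Windows code page.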
@yutannihilation that seems to be an accurate assessment. When I run an RMarkdown file that looks like this with
...I get the following
|
Update: when I run https://stackoverflow.com/questions/41717781/warning-input-string-not-available-in-this-locale |
Perhaps Line 298 in de45a37
|
This might reproduce on my locale (
"<U+05D0>"
#> [1] "<U+05D0>"
Created on 2019-01-23 by the reprex package (v0.2.1)
I'm afraid this might be a result of knitr's breaking change, which seems to require reprex to choose a different strategy for encoding. |
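One way to see where a `<U+05D0>` escape can come from is to ask base R to convert the character to the session's native encoding. This is only a sketch of that general mechanism, under the assumption that the native encoding cannot represent Hebrew; it is not a trace of what reprex actually does.

```r
alef <- "\u05d0"  # Hebrew alef as an escape, so the script itself is ASCII

# l10n_info() reports whether the current session runs in a UTF-8 locale.
is_utf8_session <- isTRUE(l10n_info()[["UTF-8"]])

# enc2native() converts to the native encoding; characters that encoding
# cannot represent are typically substituted with an escape.
native <- enc2native(alef)

if (identical(native, alef)) {
  message("Native encoding can represent the character unchanged.")
} else {
  message("Character was substituted during conversion: ", native)
}
```

The result depends on the locale, so no single output is shown here.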
Ah, sorry. It seems I'm wrong if this also occurs with
|
@isteves If you add rmarkdown::render("/path/to/file.Rmd", encoding = "UTF-8") |
Can you try a Japanese character in the Japanese locale, please? |
Japanese characters don't raise errors for me. |
Interesting. @isteves: What about Hebrew characters other than א? (Not looking for a comprehensive answer here.) |
@yutannihilation what does your locale info look like? ( |
Here is my locale, and a Japanese character for example.
Sys.getlocale(); "髙"
#> [1] "LC_COLLATE=Japanese_Japan.932;LC_CTYPE=Japanese_Japan.932;LC_MONETARY=Japanese_Japan.932;LC_NUMERIC=C;LC_TIME=Japanese_Japan.932"
#> [1] "髙"
Created on 2019-01-23 by the reprex package (v0.2.1)
Hmm, thanks. |
Can you please try |
@krlmlr it did the trick! 🎊 |
@krlmlr I realized maybe I jumped the gun. The error from before no longer shows up, but I get this as a reprex output for Hebrew:
"׳—׳•׳�׳•׳¡"
#> [1] "׳—׳•׳\236׳•׳¡"
The other examples get
"<U+9AD9>"
#> [1] "<U+9AD9>"
...I think this could be a different bug maybe? |
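The garbled Hebrew above has the shape you would expect if UTF-8 bytes were read one byte at a time as Windows-1255. The sketch below checks that idea at the byte level. Both the input word and the small lookup table are assumptions: the word was picked because its bytes line up with the output, and the table is a hand-copied excerpt of the Windows-1255 code page covering only the bytes needed.

```r
# Assumed input: five Hebrew letters, written as escapes to keep this ASCII.
word <- "\u05d7\u05d5\u05de\u05d5\u05e1"
utf8_bytes <- as.integer(charToRaw(word))
keys <- format(as.hexmode(utf8_bytes))   # d7 97 d7 95 d7 9e d7 95 d7 a1

# Hand-copied excerpt of the Windows-1255 table (assumption).
# NA marks a byte that has no assigned character in that code page.
cp1255 <- c("d7" = "\u05f3", "97" = "\u2014", "95" = "\u2022",
            "9e" = NA, "a1" = "\u00a1")
misread <- unname(cp1255[keys])

# Show unassigned bytes as an octal escape, as in the printed output above.
octal <- paste0("\\", format(as.octmode(utf8_bytes)))
shown <- ifelse(is.na(misread), octal, misread)
paste(shown, collapse = "")
```

Byte `0x9e` is octal `236`, which would account for the `\236` in the printed result if this reading is right.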
It's getting better, though. Can you please try again, I updated the branch. |
I'm getting the same behavior. Btw I bumped into an old issue of mine ( Is there a way to trace all functions that any given function calls? (to quickly check if two functions are connected) |
@yutannihilation thanks for the detailed explanation! I guess a manual trace always works... Yeah |
I am hopefully doing a small reprex release at this very moment, to update a test for fs v1.3.1. That is intentionally a very low-risk release. But assuming that goes through in a reasonable amount of time, I want to make some meatier changes soon in dev and let people accumulate some experience. This is a long thread and the knitr/xfun context has changed a lot wrt UTF-8. PR #237 from @krlmlr looks like the way to go. @isteves and @yutannihilation would you consider updating your reprexes or thoughts here, after updating your entire knitr/rmarkdown stack? |
Thanks for the notice. On my locale, the results of the reprexes are the same with the current master. |
@jennybc on Hebrew locale, I'm getting the garbled output but no error:
(strangely, the garbles are slightly different than the ones earlier in this thread) Including my session info below in case I need to update any other packages:
|
I think between:
reprex is handling encoding as well as its dependencies allow (most especially the difficulties around encoding on Windows in R itself). I'm closing this. If anyone has a new challenging example, especially one that fails with dev reprex + knitr v1.23, please add it to #262. |
@krlmlr
This is on a Windows 10 machine with Hebrew language