Using SET_STRING_ELT and Rf_mkCharLenCE to handle output string encoding #7

Open

qinwf opened this issue May 3, 2016 · 1 comment

Owner

qinwf commented May 3, 2016

> (res = re2_match("中文","中文",value = T))
     ?nocapture
[1,] "涓枃"   
> Encoding(res) = "UTF-8"
> res
     ?nocapture
[1,] "中文"


gagolews commented May 3, 2016

I recommend that the output always be UTF-8, irrespective of the input encoding. This is how I do it in stringi too.
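A minimal R sketch of the symptom in the report, assuming the returned string holds valid UTF-8 bytes but carries no encoding mark (which is what the `Encoding(res) = "UTF-8"` workaround suggests). The variable names are illustrative only, and the bytes are built with `as.raw()` so the example does not depend on the encoding of the script file:

```r
# UTF-8 bytes of "中文" (U+4E2D, U+6587), written out explicitly.
raw_utf8 <- as.raw(c(0xe4, 0xb8, 0xad, 0xe6, 0x96, 0x87))

# rawToChar() returns the bytes as a string without declaring an encoding,
# so R interprets them in the native locale when printing.
s <- rawToChar(raw_utf8)
Encoding(s)

# Declaring the bytes as UTF-8 changes only the mark, not the bytes.
Encoding(s) <- "UTF-8"
Encoding(s)                  # "UTF-8"

nchar(s, type = "bytes")     # 6
nchar(s, type = "chars")     # 2
utf8ToInt(s)                 # 20013 25991
```

On the C side, the calls named in the issue title would set the same mark at the point where the result is created, passing `CE_UTF8` to `Rf_mkCharLenCE` before storing the element with `SET_STRING_ELT`, so callers would not need to fix the encoding afterwards.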

qinwf pushed a commit that referenced this issue


          Fix: UTF-8 string return #7

dc4de38

qinwf pushed a commit that referenced this issue


          Fix: UTF-8 string return #7

af298f4

qinwf pushed a commit that referenced this issue


          Fix: UTF-8 string return in match #7

a63bafb

qinwf pushed a commit that referenced this issue


          Fix: set pattern string encoding #7

4f3abcf

qinwf mentioned this issue

Track R-GSOC-2016 Progress #2

Open
