Rcpp exception with UTF-8 strings on Windows #10

Open
qinwf opened this issue Jun 13, 2016 · 5 comments

@qinwf
Owner

qinwf commented Jun 13, 2016

This Rcpp issue affects the error messages for regular expressions: non-ASCII characters in the message come out garbled on Windows.

re2("this (is 测试")
#> Error: missing closing ): this (is 娴嬭瘯 
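
The garbled output is consistent with the UTF-8 bytes of 测试 being decoded in a double-byte Windows code page such as GBK (CP936). That is an assumption, since the thread does not state the locale. The sketch below only regroups the same six bytes to show why two characters turn into three; `group_bytes` is a hypothetical helper written for this illustration, not part of re2r or Rcpp.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: format the raw bytes of `s` as upper-case hex,
// split into consecutive groups of `width` bytes.
std::vector<std::string> group_bytes(const std::string& s, std::size_t width) {
    assert(width > 0);
    std::vector<std::string> groups;
    for (std::size_t i = 0; i < s.size(); i += width) {
        std::string g;
        for (std::size_t j = i; j < i + width && j < s.size(); ++j) {
            char buf[3];
            std::snprintf(buf, sizeof buf, "%02X",
                          static_cast<unsigned>(static_cast<unsigned char>(s[j])));
            g += buf;
        }
        groups.push_back(g);
    }
    return groups;
}

// The UTF-8 encoding of the two characters U+6D4B U+8BD5 is six bytes:
//   const std::string utf8 = "\xE6\xB5\x8B\xE8\xAF\x95";
// Grouped as UTF-8 (3 bytes per character): group_bytes(utf8, 3)
// Grouped as a double-byte code page would: group_bytes(utf8, 2)
```

Grouping by three yields two units, matching the two original characters. Grouping by two yields three units, matching the three unrelated characters in the error message above.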

Related problems have been discussed before:

[Rcpp-devel] Unicode on windows 1

[Rcpp-devel] Unicode on windows 2

The solution in the mailing list posts above does not solve the problem for strings in exception messages.
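
One way to see why exception messages are a separate problem: a C++ exception carries its message as a plain byte string with no encoding attached, so whatever turns `what()` into an R error has to decide how to interpret those bytes. The sketch below is plain standard C++ with no Rcpp involved; `compile_pattern` and `error_message_for` are hypothetical stand-ins, not re2r functions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a regex compile step that always reports
// a syntax error, echoing the pattern in the message.
void compile_pattern(const std::string& pattern) {
    throw std::runtime_error("missing closing ): " + pattern);
}

// Returns the message carried by the exception, exactly as what() reports it.
std::string error_message_for(const std::string& pattern) {
    try {
        compile_pattern(pattern);
    } catch (const std::exception& e) {
        // what() is a const char*: bytes only, no encoding mark.
        return e.what();
    }
    return "";
}
```

The UTF-8 bytes of the pattern pass through the exception unchanged, so any garbling would have to happen later, when the message is handed to R without being marked as UTF-8.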

I sent an email to the Rcpp mailing list about this issue. Here are links to the discussion:

[Rcpp-devel] Rcpp exception with UTF-8 strings on Windows 1

[Rcpp-devel] Rcpp exception with UTF-8 strings on Windows 2

It seems that Rcpp will not fix this soon, so I suggest rewriting the existing code with the original R C API.

@gagolews

Just take a look at the way I handle UTF-8 string input in stringi. It's pretty simple.
I suggest you add LinkingTo: stringi, call stri_enc_toutf8 on a given SEXP object, and then use STRING_ELT etc. on the resulting SEXP.

@qinwf
Owner Author

qinwf commented Jun 13, 2016

Yes, I imported stringi and all of the input strings are processed by stri_enc_toutf8.

@qinwf
Owner Author

qinwf commented Jun 22, 2016

I opened a PR in the Rcpp repo that makes this fixable with a macro, and it was merged.

@tdhock

tdhock commented Jun 24, 2016

That's great that you contributed some code to Rcpp! Good job!

Now when you use the new macro this issue is fixed, right?

So now I would say we can keep the Rcpp interface, right? (We don't need to consider rewriting the re2r interface to use the standard Rinternals.h headers.)

@gagolews

Rcpp 0.12.6 is now on CRAN

http://dirk.eddelbuettel.com/blog/2016/07/19/#rcpp_0.12.6
