-
Notifications
You must be signed in to change notification settings - Fork 993
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fread() throws correct non-ASCII messages #4751
Conversation
in order to display message correctly in R
Codecov Report
@@ Coverage Diff @@
## master #4751 +/- ##
=======================================
Coverage 99.47% 99.47%
=======================================
Files 73 73
Lines 14560 14560
=======================================
Hits 14483 14483
Misses 77 77
Continue to review full report at Codecov.
|
@@ -7,8 +7,13 @@ | |||
#include "myomp.h" | |||
#ifdef DTPY | |||
#include "py_fread.h" | |||
#define ENC2NATIVE(s) (s) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cc @st-pasha , not sure if there's something to do here on the Python side
Merge remote-tracking branch 'origin/master' into shrektan/fread-nonascii-msg # Conflicts: # inst/tests/tests.Rraw
@@ -2,6 +2,7 @@ | |||
#define dt_FREAD_R_H | |||
#define STRICT_R_HEADERS // https://cran.r-project.org/doc/manuals/r-devel/R-exts.html#Error-handling | |||
#include <R.h> | |||
#include <Rinternals.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand we need this for translateChar(mkCharCE(s, ienc))
, right?
It may be good to ask R core for public API for this functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe but I have the impression from WRE that Rinternals.h is already a public API, at least partially. For example, the string "#include <Rinternals.h>" is exhibited 13 times in WRE. We need it even for just SEXP (which is stated too in WRE). I did find one use of the word 'public' in the context of Rinternals.h:
Note 144 in full is "see The R API: note that these are not all part of the API."
I remember seeing some debate on R-devel mailing list on this over the years. Some R-core members believe that the R API is anything mentioned in WRE. But that would exclude many of the exported (i.e. non-hidden) functions that many packages use. I believe that #define USE_RINTERNALS
before #include <Rinternals.h>
(as data.table used to do in the past) does access truly internal non-API functions/structures. But including Rinternals.h is not necessarily accessing non-public API, in fact it's necessary to include it to access all the public API. Note that data.table.h
includes Rinternals.h
but that doesn't necessarily mean we are using non-public API.
With that said, translateChar
and mkCharCE
are both highlighted by name in WRE (section 5.15) and for that reason there would be no disagreement by any R-core member that those are already public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems... needlessly confusing!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. All those that have thought so have engaged. Then what you discover is that R-core members disagree between themselves. Tangent issues (such as data.table using SET_TRUELENGTH) are quickly raised. I assume what you would like is for R.h to be public and Rinternals.h to be not-public? Underlying that seemingly simple task is i) change to R over time, and ii) in our case of data.table, achievements by using SET_TRUELENGTH that some core members wish was never accessible.
Careful what you wish for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry for fuss. When I read Rinternals I had in mind USE_RINTERNALS switch. So definitely confusing.
This PR addresses the footer warning, not all messages. So making that clear in the news item. Maybe future PRs will address other messages which also include exhibits from the file.
to avoid any interpretation that the non-ASCII characters might cause the warning message
Fixes #4747
Before the PR the last line throw warnings with garbage letters. This PR fixes it.
Example code
Before
After