Treat args/env as lossy UTF-8 #12283
    pub fn as_bytes_no_nul<'a>(&'a self) -> &'a [u8] {
        if self.buf.is_null() { fail!("CString is null!"); }
        unsafe {
            cast::transmute((self.buf, self.len()))
Transmute a `raw::Slice` to avoid accidentally getting the order mixed up.
Good idea. I was copying `as_bytes()`, but this is a good excuse to fix that.
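For readers on current Rust, the same concern (building a slice from a pointer and a length without relying on tuple field order) can be sketched with `std::slice::from_raw_parts`, which names its arguments explicitly. This is only an illustration: `RawCString` is a hypothetical stand-in for the 2014-era `CString` and is not the type touched by this PR.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;
use std::slice;

/// Hypothetical stand-in for the old `CString`: a raw pointer to a
/// NUL-terminated buffer that may be null.
struct RawCString {
    buf: *const c_char,
}

impl RawCString {
    /// Byte length of the string, excluding the trailing NUL.
    fn len(&self) -> usize {
        assert!(!self.buf.is_null(), "CString is null!");
        // SAFETY: `buf` is non-null and assumed to point at a NUL-terminated buffer.
        unsafe { CStr::from_ptr(self.buf).to_bytes().len() }
    }

    /// Borrow the contents as bytes, without the trailing NUL.
    fn as_bytes_no_nul(&self) -> &[u8] {
        // `len()` also performs the null check.
        let len = self.len();
        // SAFETY: `buf` points at `len` initialized bytes that outlive `self`.
        // The pointer and length are passed by position to a typed function,
        // so swapping them is a compile error rather than silent corruption.
        unsafe { slice::from_raw_parts(self.buf as *const u8, len) }
    }
}
```

Unlike transmuting a `(ptr, len)` tuple, nothing here depends on the in-memory layout of a slice.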
os::args() was using str::raw::from_c_str(), which would assert if the C-string wasn't valid UTF-8. Switch to using from_utf8_lossy() instead, and add a separate function os::args_as_bytes() that returns the ~[u8] byte-vectors instead.
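A minimal sketch of the lossy behaviour described above, written against current Rust rather than the 2014 API: `String::from_utf8_lossy` is the modern counterpart of `str::from_utf8_lossy()`. The helper names `args_lossy` and `args_strict` are hypothetical and exist only for this illustration.

```rust
use std::string::FromUtf8Error;

/// Hypothetical helper: decode raw argument bytes lossily.
/// Invalid UTF-8 sequences become U+FFFD instead of aborting.
fn args_lossy(raw: &[Vec<u8>]) -> Vec<String> {
    raw.iter()
        .map(|bytes| String::from_utf8_lossy(bytes).into_owned())
        .collect()
}

/// Hypothetical strict counterpart, for contrast: the first invalid
/// argument makes the whole conversion fail.
fn args_strict(raw: &[Vec<u8>]) -> Result<Vec<String>, FromUtf8Error> {
    raw.iter()
        .map(|bytes| String::from_utf8(bytes.clone()))
        .collect()
}
```

With an argument such as `b"bad\xFFarg"`, the strict version returns an error, while the lossy version yields `"bad\u{FFFD}arg"` and leaves the other arguments untouched. Callers that need the exact bytes would use the byte-vector variant instead.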
Parse the environment by default with from_utf8_lossy. Also provide byte-vector equivalents (e.g. os::env_as_bytes()). Unfortunately, setenv() can't have a byte-vector equivalent because of Windows support, unless we want to define a setenv_bytes() that fails under Windows for non-UTF8 (or non-UTF16).
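The relationship between the byte-vector and string views of the environment can be sketched as follows, again in current Rust. It assumes each raw entry has the form `KEY=VALUE` and is split at the first `=`. That splitting rule and the helper names are assumptions made for this example, not a description of the PR's actual code.

```rust
/// Hypothetical helper: split one raw `KEY=VALUE` entry at the first `=`.
/// An entry without `=` is treated as a key with an empty value.
fn split_env_entry(entry: &[u8]) -> (Vec<u8>, Vec<u8>) {
    match entry.iter().position(|&b| b == b'=') {
        Some(i) => (entry[..i].to_vec(), entry[i + 1..].to_vec()),
        None => (entry.to_vec(), Vec::new()),
    }
}

/// Byte-vector view: keys and values exactly as supplied.
fn env_as_bytes(raw: &[Vec<u8>]) -> Vec<(Vec<u8>, Vec<u8>)> {
    raw.iter().map(|entry| split_env_entry(entry)).collect()
}

/// String view: the same pairs, decoded lossily.
fn env_lossy(raw: &[Vec<u8>]) -> Vec<(String, String)> {
    env_as_bytes(raw)
        .into_iter()
        .map(|(k, v)| {
            (
                String::from_utf8_lossy(&k).into_owned(),
                String::from_utf8_lossy(&v).into_owned(),
            )
        })
        .collect()
}
```

Decoding after the split means an invalid byte in a value cannot affect where the key ends.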
New version pushed.
Generally I like this approach. Why […]
The character […]
Change `os::args()` and `os::env()` to use `str::from_utf8_lossy()`. Add new functions `os::args_as_bytes()` and `os::env_as_bytes()` to retrieve the args/env as byte vectors instead. The existing methods were left returning strings because I expect that the common use-case is to want string handling. Fixes #7188.
Minor refactor format-args
* Move all linting logic into a single format implementations struct
This should help with the future format-args improvements.
**NOTE TO REVIEWERS**: use "hide whitespace" in the GitHub diff -- most of the code has shifted, but a relatively low number of lines were actually modified.
Following up from rust-lang#12274
r? `@xFrednet`
---
changelog: none