OsString::from_wide (in Windows OsStringExt) is unsound #72760
Comments
I think this bug is not in the contract of |
I would classify the priority of this bug as low. Although this call to
rust/src/librustc_middle/ty/layout.rs Lines 505 to 508 in 0e9e408
|
By the way, the docs at https://doc.rust-lang.org/std/char/fn.from_u32_unchecked.html#safety feel insufficient:
Which values are invalid? Similarly, https://doc.rust-lang.org/std/char/index.html and https://doc.rust-lang.org/std/primitive.char.html say that Is this specified elsewhere? What is the closest we have to a written down normative resource that specifies what is or isn’t UB in the Rust language? (For APIs I assume this is the responsibility of their respective doc-comments.) |
I agree the impact is low, but it's not just Miri -- this blocks #72683, which is how I discovered the problem.
That would be https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html |
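To make the "which values are invalid?" question concrete, here is a minimal sketch of the checked side of the API. It assumes only that `char` holds Unicode scalar values (everything up to 0x10FFFF except the surrogate range 0xD800..=0xDFFF); the helper name `is_scalar_value` is made up for illustration and is not part of std.

```rust
/// Hypothetical helper (not a std API): true if `n` is a Unicode scalar
/// value, i.e. a value that is valid for `char`.
fn is_scalar_value(n: u32) -> bool {
    matches!(n, 0..=0xD7FF | 0xE000..=0x10FFFF)
}

fn main() {
    // The checked constructor rejects surrogates and values above 0x10FFFF.
    // Passing any of the rejected values to `from_u32_unchecked` would
    // violate its safety contract.
    for &n in &[0x41u32, 0xD7FF, 0xD800, 0xD83D, 0xDFFF, 0xE000, 0x10FFFF, 0x110000] {
        assert_eq!(std::char::from_u32(n).is_some(), is_scalar_value(n));
    }
}
```

Under that assumption, a lone surrogate such as 0xD83D is exactly the kind of value the unchecked constructor must never be given.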
But then what is "unchecked" about it? Elsewhere in that file the code is careful not to construct a rust/src/libstd/sys_common/wtf8.rs Lines 92 to 97 in 0e9e408
That made me assume the author was aware that surrogates in |
The
Ok I looked into it, there are three different people involved.
The original code in 2014 was YOLO
Later in 2014, SimonSapin/rust-wtf8@8a42f9e moved to duplicating logic instead.
In 2015, PR #21488 which first imported WTF-8 support in libstd deduplicated that logic by adding a
In 2016, PR #32204 changed the signature of |
Okay, so the fix would be to revert the part of #32204 that removed |
That or copy |
Even with that done, there's still UB in another testcase:

    use std::os::windows::ffi::{OsStrExt, OsStringExt};
    use std::ffi::{OsStr, OsString};

    fn main() {
        let mut base: Vec<u16> = OsStr::new("aé ").encode_wide().collect();
        base.push(0xD83D);
        let mut _res: Vec<u16> = OsString::from_wide(&base).encode_wide().collect();
    }

This is converting something to a |
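The test case above only builds on Windows. As a rough, portable way to see why its input is problematic, the sketch below runs what should be the same UTF-16 code units through `std::char::decode_utf16` and collects the unpaired surrogates. The helper name `unpaired_surrogates` is hypothetical, and using `str::encode_utf16` as a stand-in for `encode_wide` is an assumption that holds for valid Unicode input.

```rust
use std::char::decode_utf16;

/// Hypothetical helper: collect the unpaired surrogates found in `units`.
fn unpaired_surrogates(units: &[u16]) -> Vec<u16> {
    decode_utf16(units.iter().copied())
        // Each `Err` carries the code unit that could not be paired.
        .filter_map(|r| r.err().map(|e| e.unpaired_surrogate()))
        .collect()
}

fn main() {
    // "aé " followed by a lone lead surrogate, as in the test case.
    let mut base: Vec<u16> = "aé ".encode_utf16().collect();
    base.push(0xD83D);
    assert_eq!(unpaired_surrogates(&base), vec![0xD83D]);
}
```

The trailing 0xD83D has no trail surrogate after it, so it cannot be turned into a `char` and has to be carried through as a bare code point.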
…imulacrum from_u32_unchecked: check validity, and fix UB in Wtf8 Fixes rust-lang#72760
test WTF8 encoding corner cases This adds a Miri-side test for rust-lang/rust#72760. Blocked on rust-lang/rust#72683.
The following program causes UB:
Miri says:
The problem is this code:
rust/src/libstd/sys_common/wtf8.rs
Lines 293 to 305 in 96dd469
This calls push_code_point_unchecked unless the new code point is in 0xDC00..=0xDFFF, but what about surrogates in 0xD800..0xDC00?

This code is unchanged since its introduction in c5369eb. I am not sure what the intended safety contract of push_code_point_unchecked is. That method is not marked unsafe but clearly should be -- it calls char::from_u32_unchecked. So my guess is the safety precondition is that CodePoint must not be part of a surrogate pair, but the thing is, push calls it without actually ensuring that condition. The condition it does ensure is that the codepoint is not in 0xDC00..=0xDFFF, but that does not help.
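The gap in the range check can be illustrated with a small standalone sketch. This is not the libstd code: `is_trail_surrogate`, `is_surrogate`, and `encode_three_bytes` are hypothetical helpers, and the last one only shows, under the assumption that WTF-8 stores a surrogate as an ordinary three-byte sequence, how a code point can be encoded from its `u32` value without ever constructing a `char`.

```rust
/// The condition the issue says is checked: trail surrogates only.
fn is_trail_surrogate(cp: u32) -> bool {
    (0xDC00..=0xDFFF).contains(&cp)
}

/// Hypothetical stricter check covering the whole surrogate range.
fn is_surrogate(cp: u32) -> bool {
    (0xD800..=0xDFFF).contains(&cp)
}

/// Hypothetical encoder for code points in 0x800..=0xFFFF. It works on the
/// raw `u32`, so surrogates never have to become a `char`.
fn encode_three_bytes(cp: u32) -> [u8; 3] {
    assert!((0x800..=0xFFFF).contains(&cp));
    [
        0xE0 | (cp >> 12) as u8,
        0x80 | ((cp >> 6) & 0x3F) as u8,
        0x80 | (cp & 0x3F) as u8,
    ]
}

fn main() {
    let lead: u32 = 0xD83D;
    // A lead surrogate slips past the trail-only check...
    assert!(!is_trail_surrogate(lead));
    // ...even though it is a surrogate and not a valid `char`.
    assert!(is_surrogate(lead));
    assert!(std::char::from_u32(lead).is_none());
    // Encoding from the raw value avoids the invalid `char` entirely.
    assert_eq!(encode_three_bytes(lead), [0xED, 0xA0, 0xBD]);
}
```

For non-surrogate input the sketch agrees with ordinary UTF-8, for example U+20AC encodes to the same three bytes as "€".as_bytes().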