remove language-level UB for non-UTF-8 str #792
Thanks!
However, I don't believe I can actually use @rfcbot in this repo,
so I think you will need to create an issue on rust-lang/rust
perhaps with most of the PR description pasted there so that I can initiate FCP.
Here are also some textual nits according to the style guide we recently merged.
f40dd70 to e2fceb4
Centril left a comment
(Approving the text itself to be merged once FCP itself is done and stuff.)
Opened rust-lang/rust#71033.
FCP passed. So can we land this?
AIUI,
strings do not have to be valid UTF-8 any more Cc rust-lang/reference#792 r? @oli-obk
clarify that the str invariant is a safety, not validity, invariant Updates these docs to match rust-lang/reference#792
Rollup merge of rust-lang#117534 - RalfJung:str, r=Mark-Simulacrum clarify that the str invariant is a safety, not validity, invariant Updates these docs to match rust-lang/reference#792
Ever since Rust 1.0, the reference said that a non-UTF-8 `str` causes immediate UB. In terms of today's terminology, that means that `str` has a validity invariant of being valid UTF-8.

However, that seems unnecessary: the compiler does not actually exploit this, nor is there any clear way it could exploit this. Making UTF-8 a library-level safety invariant is more than enough for everything `str` does. Most likely, it was made a validity invariant because we had not yet properly teased apart those two concepts when the document was initially written.

This is also the conclusion that the UCG WG arrived at in rust-lang/unsafe-code-guidelines#78.

I therefore propose we remove the UTF-8 clause from the language spec, so that `str` will have the same validity invariant as `[u8]`. @Centril suggested I open this as a PR here to put this through FCP, so here we go.

Fixes rust-lang/rust#71033
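
For readers less familiar with the distinction, here is a minimal sketch of where the UTF-8 requirement sits once it is a library-level safety invariant. The helper names (`checked_view`, `unchecked_view`, `demo`) are hypothetical and exist only for illustration; the standard-library calls are `std::str::from_utf8` and `std::str::from_utf8_unchecked`. The sketch deliberately never builds a non-UTF-8 `str`, since doing so would still violate the documented safety contract of `from_utf8_unchecked`.

```rust
use std::mem::size_of;
use std::str;

/// Safe path: the library checks the UTF-8 safety invariant for us.
/// (Hypothetical helper name, for illustration only.)
pub fn checked_view(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

/// Unsafe path: the caller promises the bytes are valid UTF-8.
/// (Hypothetical helper name, for illustration only.)
///
/// # Safety
/// `bytes` must be valid UTF-8. This is a library-level (safety)
/// requirement: `str` methods may misbehave if it is violated.
pub unsafe fn unchecked_view(bytes: &[u8]) -> &str {
    unsafe { str::from_utf8_unchecked(bytes) }
}

/// Small usage demo; only valid UTF-8 is ever passed to the unsafe path.
pub fn demo() {
    let valid: &[u8] = b"hello";
    let invalid: &[u8] = &[0xFF, 0xFE]; // 0xFF never occurs in valid UTF-8

    // The checked constructor upholds the invariant on behalf of safe code.
    assert_eq!(checked_view(valid), Some("hello"));
    assert_eq!(checked_view(invalid), None);

    // The unchecked constructor is fine when the promise is actually kept.
    let s = unsafe { unchecked_view(valid) };
    assert_eq!(s.as_bytes(), valid);

    // `&str` and `&[u8]` are both (pointer, length) pairs of the same size.
    assert_eq!(size_of::<&str>(), size_of::<&[u8]>());
}
```

Safe code can still assume every `str` it receives is valid UTF-8; the obligation to uphold that falls on whoever writes the `unsafe` block.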