Formatting/debug of OS strings #22766
Comments
Also, |
Yes, I think this was just an oversight in the initial implementation.
Yes.
They're subtly different in that |
Ok. Then |
Is all that is required to close this issue just changing rust/src/libstd/sys/common/wtf8.rs (lines 388 to 419 in 64a8ffe) to
`fmt::Debug::fmt(&self.to_string_lossy(), formatter)`?
(I'm tagging as E-easy under that expectation, but it'd be nice to have confirmation.)
I'd personally prefer that the Debug implementation on Windows be lossless and escape lone surrogates. If I do encounter an OsString that has lone surrogates, I'd really like to know what they are, and Debug is the tool I'd reach for first.
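To make the idea concrete, here is a minimal sketch of a lossless `Debug` for potentially ill-formed UTF-16. It is not the WTF-8 code in std: `LosslessWide` is a hypothetical wrapper invented for illustration, and using `char::escape_debug` for the well-formed parts is an assumption about what "consistent escaping" could look like.

```rust
use std::char::decode_utf16;
use std::fmt;

/// Hypothetical wrapper over potentially ill-formed UTF-16 (not a std type).
struct LosslessWide<'a>(&'a [u16]);

impl fmt::Debug for LosslessWide<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("\"")?;
        for unit in decode_utf16(self.0.iter().copied()) {
            match unit {
                // Well-formed scalar values get the usual per-char debug escaping,
                // so control characters and quotes are escaped too.
                Ok(c) => write!(f, "{}", c.escape_debug())?,
                // A lone surrogate is printed as a `\u{xxxx}` escape instead of
                // being replaced with U+FFFD, so no information is dropped.
                Err(e) => write!(f, "\\u{{{:x}}}", e.unpaired_surrogate())?,
            }
        }
        f.write_str("\"")
    }
}
```

With this sketch, formatting `LosslessWide(&[0x0061, 0xD800, 0x0062])` with `{:?}` shows the lone surrogate as `\u{d800}` between `a` and `b`.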
So perhaps we should take the existing code that escapes the surrogates and wrap that in another call to |
In my opinion:
I would be prepared to consider a PR with these changes, and I think we can go from there. |
Add lossless debug implementation for unix OsStrs

Fixes #22766

Invalid UTF-8 byte sequences are replaced with `\xFF` style escape codes, while valid UTF-8 goes through the normal `Debug` implementation.

This is necessarily different from the Windows Debug implementation, which uses `\u{xxxx}` style escape sequences for unpaired surrogates, but both implementations are consistent in that they are both lossless, and display invalid sequences in the way most similar to existing language syntax.

r? @dtolnay
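The approach described in that commit message could look roughly like the sketch below. This is not the code from the PR: `LosslessBytes` and `write_escaped` are hypothetical names, and per-character `char::escape_debug` is used as an approximation of the normal `Debug` output for `str`.

```rust
use std::fmt;
use std::str;

/// Hypothetical wrapper over arbitrary bytes (not the type used in std).
struct LosslessBytes<'a>(&'a [u8]);

/// Writes valid UTF-8 with per-char debug escaping (approximates `Debug` for `str`).
fn write_escaped(f: &mut fmt::Formatter<'_>, s: &str) -> fmt::Result {
    for c in s.chars() {
        write!(f, "{}", c.escape_debug())?;
    }
    Ok(())
}

impl fmt::Debug for LosslessBytes<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut rest = self.0;
        f.write_str("\"")?;
        while !rest.is_empty() {
            match str::from_utf8(rest) {
                Ok(valid) => {
                    write_escaped(f, valid)?;
                    break;
                }
                Err(err) => {
                    // Everything before `valid_up_to()` is guaranteed valid UTF-8.
                    let (valid, after) = rest.split_at(err.valid_up_to());
                    write_escaped(f, str::from_utf8(valid).unwrap())?;
                    // `error_len()` is `None` when the input ends mid-sequence;
                    // in that case every remaining byte is escaped.
                    let bad_len = err.error_len().unwrap_or(after.len());
                    for byte in &after[..bad_len] {
                        write!(f, "\\x{:02X}", byte)?;
                    }
                    rest = &after[bad_len..];
                }
            }
        }
        f.write_str("\"")
    }
}
```

Because `escape_debug` also escapes a literal backslash, the four ASCII characters `\xFF` in valid input print as `\\xFF` and cannot be confused with an escaped `0xFF` byte.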
Does the Windows implementation currently print lone surrogates? #46798 only changed how |
@retep998 The Windows implementation has always printed lone surrogates.
`OsStr` implements `fmt::Debug`:

- On Unix, with `to_lossy_str`, then `fmt::Debug::fmt` on the resulting `String`, which escapes all control or non-ASCII characters.
- On Windows, with the `impl` I originally wrote for rust-wtf8, which is lossless but only escapes surrogate code points. In particular, it happily prints control characters.

So there are two issues: they should be consistent across platforms, and is it OK for a `Debug` impl to be lossy?

CC @aturon, @alexcrichton
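As a small illustration of what "lossy" means here, the sketch below formats raw bytes by first decoding them with replacement characters. `lossy_debug` is a hypothetical helper, and `String::from_utf8_lossy` stands in for whatever lossy conversion the platform code performs.

```rust
/// Hypothetical helper: decode with U+FFFD replacement, then reuse `Debug` for `str`.
fn lossy_debug(bytes: &[u8]) -> String {
    format!("{:?}", String::from_utf8_lossy(bytes))
}
```

Under this scheme, `lossy_debug(b"a\xFF")` and `lossy_debug(b"a\xFE")` produce the same text, so the debug output alone cannot tell you which invalid byte was present.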