Add write_utf8 to io::Write #282

Closed
ChrisDenton opened this issue Oct 18, 2023 · 3 comments
Labels
api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api

Comments

@ChrisDenton
Member

ChrisDenton commented Oct 18, 2023

Proposal

Problem statement

When writing through io::Write we need to convert to bytes, thus losing encoding information. When writing to the Windows console, we recover this information by doing a UTF-8 check on the bytes. This should be redundant in many cases.
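To make the redundancy concrete, here is a minimal sketch of a text-only sink that receives bytes through io::Write and so has to re-run UTF-8 validation, even when the caller started from a &str. ConsoleLike and its fields are hypothetical names for illustration. This is not the standard library's Windows console code, which handles more cases (for example, sequences split across writes).

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for a text-only sink such as a console.
/// It can only accept valid UTF-8, so every byte buffer must be checked.
struct ConsoleLike {
    text: String,
    validations: usize,
}

impl Write for ConsoleLike {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // The caller may have started with a `&str`, but `as_bytes()`
        // erased that fact, so the check has to be repeated here.
        self.validations += 1;
        let s = std::str::from_utf8(buf)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))?;
        self.text.push_str(s);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Every call such as out.write_all("foo".as_bytes()) pays for one validation pass in this sketch, although the compiler already knew the input was a str.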

Requiring bytes also means that users often have to pepper their code with as_bytes() when writing strings, unless they use write_fmt. Admittedly, this is more of a minor annoyance than a serious issue.

Motivating examples or use cases

out.write("foo".as_bytes());
out.write("bar".as_bytes());
out.write("baz".as_bytes());
// etc, etc, etc

Solution sketch

Add write_utf8 to io::Write. So named to avoid conflict with any write_str function that may be implemented on a type.

pub trait Write {
    fn write_utf8(&mut self, buf: &str) -> io::Result<usize>;
}

Admittedly, write_utf8 does raise the question of what to do when a partial write ends in the middle of a code point rather than on a boundary. This could be addressed in an implementation-defined manner or just by always using write_all semantics. The only difference between the two is that the first option allows short writes that happen to end on a boundary.
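As a rough sketch of the two options, the code below uses a hypothetical extension trait, WriteUtf8, because a method cannot be added to io::Write from outside std. The default method uses write_all semantics. The helper floor_boundary is also a made-up name. It shows how an implementation that wants short writes could round a byte count down to the nearest code point boundary.

```rust
use std::io::{self, Write};

/// Hypothetical extension trait; the proposal would put the method on
/// `io::Write` itself.
trait WriteUtf8: Write {
    /// `write_all` semantics: the whole string is written or an error is
    /// returned, so a write can never stop inside a code point.
    fn write_utf8(&mut self, buf: &str) -> io::Result<usize> {
        self.write_all(buf.as_bytes())?;
        Ok(buf.len())
    }
}

// Every existing writer gets the default behaviour.
impl<W: Write + ?Sized> WriteUtf8 for W {}

/// For the short-write option: the largest index `<= n` that lies on a
/// code point boundary of `s`. Index 0 is always a boundary, so the loop
/// terminates.
fn floor_boundary(s: &str, mut n: usize) -> usize {
    if n >= s.len() {
        return s.len();
    }
    while !s.is_char_boundary(n) {
        n -= 1;
    }
    n
}
```

With the short-write option, an implementation that can only accept n bytes would write s[..floor_boundary(s, n)] and report that length. If n is smaller than the first code point, this rounds down to zero, which is one of the cases an implementation-defined rule would have to settle.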

Alternatives

Incomplete UTF-8 writes

pub trait Write {
    // [u8] buffer is assumed to be UTF-8.
    // However, it may start with a partial UTF-8 sequence if it completes a previously written incomplete sequence.
    // Otherwise it's an error.
    unsafe fn write_utf8(&mut self, buf: &[u8]) -> io::Result<usize>;
}

This doesn't fully solve the issue (it still needs .as_bytes()!) but it lets the implementation do whatever it likes on the assumption that the bytes really are a str in the process of being written.
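One way to picture the "completes a previously written incomplete sequence" rule is a sink that carries the unfinished tail from one call to the next. The sketch below is a safe approximation, not the proposed API. ChunkedUtf8 and push_bytes are hypothetical names, and it validates the input rather than assuming validity as the unsafe signature above would.

```rust
use std::str;

/// Hypothetical text sink that accepts UTF-8 delivered in arbitrary chunks.
struct ChunkedUtf8 {
    text: String,
    /// Bytes of a sequence that was cut off at the end of the last chunk.
    pending: Vec<u8>,
}

impl ChunkedUtf8 {
    fn new() -> Self {
        ChunkedUtf8 { text: String::new(), pending: Vec::new() }
    }

    /// Appends `buf`, first completing any sequence left over from the
    /// previous call. Invalid bytes are an error; a sequence that merely
    /// ends early is stashed for the next call.
    fn push_bytes(&mut self, buf: &[u8]) -> Result<(), str::Utf8Error> {
        let mut data = std::mem::take(&mut self.pending);
        data.extend_from_slice(buf);
        match str::from_utf8(&data) {
            Ok(s) => self.text.push_str(s),
            // `error_len() == None` means the input ended mid-sequence.
            Err(e) if e.error_len().is_none() => {
                let (valid, rest) = data.split_at(e.valid_up_to());
                self.text
                    .push_str(str::from_utf8(valid).expect("prefix was validated"));
                self.pending = rest.to_vec();
            }
            Err(e) => return Err(e),
        }
        Ok(())
    }
}
```

Feeding the two bytes of "é" in separate calls leaves the first byte in pending and emits the character once the second byte arrives.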


EDIT: Remove references to ascii.

@ChrisDenton ChrisDenton added api-change-proposal A proposal to add or alter unstable APIs in the standard libraries T-libs-api labels Oct 18, 2023
@ChrisDenton
Member Author

After thinking about this some more, I also opened rust-lang/rust#116871 for not erroring if given invalid Unicode (instead it's lossy). I think although the two issues are related, either would be useful whether or not the other is accepted.

@ChrisDenton
Member Author

This was discussed in the libs-api meeting. An important point that was raised is that, to be most useful, the "is valid UTF-8" property would need to be preserved by intermediaries (e.g. buffer types). This means that all existing types (in std and the wider crate ecosystem) would need updating, otherwise it'd be of very limited use. Tbh, I do agree that this is a strong argument against this proposal.
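The intermediary problem can be sketched with a stand-in trait. Everything here is hypothetical: TextWrite mirrors io::Write plus the proposed method, Console is a sink with a fast path for known-valid text, and PassThrough plays the role of an existing wrapper type that was written before write_utf8 existed.

```rust
/// Hypothetical trait mirroring `io::Write` plus the proposed method.
trait TextWrite {
    fn write_bytes(&mut self, buf: &[u8]);

    /// Default: forget the `str` guarantee and go through bytes.
    fn write_utf8(&mut self, s: &str) {
        self.write_bytes(s.as_bytes());
    }
}

/// Sink that can skip validation when it is handed a `&str` directly.
struct Console {
    text: String,
    validations: usize,
}

impl TextWrite for Console {
    fn write_bytes(&mut self, buf: &[u8]) {
        // Bytes of unknown origin: must be checked (lossily here).
        self.validations += 1;
        self.text.push_str(&String::from_utf8_lossy(buf));
    }

    fn write_utf8(&mut self, s: &str) {
        // Fast path: already known to be valid UTF-8.
        self.text.push_str(s);
    }
}

/// Stand-in for an existing wrapper that only forwards bytes.
struct PassThrough<W>(W);

impl<W: TextWrite> TextWrite for PassThrough<W> {
    fn write_bytes(&mut self, buf: &[u8]) {
        self.0.write_bytes(buf);
    }
    // No `write_utf8` override, so the default downgrades to bytes and
    // the inner sink validates again.
}
```

Calling write_utf8 on a Console directly skips validation, while the same call through PassThrough falls back to the byte path. Unless every such wrapper is updated, the property is lost in transit.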

That aside, I can look at how this would affect performance as that would provide an argument for it. Though for the above reason I'm minded to close.

@ChrisDenton ChrisDenton changed the title from "Add write_utf8 and write_ascii to io::Write" to "Add write_utf8 to io::Write" Oct 25, 2023
@ChrisDenton
Member Author

Closing this as per the above. I'm now convinced this is too much churn and complexity for implementers of the Write trait. And when writing a largish buffer (which people who care about perf will do), this doesn't really help, as the time taken to print dominates the performance.

@ChrisDenton ChrisDenton closed this as not planned Oct 25, 2023