Add `write_utf8` to `io::Write` #282

ChrisDenton · 2023-10-18T00:07:23Z

Proposal

Problem statement

When writing through io::Write we need to convert to bytes, thus losing encoding information. When writing to the Windows console, we recover this information by doing a UTF-8 check on the bytes. This should be redundant in many cases.

Requiring bytes also means that users often have to pepper their code with as_bytes() when writing strings, unless they use write_fmt. Though this is admittedly more of a minor annoyance than a serious issue.

Motivating examples or use cases

out.write("foo".as_bytes());
out.write("bar".as_bytes());
out.write("baz".as_bytes());
// etc, etc, etc

Solution sketch

Add write_utf8 to io::Write. So named to avoid conflict with any write_str function that may be implemented on a type.

pub trait Write {
    fn write_utf8(&mut self, buf: &str) -> io::Result<usize>;
}

Admittedly write_utf8 does have the issue of what to do when a partial write falls outside of a code point boundary. This could be addressed in an implementation defined manner or just by always using write_all semantics. The only difference between the two is the first option allows for short writes that happen to fall on a boundary.
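To make the two options concrete, here is a minimal sketch, not a proposed implementation. The extension trait `WriteUtf8` and the helper `floor_to_char_boundary` are hypothetical names used only for illustration. The default method shows `write_all` semantics, and the helper shows how an implementation-defined short write could be rounded down to a code point boundary.

```rust
use std::io::{self, Write};

/// Hypothetical extension trait standing in for the proposed method;
/// it is not part of std.
pub trait WriteUtf8: Write {
    /// `write_all` semantics: either the whole string is written or an
    /// error is returned, so a write can never stop mid code point.
    fn write_utf8(&mut self, buf: &str) -> io::Result<usize> {
        self.write_all(buf.as_bytes())?;
        Ok(buf.len())
    }
}

// Blanket impl so every existing writer gets the default behaviour.
impl<W: Write + ?Sized> WriteUtf8 for W {}

/// Hypothetical helper for the implementation-defined option: round a
/// byte count down to the nearest code point boundary in `s`.
pub fn floor_to_char_boundary(s: &str, n: usize) -> usize {
    let mut n = n.min(s.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(n) {
        n -= 1;
    }
    n
}
```

With the helper, a short write of 2 bytes into "héllo" would be reported as 1 byte, because byte 2 falls inside the two-byte encoding of 'é'.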

Alternatives

Incomplete UTF-8 writes

pub trait Write {
    // [u8] buffer is assumed to be UTF-8.
    // However, it may start with a partial UTF-8 sequence if it completes a previously written incomplete sequence.
    // Otherwise it's an error.
    unsafe fn write_utf8(&mut self, buf: &[u8]) -> io::Result<usize>;
}

This doesn't fully solve the issue (still needs .as_bytes()!) but allows for the implementation to do whatever it likes under the assumption that the bytes really are str that's in the process of being written.
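As a rough illustration of what "incomplete" means here, the sketch below splits a byte buffer into its valid UTF-8 prefix and a trailing incomplete sequence, using `Utf8Error::valid_up_to` and `Utf8Error::error_len` from std. The function name `split_incomplete` is hypothetical, and this only shows how a caller or implementation might tell a truncated sequence from invalid input.

```rust
use std::str;

/// Hypothetical helper: returns the valid UTF-8 prefix of `buf` and any
/// trailing bytes that form an incomplete (but not invalid) sequence.
/// Returns `None` if `buf` contains bytes that can never be valid UTF-8.
pub fn split_incomplete(buf: &[u8]) -> Option<(&str, &[u8])> {
    match str::from_utf8(buf) {
        // Fully valid: no tail left over.
        Ok(s) => Some((s, &buf[buf.len()..])),
        // `error_len() == None` means the input ended in the middle of
        // a sequence, so the tail could be completed by a later write.
        Err(e) if e.error_len().is_none() => {
            let (valid, tail) = buf.split_at(e.valid_up_to());
            Some((str::from_utf8(valid).ok()?, tail))
        }
        // Any other error is an invalid sequence.
        Err(_) => None,
    }
}
```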

EDIT: Remove references to ascii.


ChrisDenton · 2023-10-18T02:45:53Z

After thinking about this some more, I also opened rust-lang/rust#116871 for not erroring if given invalid Unicode (instead it's lossy). I think although the two issues are related, either would be useful whether or not the other is accepted.

ChrisDenton · 2023-10-24T16:17:15Z

This was discussed in the libs-api meeting. An important point that was raised is that, to be most useful the "is valid UTF-8" property would need to be preserved by intermediaries (e.g. buffer types). This means that all existing types (in std and the wider crate ecosystem) would need updating otherwise it'd be of very limited use. Tbh, I do agree that this is a strong argument against this proposal.
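The intermediary problem can be sketched as follows. All names here (`Utf8Write`, `Sink`, `Passthrough`) are hypothetical and stand in for the proposed method, a console-like writer, and an existing wrapper type that has not been updated. Because the wrapper only forwards `write`, the inner writer's string-aware path is never reached.

```rust
use std::io::{self, Write};

/// Hypothetical stand-in for `io::Write` extended with the proposed method.
pub trait Utf8Write: Write {
    /// Default: forget that the input was a `str` and forward the bytes.
    fn write_utf8(&mut self, buf: &str) -> io::Result<usize> {
        self.write(buf.as_bytes())
    }
}

/// Hypothetical sink that counts which path each write took.
#[derive(Default)]
pub struct Sink {
    pub str_writes: usize,
    pub byte_writes: usize,
}

impl Write for Sink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // A real console writer would have to re-validate UTF-8 here.
        self.byte_writes += 1;
        Ok(buf.len())
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl Utf8Write for Sink {
    fn write_utf8(&mut self, buf: &str) -> io::Result<usize> {
        // Known-valid UTF-8: the check can be skipped.
        self.str_writes += 1;
        Ok(buf.len())
    }
}

/// Hypothetical wrapper written before `write_utf8` existed.
pub struct Passthrough<W>(pub W);

impl<W: Write> Write for Passthrough<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.0.write(buf)
    }
    fn flush(&mut self) -> io::Result<()> {
        self.0.flush()
    }
}

// Not updated: it inherits the default, which degrades to plain bytes.
impl<W: Write> Utf8Write for Passthrough<W> {}
```

Calling `write_utf8` on a `Sink` directly takes the string path, while calling it on `Passthrough(Sink::default())` ends up in `Sink::write`. So every wrapper would need its own override for the information to survive.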

That aside, I can look at how this would affect performance as that would provide an argument for it. Though for the above reason I'm minded to close.

ChrisDenton · 2023-10-25T05:11:00Z

Closing this as per the above. I'm now convinced this is too much churn and complexity for implementers of the Write trait. And when writing a largish buffer (which people who care about perf will do), this doesn't really help, as the time taken to print dominates the performance.

ChrisDenton added the api-change-proposal and T-libs-api labels Oct 18, 2023

ChrisDenton changed the title ~~Add write_utf8 and write_ascii to io::Write~~ Add write_utf8 to io::Write Oct 25, 2023

ChrisDenton closed this as not planned Oct 25, 2023
