Skip to content

Commit

Permalink
Auto merge of #87869 - thomcc:skinny-io-error, r=yaahc
Browse files Browse the repository at this point in the history
Make io::Error use 64 bits on targets with 64 bit pointers.

I've wanted this for a long time, but didn't see a good way to do it without having extra allocation. When looking at it yesterday, it was more clear what to do for some reason.

This approach avoids any additional allocations, and reduces the size by half (8 bytes, down from 16). AFAICT it doesn't come additional runtime cost, and the compiler seems to do a better job with code using it.

Additionally, this `io::Error` has a niche (still), so `io::Result<()>` is *also* 64 bits (8 bytes, down from 16), and `io::Result<usize>` (used for lots of io trait functions) is 2x64 bits (16 bytes, down from 24 — this means on x86_64 it can use the nice rax/rdx 2-reg struct return). More generally, it shaves a whole 64 bit integer register off of the size of basically any `io::Result<()>`.

(For clarity: Improving `io::Result` (rather than io::Error) was most of the motivation for this)

On 32 bit (or other non-64bit) targets we still use something equivalent the old repr — I don't think think there's improving it, since one of the fields it stores is a `i32`, so we can't get below that, and it's already about as close as we can get to it.

---

### Isn't Pointer Tagging Dodgy?

The details of the layout, and why its implemented the way it is, are explained in the header comment of library/std/src/io/error/repr_bitpacked.rs. There's probably more details than there need to be, but I didn't trim it down that much, since there's a lot of stuff I did deliberately, that might have not seemed that way.

There's actually only one variant holding a pointer which gets tagged. This one is the (holder for the) user-provided error.

I believe the scheme used to tag it is not UB, and that it preserves pointer provenance (even though often pointer tagging does not) because the tagging operation is just `core::ptr::add`, and untagging is `core::ptr::sub`. The result of both operations lands inside the original allocation, so it would follow the safety contract of `core::ptr::{add,sub}`.

The other pointer this had to encode is not tagged — or rather, the tagged repr is equivalent to untagged (it's tagged with 0b00, and has >=4b alignment, so we can reuse the bottom bits). And the other variants we encode are just integers, which (which can be untagged using bitwise operations without worry — they're integers).

CC `@RalfJung` for the stuff in repr_bitpacked.rs, as my comments are informed by a lot of the UCG work, but it's possible I missed something or got it wrong (even if the implementation is okay, there are parts of the header comment that says things like "We can't do $x" which could be false).

---

### Why So Many Changes?

The repr change was mostly internal, but changed one widely used API: I had to switch how `io::Error::new_const` works.

This required switching `io::Error::new_const` to take the full message data (including the kind) as a `&'static`, rather than just the string. This would have been really tedious, but I made a macro that made it much simpler, but it was a wide change since `io::Error::new_const` is used everywhere.

This included changing files for a lot of targets I don't have easy access to (SGX? Haiku? Windows? Who has heard of these things), so I expect there to be spottiness in CI initially, unless luck is on my side.

Anyway this large only tangentially-related change is all in the first commit (although that commit also pulls the previous repr out into its own file), whereas the packing stuff is all in commit 2.

---

P.S. I haven't looked at all of this since writing it, and will do a pass over it again later, sorry for any obvious typos or w/e. I also definitely repeat myself in comments and such.

(It probably could use more tests too. I did some basic testing, and made it so we `debug_assert!` in cases the decode isn't what we encoded, but I don't know the degree which I can assume libstd's testing of IO would exercise this. That is: it wouldn't be surprising to me if libstds IO testing were minimal, especially around error cases, although I have no idea).
  • Loading branch information
bors committed Feb 7, 2022
2 parents f52c318 + 9cbe994 commit 734368a
Show file tree
Hide file tree
Showing 50 changed files with 837 additions and 269 deletions.
2 changes: 1 addition & 1 deletion library/std/src/ffi/c_str.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1077,7 +1077,7 @@ impl fmt::Display for NulError {
impl From<NulError> for io::Error {
/// Converts a [`NulError`] into a [`io::Error`].
fn from(_: NulError) -> io::Error {
io::Error::new_const(io::ErrorKind::InvalidInput, &"data provided contains a nul byte")
io::const_io_error!(io::ErrorKind::InvalidInput, "data provided contains a nul byte")
}
}

Expand Down
4 changes: 2 additions & 2 deletions library/std/src/fs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2263,9 +2263,9 @@ impl DirBuilder {
match path.parent() {
Some(p) => self.create_dir_all(p)?,
None => {
return Err(io::Error::new_const(
return Err(io::const_io_error!(
io::ErrorKind::Uncategorized,
&"failed to create whole tree",
"failed to create whole tree",
));
}
}
Expand Down
4 changes: 2 additions & 2 deletions library/std/src/io/buffered/bufreader.rs
Original file line number Diff line number Diff line change
Expand Up @@ -357,9 +357,9 @@ impl<R: Read> Read for BufReader<R> {
let mut bytes = Vec::new();
self.read_to_end(&mut bytes)?;
let string = crate::str::from_utf8(&bytes).map_err(|_| {
io::Error::new_const(
io::const_io_error!(
io::ErrorKind::InvalidData,
&"stream did not contain valid UTF-8",
"stream did not contain valid UTF-8",
)
})?;
*buf += string;
Expand Down
6 changes: 3 additions & 3 deletions library/std/src/io/buffered/bufwriter.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
use crate::error;
use crate::fmt;
use crate::io::{
self, Error, ErrorKind, IntoInnerError, IoSlice, Seek, SeekFrom, Write, DEFAULT_BUF_SIZE,
self, ErrorKind, IntoInnerError, IoSlice, Seek, SeekFrom, Write, DEFAULT_BUF_SIZE,
};
use crate::mem;
use crate::ptr;
Expand Down Expand Up @@ -168,9 +168,9 @@ impl<W: Write> BufWriter<W> {

match r {
Ok(0) => {
return Err(Error::new_const(
return Err(io::const_io_error!(
ErrorKind::WriteZero,
&"failed to write the buffered data",
"failed to write the buffered data",
));
}
Ok(n) => guard.consume(n),
Expand Down
10 changes: 5 additions & 5 deletions library/std/src/io/cursor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ mod tests;
use crate::io::prelude::*;

use crate::cmp;
use crate::io::{self, Error, ErrorKind, IoSlice, IoSliceMut, ReadBuf, SeekFrom};
use crate::io::{self, ErrorKind, IoSlice, IoSliceMut, ReadBuf, SeekFrom};

use core::convert::TryInto;

Expand Down Expand Up @@ -297,9 +297,9 @@ where
self.pos = n;
Ok(self.pos)
}
None => Err(Error::new_const(
None => Err(io::const_io_error!(
ErrorKind::InvalidInput,
&"invalid seek to a negative or overflowing position",
"invalid seek to a negative or overflowing position",
)),
}
}
Expand Down Expand Up @@ -400,9 +400,9 @@ fn slice_write_vectored(
// Resizing write implementation
fn vec_write(pos_mut: &mut u64, vec: &mut Vec<u8>, buf: &[u8]) -> io::Result<usize> {
let pos: usize = (*pos_mut).try_into().map_err(|_| {
Error::new_const(
io::const_io_error!(
ErrorKind::InvalidInput,
&"cursor position exceeds maximum possible vector length",
"cursor position exceeds maximum possible vector length",
)
})?;
// Make sure the internal buffer is as least as big as where we
Expand Down
187 changes: 122 additions & 65 deletions library/std/src/io/error.rs
Original file line number Diff line number Diff line change
@@ -1,6 +1,16 @@
#[cfg(test)]
mod tests;

#[cfg(target_pointer_width = "64")]
mod repr_bitpacked;
#[cfg(target_pointer_width = "64")]
use repr_bitpacked::Repr;

#[cfg(not(target_pointer_width = "64"))]
mod repr_unpacked;
#[cfg(not(target_pointer_width = "64"))]
use repr_unpacked::Repr;

use crate::convert::From;
use crate::error;
use crate::fmt;
Expand Down Expand Up @@ -66,15 +76,58 @@ impl fmt::Debug for Error {
}
}

enum Repr {
// Only derive debug in tests, to make sure it
// doesn't accidentally get printed.
#[cfg_attr(test, derive(Debug))]
enum ErrorData<C> {
Os(i32),
Simple(ErrorKind),
// &str is a fat pointer, but &&str is a thin pointer.
SimpleMessage(ErrorKind, &'static &'static str),
Custom(Box<Custom>),
SimpleMessage(&'static SimpleMessage),
Custom(C),
}

// `#[repr(align(4))]` is probably redundant, it should have that value or
// higher already. We include it just because repr_bitpacked.rs's encoding
// requires an alignment >= 4 (note that `#[repr(align)]` will not reduce the
// alignment required by the struct, only increase it).
//
// If we add more variants to ErrorData, this can be increased to 8, but it
// should probably be behind `#[cfg_attr(target_pointer_width = "64", ...)]` or
// whatever cfg we're using to enable the `repr_bitpacked` code, since only the
// that version needs the alignment, and 8 is higher than the alignment we'll
// have on 32 bit platforms.
//
// (For the sake of being explicit: the alignment requirement here only matters
// if `error/repr_bitpacked.rs` is in use — for the unpacked repr it doesn't
// matter at all)
#[repr(align(4))]
#[derive(Debug)]
pub(crate) struct SimpleMessage {
kind: ErrorKind,
message: &'static str,
}

impl SimpleMessage {
pub(crate) const fn new(kind: ErrorKind, message: &'static str) -> Self {
Self { kind, message }
}
}

/// Create and return an `io::Error` for a given `ErrorKind` and constant
/// message. This doesn't allocate.
pub(crate) macro const_io_error($kind:expr, $message:expr $(,)?) {
$crate::io::error::Error::from_static_message({
const MESSAGE_DATA: $crate::io::error::SimpleMessage =
$crate::io::error::SimpleMessage::new($kind, $message);
&MESSAGE_DATA
})
}

// As with `SimpleMessage`: `#[repr(align(4))]` here is just because
// repr_bitpacked's encoding requires it. In practice it almost certainly be
// already be this high or higher.
#[derive(Debug)]
#[repr(align(4))]
struct Custom {
kind: ErrorKind,
error: Box<dyn error::Error + Send + Sync>,
Expand Down Expand Up @@ -396,7 +449,7 @@ impl From<ErrorKind> for Error {
/// ```
#[inline]
fn from(kind: ErrorKind) -> Error {
Error { repr: Repr::Simple(kind) }
Error { repr: Repr::new_simple(kind) }
}
}

Expand Down Expand Up @@ -461,20 +514,22 @@ impl Error {
}

fn _new(kind: ErrorKind, error: Box<dyn error::Error + Send + Sync>) -> Error {
Error { repr: Repr::Custom(Box::new(Custom { kind, error })) }
Error { repr: Repr::new_custom(Box::new(Custom { kind, error })) }
}

/// Creates a new I/O error from a known kind of error as well as a
/// constant message.
/// Creates a new I/O error from a known kind of error as well as a constant
/// message.
///
/// This function does not allocate.
///
/// This function should maybe change to
/// `new_const<const MSG: &'static str>(kind: ErrorKind)`
/// in the future, when const generics allow that.
/// You should not use this directly, and instead use the `const_io_error!`
/// macro: `io::const_io_error!(ErrorKind::Something, "some_message")`.
///
/// This function should maybe change to `from_static_message<const MSG: &'static
/// str>(kind: ErrorKind)` in the future, when const generics allow that.
#[inline]
pub(crate) const fn new_const(kind: ErrorKind, message: &'static &'static str) -> Error {
Self { repr: Repr::SimpleMessage(kind, message) }
pub(crate) const fn from_static_message(msg: &'static SimpleMessage) -> Error {
Self { repr: Repr::new_simple_message(msg) }
}

/// Returns an error representing the last OS error which occurred.
Expand Down Expand Up @@ -532,7 +587,7 @@ impl Error {
#[must_use]
#[inline]
pub fn from_raw_os_error(code: i32) -> Error {
Error { repr: Repr::Os(code) }
Error { repr: Repr::new_os(code) }
}

/// Returns the OS error that this error represents (if any).
Expand Down Expand Up @@ -568,11 +623,11 @@ impl Error {
#[must_use]
#[inline]
pub fn raw_os_error(&self) -> Option<i32> {
match self.repr {
Repr::Os(i) => Some(i),
Repr::Custom(..) => None,
Repr::Simple(..) => None,
Repr::SimpleMessage(..) => None,
match self.repr.data() {
ErrorData::Os(i) => Some(i),
ErrorData::Custom(..) => None,
ErrorData::Simple(..) => None,
ErrorData::SimpleMessage(..) => None,
}
}

Expand Down Expand Up @@ -607,11 +662,11 @@ impl Error {
#[must_use]
#[inline]
pub fn get_ref(&self) -> Option<&(dyn error::Error + Send + Sync + 'static)> {
match self.repr {
Repr::Os(..) => None,
Repr::Simple(..) => None,
Repr::SimpleMessage(..) => None,
Repr::Custom(ref c) => Some(&*c.error),
match self.repr.data() {
ErrorData::Os(..) => None,
ErrorData::Simple(..) => None,
ErrorData::SimpleMessage(..) => None,
ErrorData::Custom(c) => Some(&*c.error),
}
}

Expand Down Expand Up @@ -681,11 +736,11 @@ impl Error {
#[must_use]
#[inline]
pub fn get_mut(&mut self) -> Option<&mut (dyn error::Error + Send + Sync + 'static)> {
match self.repr {
Repr::Os(..) => None,
Repr::Simple(..) => None,
Repr::SimpleMessage(..) => None,
Repr::Custom(ref mut c) => Some(&mut *c.error),
match self.repr.data_mut() {
ErrorData::Os(..) => None,
ErrorData::Simple(..) => None,
ErrorData::SimpleMessage(..) => None,
ErrorData::Custom(c) => Some(&mut *c.error),
}
}

Expand Down Expand Up @@ -720,11 +775,11 @@ impl Error {
#[must_use = "`self` will be dropped if the result is not used"]
#[inline]
pub fn into_inner(self) -> Option<Box<dyn error::Error + Send + Sync>> {
match self.repr {
Repr::Os(..) => None,
Repr::Simple(..) => None,
Repr::SimpleMessage(..) => None,
Repr::Custom(c) => Some(c.error),
match self.repr.into_data() {
ErrorData::Os(..) => None,
ErrorData::Simple(..) => None,
ErrorData::SimpleMessage(..) => None,
ErrorData::Custom(c) => Some(c.error),
}
}

Expand All @@ -750,44 +805,46 @@ impl Error {
#[must_use]
#[inline]
pub fn kind(&self) -> ErrorKind {
match self.repr {
Repr::Os(code) => sys::decode_error_kind(code),
Repr::Custom(ref c) => c.kind,
Repr::Simple(kind) => kind,
Repr::SimpleMessage(kind, _) => kind,
match self.repr.data() {
ErrorData::Os(code) => sys::decode_error_kind(code),
ErrorData::Custom(c) => c.kind,
ErrorData::Simple(kind) => kind,
ErrorData::SimpleMessage(m) => m.kind,
}
}
}

impl fmt::Debug for Repr {
fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
match *self {
Repr::Os(code) => fmt
match self.data() {
ErrorData::Os(code) => fmt
.debug_struct("Os")
.field("code", &code)
.field("kind", &sys::decode_error_kind(code))
.field("message", &sys::os::error_string(code))
.finish(),
Repr::Custom(ref c) => fmt::Debug::fmt(&c, fmt),
Repr::Simple(kind) => fmt.debug_tuple("Kind").field(&kind).finish(),
Repr::SimpleMessage(kind, &message) => {
fmt.debug_struct("Error").field("kind", &kind).field("message", &message).finish()
}
ErrorData::Custom(c) => fmt::Debug::fmt(&c, fmt),
ErrorData::Simple(kind) => fmt.debug_tuple("Kind").field(&kind).finish(),
ErrorData::SimpleMessage(msg) => fmt
.debug_struct("Error")
.field("kind", &msg.kind)
.field("message", &msg.message)
.finish(),
}
}
}

#[stable(feature = "rust1", since = "1.0.0")]
impl fmt::Display for Error {
fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
match self.repr {
Repr::Os(code) => {
match self.repr.data() {
ErrorData::Os(code) => {
let detail = sys::os::error_string(code);
write!(fmt, "{} (os error {})", detail, code)
}
Repr::Custom(ref c) => c.error.fmt(fmt),
Repr::Simple(kind) => write!(fmt, "{}", kind.as_str()),
Repr::SimpleMessage(_, &msg) => msg.fmt(fmt),
ErrorData::Custom(ref c) => c.error.fmt(fmt),
ErrorData::Simple(kind) => write!(fmt, "{}", kind.as_str()),
ErrorData::SimpleMessage(msg) => msg.message.fmt(fmt),
}
}
}
Expand All @@ -796,29 +853,29 @@ impl fmt::Display for Error {
impl error::Error for Error {
#[allow(deprecated, deprecated_in_future)]
fn description(&self) -> &str {
match self.repr {
Repr::Os(..) | Repr::Simple(..) => self.kind().as_str(),
Repr::SimpleMessage(_, &msg) => msg,
Repr::Custom(ref c) => c.error.description(),
match self.repr.data() {
ErrorData::Os(..) | ErrorData::Simple(..) => self.kind().as_str(),
ErrorData::SimpleMessage(msg) => msg.message,
ErrorData::Custom(c) => c.error.description(),
}
}

#[allow(deprecated)]
fn cause(&self) -> Option<&dyn error::Error> {
match self.repr {
Repr::Os(..) => None,
Repr::Simple(..) => None,
Repr::SimpleMessage(..) => None,
Repr::Custom(ref c) => c.error.cause(),
match self.repr.data() {
ErrorData::Os(..) => None,
ErrorData::Simple(..) => None,
ErrorData::SimpleMessage(..) => None,
ErrorData::Custom(c) => c.error.cause(),
}
}

fn source(&self) -> Option<&(dyn error::Error + 'static)> {
match self.repr {
Repr::Os(..) => None,
Repr::Simple(..) => None,
Repr::SimpleMessage(..) => None,
Repr::Custom(ref c) => c.error.source(),
match self.repr.data() {
ErrorData::Os(..) => None,
ErrorData::Simple(..) => None,
ErrorData::SimpleMessage(..) => None,
ErrorData::Custom(c) => c.error.source(),
}
}
}
Expand Down
Loading

0 comments on commit 734368a

Please sign in to comment.