Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check for lengths in ULE and revise safety docs #1121

Merged
merged 5 commits into from
Oct 4, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
200 changes: 104 additions & 96 deletions utils/zerovec/src/ule/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,30 +17,36 @@ use core::{fmt, mem, slice};

/// Fixed-width, byte-aligned data that can be cast to and from a little-endian byte slice.
///
/// Types that are not fixed-width can implement [`VarULE`] instead.
///
/// "ULE" stands for "Unaligned little-endian"
///
/// # Safety
///
/// There must be no padding bytes involved in this type: [`Self::as_byte_slice()`] *must* return
/// a slice of initialized bytes provided that `Self` is initialized.
///
/// [`ULE::validate_byte_slice()`] *must* be implemented to validate a byte slice and return error
/// for the same slices as [`ULE::parse_byte_slice()`].
/// Safety checklist for `ULE`:
///
/// [`ULE::from_bytes_unchecked()`] *must* be implemented to return the same result as [`ULE::parse_byte_slice()`],
/// and both should return slices to the same region of memory.
///
/// [`ULE::as_byte_slice()`] should return a slice that is the in-memory representation of `Self`,
/// i.e. it should be just a pointer cast, and `mem::size_of_val(self) == mem::size_of_val(self.as_byte_slice())`=
/// 1. The type *must not* include any uninitialized or padding bytes.
/// 2. The impl of [`ULE::validate_byte_slice()`] *must* return an error if the given byte slice
/// would not represent a valid slice of this type.
/// 3. The impl of [`ULE::validate_byte_slice()`] *must* return an error if the given byte slice
/// cannot be used in its entirety (if its length is not a multiple of `size_of::<Self>()`).
/// 4. All other methods *must* be left with their default impl, or else implemented according to
/// their respective safety guidelines.
/// 5. Acknowledge the following note about the equality invariant.
///
/// # Equality invariant
///
/// A non-safety invariant is that if `Self` implements `PartialEq`, it *must* be logically equivalent to
/// byte equality on `.as_byte_slice()`. If `PartialEq` does not imply byte-for-byte equality, then
/// `.validate_byte_slice()` should return an error code for any values that are not in canonical form.
/// A non-safety invariant is that if `Self` implements `PartialEq`, the it *must* be logically
/// equivalent to byte equality on [`Self::as_byte_slice()`].
///
/// It may be necessary to introduce a "canonical form" of the ULE if logical equality does not
/// equal byte equality. In such a case, [`Self::validate_byte_slice()`] should return an error
/// for any values that are not in canonical form. For example, the decimal strings "1.23e4" and
/// "12.3e3" are logically equal, but not byte-for-byte equal, so we could define a canonical form
/// where only a single digit is allowed before `.`.
///
/// Failure to follow this invariant will cause surprising behavior in `PartialEq`, which may result in
/// unpredictable operations on `ZeroVec`, `VarZeroVec`, and `ZeroMap`.
/// Failure to follow this invariant will cause surprising behavior in `PartialEq`, which may
/// result in unpredictable operations on `ZeroVec`, `VarZeroVec`, and `ZeroMap`.
pub unsafe trait ULE
where
Self: Sized,
Expand All @@ -51,69 +57,69 @@ where

/// Validates a byte slice, `&[u8]`.
///
/// If `Self` is not well-defined for all possible bit values, the bytes should be validated,
/// that they can be transumted to a `Self` and `Self::Error` should be returned otherwise.
fn validate_byte_slice(_bytes: &[u8]) -> Result<(), Self::Error>;
/// If `Self` is not well-defined for all possible bit values, the bytes should be validated.
/// If the bytes can be transmuted, *in their entirety*, to a valid slice of `Self`, then `Ok`
/// should be returned; otherwise, `Self::Error` should be returned.
fn validate_byte_slice(bytes: &[u8]) -> Result<(), Self::Error>;

/// Parses a byte slice, `&[u8]`, and return it as `&[Self]` with the same lifetime.
///
/// If `Self` is not well-defined for all possible bit values, the bytes should be validated,
/// and `Self::Error` should be returned if they are not valid.
///
/// # Safety
/// and an error should be returned in the same cases as [`Self::validate_byte_slice()`].
///
/// Implementations of this method may involve `unsafe{}` blocks to cast the pointer to the
/// correct type. It is up to the implementation to reason about the safety. Keep in mind that
/// `&[Self]` and `&[u8]` may have different lengths.
/// The default implementation executes [`Self::validate_byte_slice()`] followed by
/// [`Self::from_byte_slice_unchecked`].
///
/// Implementors must return a slice to the same region of memory if parsing succeeds.
///
/// Ideally, implementations call [`ULE::from_byte_slice_unchecked()`] after validation.
///
/// The default implementation executes `Self::validate_byte_slice` followed by
/// `Self::from_byte_slice_unchecked`.
/// Note: The following equality should hold: `bytes.len() % size_of::<Self>() == 0`. This
/// means that the returned slice can span the entire byte slice.
fn parse_byte_slice(bytes: &[u8]) -> Result<&[Self], Self::Error> {
Self::validate_byte_slice(bytes)?;
debug_assert_eq!(bytes.len() % mem::size_of::<Self>(), 0);
Ok(unsafe { Self::from_byte_slice_unchecked(bytes) })
}

/// Takes a byte slice, `&[u8]`, and return it as `&[Self]` with the same lifetime, assuming that
/// this byte slice has previously been run through [`ULE::parse_byte_slice()`] with success.
/// Takes a byte slice, `&[u8]`, and return it as `&[Self]` with the same lifetime, assuming
/// that this byte slice has previously been run through [`Self::parse_byte_slice()`] with
/// success.
///
/// There is no need to perform any validation here, this should almost always be a straight pointer
/// cast.
/// The default implementation performs a pointer cast to the same region of memory.
///
/// # Safety
///
/// ## Callers
///
/// Callers of this method must take care to ensure that `bytes` was previously passed through
/// [`ULE::validate_byte_slice()`] with success (and was not changed since then).
/// [`Self::validate_byte_slice()`] with success (and was not changed since then).
///
/// ## Implementors
/// This method _must_ be implemented to return the same result as [`ULE::parse_byte_slice()`].
///
/// Implementors must return a slice to the same region of memory. The default implementation
/// does this directly.
/// Implementations of this method may call unsafe functions to cast the pointer to the correct
/// type, assuming the "Callers" invariant above.
///
/// Keep in mind that `&[Self]` and `&[u8]` may have different lengths.
///
/// Implementations of this method may involve `unsafe{}` blocks to cast the pointer to the
/// correct type. It is up to the implementation to reason about the safety, assuming the invariant
/// above.
/// Safety checklist:
///
/// 1. This method *must* return the same result as [`Self::parse_byte_slice()`].
/// 2. This method *must* return a slice to the same region of memory as the argument.
#[inline]
unsafe fn from_byte_slice_unchecked(bytes: &[u8]) -> &[Self] {
let data = bytes.as_ptr();
let len = bytes.len() / mem::size_of::<Self>();
debug_assert_eq!(bytes.len() % mem::size_of::<Self>(), 0);
core::slice::from_raw_parts(data as *const Self, len)
}

/// Given `&[Self]`, returns a `&[u8]` with the same lifetime.
///
/// The default implementation performs a pointer cast to the same region of memory.
///
/// # Safety
///
/// The implementation of this function should involve re-casting the pointer.
/// It is up to the implementation to reason about the safety. Keep in mind that `&[Self]` and
/// `&[u8]` may have different lengths (but should cover the same data).
/// Implementations of this method should call potentially unsafe functions to cast the
/// pointer to the correct type.
///
/// The default implementation already does this, however it can be overridden with
/// a fully-safe method if possible.
/// Keep in mind that `&[Self]` and `&[u8]` may have different lengths.
#[inline]
#[allow(clippy::wrong_self_convention)] // https://github.com/rust-lang/rust-clippy/issues/7219
fn as_byte_slice(slice: &[Self]) -> &[u8] {
Expand Down Expand Up @@ -148,14 +154,15 @@ pub trait AsULE {
///
/// # Safety
///
/// This function is infallible because bit validation should have occured when `Self::ULE`
/// This function is infallible because bit validation should have occurred when `Self::ULE`
/// was first constructed. An implementation may therefore involve an `unsafe{}` block, like
/// `from_bytes_unchecked()`.
fn from_unaligned(unaligned: &Self::ULE) -> Self;
}

/// An [`EqULE`] type is one whose byte sequence equals the byte sequence of its ULE type on
/// little-endian platforms. This enables certain performance optimizations.
/// little-endian platforms. This enables certain performance optimizations, such as
/// [`ZeroVec::try_from_slice`](crate::ZeroVec::try_from_slice).
///
/// # Implementation safety
///
Expand Down Expand Up @@ -234,93 +241,94 @@ pub trait AsVarULE {

/// Variable-width, byte-aligned data that can be cast to and from a little-endian byte slice.
///
/// This trait is mostly for unsized types like `str` and `[T]`. It can be implemented on sized types,
/// however it is much more preferable to use [`ULE`] for that purpose.
/// This trait is mostly for unsized types like `str` and `[T]`. It can be implemented on sized types;
/// however, it is much more preferable to use [`ULE`] for that purpose.
///
/// # Safety
///
/// There must be no padding bytes involved in this type: [`Self::as_byte_slice()`] MUST return
/// a slice of initialized bytes provided that `Self` is initialized.
///
/// [`ULE::validate_byte_slice()`] *must* be implemented to validate a byte slice and return error
/// for the same slices as [`ULE::parse_byte_slice()`].
/// Safety checklist for `VarULE`:
///
/// [`VarULE::from_byte_slice_unchecked()`] *must* be implemented to return the same result
/// as [`VarULE::parse_byte_slice()`] provided both are passed the same validly parsing byte slices.
/// Both should return a pointer to the same region of memory that was passed in, covering that region
/// completely.
///
/// [`VarULE::as_byte_slice()`] should return a slice that is the in-memory representation of `Self`,
/// i.e. it should be just a pointer cast, and `mem::size_of_val(self) == mem::size_of_val(self.as_byte_slice())`
/// 1. The type *must not* include any uninitialized or padding bytes.
/// 2. The impl of [`ULE::validate_byte_slice()`] *must* return an error if the given byte slice
/// would not represent a valid slice of this type.
/// 3. The impl of [`ULE::validate_byte_slice()`] *must* return an error if the given byte slice
/// cannot be used in its entirety.
/// 4. All other methods *must* be left with their default impl, or else implemented according to
/// their respective safety guidelines.
/// 5. Acknowledge the following note about the equality invariant.
///
/// # Equality invariant
///
/// A non-safety invariant is that if `Self` implements `PartialEq`, it *must* be logically equivalent to
/// byte equality on `.as_byte_slice()`. If `PartialEq` does not imply byte-for-byte equality, then
/// `.validate_byte_slice()` should return an error code for any values that are not in canonical form.
/// A non-safety invariant is that if `Self` implements `PartialEq`, the it *must* be logically
/// equivalent to byte equality on [`Self::as_byte_slice()`].
///
/// It may be necessary to introduce a "canonical form" of the ULE if logical equality does not
/// equal byte equality. In such a case, [`Self::validate_byte_slice()`] should return an error
/// for any values that are not in canonical form. For example, the decimal strings "1.23e4" and
/// "12.3e3" are logically equal, but not byte-for-byte equal, so we could define a canonical form
/// where only a single digit is allowed before `.`.
///
/// Failure to follow this invariant will cause surprising behavior in `PartialEq`, which may result in
/// unpredictable operations on `ZeroVec`, `VarZeroVec`, and `ZeroMap`.
/// Failure to follow this invariant will cause surprising behavior in `PartialEq`, which may
/// result in unpredictable operations on `ZeroVec`, `VarZeroVec`, and `ZeroMap`.
pub unsafe trait VarULE: 'static {
/// The error type to used by [`VarULE::parse_byte_slice()`]
/// The error that occurs if a byte array is not valid for this ULE.
type Error: fmt::Display;

/// Validates a byte slice, `&[u8]`.
///
/// If `Self` is not well-defined for all possible bit values, the bytes should be validated,
/// that they can be transumted to a `Self` and `Self::Error` should be returned otherwise.
/// If `Self` is not well-defined for all possible bit values, the bytes should be validated.
/// If the bytes can be transmuted, *in their entirety*, to a valid `&Self`, then `Ok` should
/// be returned; otherwise, `Self::Error` should be returned.
fn validate_byte_slice(_bytes: &[u8]) -> Result<(), Self::Error>;

/// Parses a byte slice, `&[u8]`, and return it as `&Self` with the same lifetime.
///
/// If `Self` is not well-defined for all possible bit values, the bytes should be validated,
/// and `Self::Error` should be returned if they are not valid.
///
/// # Safety
/// and an error should be returned in the same cases as [`Self::validate_byte_slice()`].
///
/// Implementations of this method may involve `unsafe{}` blocks to cast the pointer to the
/// correct type. It is up to the implementation to reason about the safety.
/// The default implementation executes [`Self::validate_byte_slice()`] followed by
/// [`Self::from_byte_slice_unchecked`].
///
/// Implementors must return a pointer to the same region of memory if parsing succeeds.
///
/// The default implementation executes `Self::validate_byte_slice` followed by
/// `Self::from_byte_slice_unchecked`.
/// Note: The following equality should hold: `size_of_val(result) == size_of_val(bytes)`,
/// where `result` is the successful return value of the method. This means that the return
/// value spans the entire byte slice.
fn parse_byte_slice(bytes: &[u8]) -> Result<&Self, Self::Error> {
Self::validate_byte_slice(bytes)?;
Ok(unsafe { Self::from_byte_slice_unchecked(bytes) })
let result = unsafe { Self::from_byte_slice_unchecked(bytes) };
debug_assert_eq!(mem::size_of_val(result), mem::size_of_val(bytes));
Ok(result)
}

/// Takes a byte slice, `&[u8]`, and return it as `&self` with the same lifetime, assuming that
/// this byte slice has previously been run through [`VarULE::parse_byte_slice()`] with success.
///
/// There is no need to perform any validation here, this should almost always be a straight pointer
/// cast.
/// Takes a byte slice, `&[u8]`, and return it as `&Self` with the same lifetime, assuming
/// that this byte slice has previously been run through [`Self::parse_byte_slice()`] with
/// success.
///
/// # Safety
///
/// ## Callers
///
/// Callers of this method must take care to ensure that `bytes` was previously passed through
/// [`VarULE::validate_byte_slice()`] with success (and was not changed since then).
/// [`Self::validate_byte_slice()`] with success (and was not changed since then).
///
/// ## Implementors
/// This method _must_ be implemented to return the same result as [`VarULE::parse_byte_slice()`].
///
/// Implementors must return a pointer to the same region of memory.
/// Implementations of this method may call unsafe functions to cast the pointer to the correct
/// type, assuming the "Callers" invariant above.
///
/// Implementations of this method may involve `unsafe{}` blocks to cast the pointer to the
/// correct type. It is up to the implementation to reason about the safety, assuming the invariant
/// above.
/// Safety checklist:
///
/// 1. This method *must* return the same result as [`Self::parse_byte_slice()`].
/// 2. This method *must* return a slice to the same region of memory as the argument.
unsafe fn from_byte_slice_unchecked(bytes: &[u8]) -> &Self;

/// Given `&Self`, returns a `&[u8]` with the same lifetime.
///
/// # Safety
/// The default implementation performs a pointer cast to the same region of memory.
///
/// The implementation of this function should involve re-casting the pointer.
/// It is up to the implementation to reason about the safety.
/// # Safety
///
/// The default implementation already does this, however it can be overridden with
/// a fully-safe method if possible.
/// Implementations of this method should call potentially unsafe functions to cast the
/// pointer to the correct type.
#[inline]
fn as_byte_slice(&self) -> &[u8] {
unsafe { slice::from_raw_parts(self as *const Self as *const u8, mem::size_of_val(self)) }
Expand Down
25 changes: 13 additions & 12 deletions utils/zerovec/src/zerovec/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -579,19 +579,19 @@ mod tests {
);
assert_eq!(
Some(0x04000201),
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[1..])
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[1..77])
.unwrap()
.get(0)
);
assert_eq!(
Some(0x05040002),
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[2..])
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[2..78])
.unwrap()
.get(0)
);
assert_eq!(
Some(0x06050400),
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[3..])
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[3..79])
.unwrap()
.get(0)
);
Expand All @@ -603,13 +603,13 @@ mod tests {
);
assert_eq!(
Some(0x4e4d4c00),
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[75..])
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[75..79])
.unwrap()
.get(0)
);
assert_eq!(
Some(0x4e4d4c00),
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[3..])
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[3..79])
.unwrap()
.get(18)
);
Expand All @@ -625,12 +625,13 @@ mod tests {
.unwrap()
.get(19)
);
assert_eq!(
None,
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[77..])
.unwrap()
.get(0)
);
// TODO(#1144): Check for correct slice length in PlainOldULE
// assert_eq!(
// None,
// ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[77..])
// .unwrap()
// .get(0)
// );
assert_eq!(
None,
ZeroVec::<u32>::parse_byte_slice(TEST_BUFFER_LE)
Expand All @@ -639,7 +640,7 @@ mod tests {
);
assert_eq!(
None,
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[3..])
ZeroVec::<u32>::parse_byte_slice(&TEST_BUFFER_LE[3..79])
.unwrap()
.get(19)
);
Expand Down