
Commit e3be230

Add TryFromBytes trait (#641)
`TryFromBytes` can be implemented for types which are not `FromZeroes` or `FromBytes`; it supports performing a runtime check to determine whether a given byte sequence contains a valid instance of `Self`.

This is the first step of #5, and only adds support for some internals. Future commits will add a richer public API, implementations of `TryFromBytes` for built-in types, support for a custom derive, and support for implementing `TryFromBytes` on unsized types.

Makes progress on #5
1 parent eb922ca commit e3be230
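
The runtime check described in the commit message can be illustrated with a minimal sketch. The trait `CheckedFromBytes` and its method `try_read` are hypothetical names invented for this illustration; they are not part of zerocopy, whose actual trait in this commit is `unsafe` and takes a `Ptr` rather than a byte slice. The sketch only shows the idea of rejecting byte sequences that do not encode a valid value, and it assumes native byte order for `char`.

```rust
use core::convert::TryFrom;

/// Hypothetical, simplified stand-in for the idea behind `TryFromBytes`.
/// This is NOT zerocopy's API; it is a safe model for illustration only.
pub trait CheckedFromBytes: Sized {
    /// Returns `Some(value)` only if `bytes` is a valid encoding of `Self`.
    fn try_read(bytes: &[u8]) -> Option<Self>;
}

impl CheckedFromBytes for bool {
    fn try_read(bytes: &[u8]) -> Option<Self> {
        // A `bool` occupies one byte, and only 0x00 and 0x01 are valid.
        match bytes {
            [0] => Some(false),
            [1] => Some(true),
            _ => None,
        }
    }
}

impl CheckedFromBytes for char {
    fn try_read(bytes: &[u8]) -> Option<Self> {
        // A `char` is a 4-byte Unicode scalar value. Native byte order is an
        // assumption of this sketch.
        let raw = u32::from_ne_bytes(<[u8; 4]>::try_from(bytes).ok()?);
        // `char::from_u32` rejects surrogates and values above 0x10FFFF.
        char::from_u32(raw)
    }
}
```

Under these assumptions, `<bool as CheckedFromBytes>::try_read(&[2])` yields `None`, whereas a type that accepts every bit pattern (such as `u8`) would never need such a check.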

2 files changed: +129 -1 lines changed

src/lib.rs

Lines changed: 126 additions & 0 deletions
@@ -288,6 +288,10 @@ use alloc::{boxed::Box, vec::Vec};
 #[cfg(any(feature = "alloc", kani))]
 use core::alloc::Layout;
 
+// Used by `TryFromBytes::is_bit_valid`.
+#[doc(hidden)]
+pub use crate::util::ptr::Ptr;
+
 // For each polyfill, as soon as the corresponding feature is stable, the
 // polyfill import will be unused because method/function resolution will prefer
 // the inherent method/function over a trait method/function. Thus, we suppress
@@ -1051,6 +1055,128 @@ safety_comment! {
 #[cfg_attr(doc_cfg, doc(cfg(feature = "derive")))]
 pub use zerocopy_derive::FromZeroes;
 
+/// Types whose validity can be checked at runtime, allowing them to be
+/// conditionally converted from byte slices.
+///
+/// WARNING: Do not implement this trait yourself! Instead, use
+/// `#[derive(TryFromBytes)]`.
+///
+/// `TryFromBytes` types can safely be deserialized from an untrusted sequence
+/// of bytes by performing a runtime check that the byte sequence contains a
+/// valid instance of `Self`.
+///
+/// `TryFromBytes` is ignorant of byte order. For byte order-aware types, see
+/// the [`byteorder`] module.
+///
+/// # What is a "valid instance"?
+///
+/// In Rust, each type has *bit validity*, which refers to the set of bit
+/// patterns which may appear in an instance of that type. It is impossible for
+/// safe Rust code to produce values which violate bit validity (i.e., values
+/// outside of the "valid" set of bit patterns). If `unsafe` code produces an
+/// invalid value, this is considered [undefined behavior].
+///
+/// Rust's bit validity rules are currently being decided, which means that some
+/// types have three classes of bit patterns: those which are definitely valid,
+/// and whose validity is documented in the language; those which may or may not
+/// be considered valid at some point in the future; and those which are
+/// definitely invalid.
+///
+/// Zerocopy takes a conservative approach, and only considers a bit pattern to
+/// be valid if its validity is a documented guarantee provided by the
+/// language.
+///
+/// For most use cases, Rust's current guarantees align with programmers'
+/// intuitions about what ought to be valid. As a result, zerocopy's
+/// conservatism should not affect most users. One notable exception is unions,
+/// whose bit validity is very up in the air; zerocopy does not permit
+/// implementing `TryFromBytes` for any union type.
+///
+/// If you are negatively affected by lack of support for a particular type,
+/// we encourage you to let us know by [filing an issue][github-repo].
+///
+/// # Safety
+///
+/// On its own, `T: TryFromBytes` does not make any guarantees about the layout
+/// or representation of `T`. It merely provides the ability to perform a
+/// validity check at runtime via methods like [`try_from_ref`].
+///
+/// Currently, it is not possible to stably implement `TryFromBytes` other than
+/// by using `#[derive(TryFromBytes)]`. While there are `#[doc(hidden)]` items
+/// on this trait that provide well-defined safety invariants, no stability
+/// guarantees are made with respect to these items. In particular, future
+/// releases of zerocopy may make backwards-breaking changes to these items,
+/// including changes that only affect soundness, which may cause code which
+/// uses those items to silently become unsound.
+///
+/// [undefined behavior]: https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html
+/// [github-repo]: https://github.com/google/zerocopy
+/// [`try_from_ref`]: #
+// TODO(#5): Update `try_from_ref` doc link once it exists
+#[doc(hidden)]
+pub unsafe trait TryFromBytes {
+    /// Does a given memory range contain a valid instance of `Self`?
+    ///
+    /// # Safety
+    ///
+    /// ## Preconditions
+    ///
+    /// The memory referenced by `candidate` may only be accessed via reads for
+    /// the duration of this method call. This prohibits writes through mutable
+    /// references and through [`UnsafeCell`]s. There may exist immutable
+    /// references to the same memory which contain `UnsafeCell`s so long as:
+    /// - Those `UnsafeCell`s exist at the same byte ranges as `UnsafeCell`s in
+    ///   `Self`. This is a bidirectional property: `Self` may not contain
+    ///   `UnsafeCell`s where other references to the same memory do not, and
+    ///   vice-versa.
+    /// - Those `UnsafeCell`s are never used to perform mutation for the
+    ///   duration of this method call.
+    ///
+    /// `candidate` is not required to refer to a valid `Self`. However, it must
+    /// satisfy the requirement that uninitialized bytes may only be present
+    /// where it is possible for them to be present in `Self`. This is a dynamic
+    /// property: if, at a particular byte offset, a valid enum discriminant is
+    /// set, the subsequent bytes may only have uninitialized bytes as
+    /// specified by the corresponding enum.
+    ///
+    /// Formally, given `len = size_of_val_raw(candidate)`, at every byte
+    /// offset, `b`, in the range `[0, len)`:
+    /// - If, in all instances `s: Self` of length `len`, the byte at offset `b`
+    ///   in `s` is initialized, then the byte at offset `b` within `*candidate`
+    ///   must be initialized.
+    /// - Let `c` be the contents of the byte range `[0, b)` in `*candidate`.
+    ///   Let `S` be the subset of valid instances of `Self` of length `len`
+    ///   which contain `c` in the offset range `[0, b)`. If, for all instances
+    ///   of `s: Self` in `S`, the byte at offset `b` in `s` is initialized,
+    ///   then the byte at offset `b` in `*candidate` must be initialized.
+    ///
+    ///   Pragmatically, this means that if `*candidate` is guaranteed to
+    ///   contain an enum type at a particular offset, and the enum discriminant
+    ///   stored in `*candidate` corresponds to a valid variant of that enum
+    ///   type, then it is guaranteed that the appropriate bytes of `*candidate`
+    ///   are initialized as defined by that variant's bit validity (although
+    ///   note that the variant may contain another enum type, in which case the
+    ///   same rules apply depending on the state of its discriminant, and so on
+    ///   recursively).
+    ///
+    /// ## Postconditions
+    ///
+    /// Unsafe code may assume that, if `is_bit_valid(candidate)` returns true,
+    /// `*candidate` contains a valid `Self`.
+    ///
+    /// # Panics
+    ///
+    /// `is_bit_valid` may panic. Callers are responsible for ensuring that any
+    /// `unsafe` code remains sound even in the face of `is_bit_valid`
+    /// panicking. (We support user-defined validation routines; so long as
+    /// these routines are not required to be `unsafe`, there is no way to
+    /// ensure that these do not generate panics.)
+    ///
+    /// [`UnsafeCell`]: core::cell::UnsafeCell
+    #[doc(hidden)]
+    unsafe fn is_bit_valid(candidate: Ptr<'_, Self>) -> bool;
+}
+
 /// Types for which a sequence of bytes all set to zero represents a valid
 /// instance of the type.
 ///
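
The initialization precondition on `is_bit_valid` in the hunk above is a dynamic property, which a small safe model may make easier to follow. Everything in this sketch is an assumption made for illustration: the two-byte tagged layout, the `ModelByte` alias (where `None` stands for an uninitialized byte), and the function name. It models only the precondition, not the validity check itself, and it is not how zerocopy represents memory.

```rust
/// Models one byte of `*candidate`: `None` stands for an uninitialized byte.
/// (Hypothetical model; real uninitialized memory cannot be inspected safely.)
type ModelByte = Option<u8>;

/// Checks the initialization precondition for a hypothetical two-byte layout:
///   offset 0: tag (0 = `Empty`, 1 = `Byte`)
///   offset 1: payload of `Byte`; unused, and possibly uninitialized, for `Empty`
fn init_precondition_holds(candidate: &[ModelByte; 2]) -> bool {
    match candidate[0] {
        // Every valid instance has an initialized tag, so the tag byte of the
        // candidate must be initialized too.
        None => false,
        // Tag 0 selects `Empty`: valid instances exist whose byte at offset 1
        // is uninitialized, so nothing further is required.
        Some(0) => true,
        // Tag 1 selects `Byte`: every valid instance with this prefix has an
        // initialized payload, so the candidate's payload must be initialized.
        Some(1) => candidate[1].is_some(),
        // Unrecognized tag: this sketch conservatively requires the payload to
        // be initialized. That choice is an assumption of the sketch.
        Some(_) => candidate[1].is_some(),
    }
}
```

Note that a candidate can satisfy this precondition and still fail the validity check; for example, `[Some(7), Some(0)]` passes here although tag 7 names no variant in the assumed layout.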

src/util.rs

Lines changed: 3 additions & 1 deletion
@@ -38,7 +38,7 @@ pub(crate) mod ptr {
     /// `Ptr<'a, T>` is [covariant] in `'a` and `T`.
     ///
     /// [covariant]: https://doc.rust-lang.org/reference/subtyping.html
-    pub(crate) struct Ptr<'a, T: 'a + ?Sized> {
+    pub struct Ptr<'a, T: 'a + ?Sized> {
         // INVARIANTS:
         // - `ptr` is derived from some valid Rust allocation, `A`
         // - `ptr` has the same provenance as `A`
@@ -72,6 +72,7 @@ pub(crate) mod ptr {
 
     impl<'a, T: ?Sized> Copy for Ptr<'a, T> {}
     impl<'a, T: ?Sized> Clone for Ptr<'a, T> {
+        #[inline]
         fn clone(&self) -> Self {
             *self
         }
@@ -294,6 +295,7 @@ pub(crate) mod ptr {
     }
 
     impl<'a, T: 'a + ?Sized> Debug for Ptr<'a, T> {
+        #[inline]
         fn fmt(&self, f: &mut Formatter<'_>) -> core::fmt::Result {
             self.ptr.fmt(f)
         }
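
The hand-written `Copy`, `Clone`, and `Debug` impls touched in the hunks above can be reproduced on a small stand-in type. `RawRef`, its fields, and its `from_ref` and `get` methods are hypothetical; the real `Ptr` is only partially visible in this diff, so the `NonNull` field and the `PhantomData` marker are assumptions. The sketch shows one reason such impls are often written by hand: `#[derive(Clone, Copy)]` would add `T: Clone` and `T: Copy` bounds, which would exclude unsized pointees such as `str`.

```rust
use core::fmt::{self, Debug, Formatter};
use core::marker::PhantomData;
use core::ptr::NonNull;

/// Hypothetical stand-in for a lifetime-carrying raw pointer wrapper.
/// The field layout is an assumption, not zerocopy's actual `Ptr`.
pub struct RawRef<'a, T: 'a + ?Sized> {
    ptr: NonNull<T>,
    _lifetime: PhantomData<&'a T>,
}

impl<'a, T: ?Sized> RawRef<'a, T> {
    /// Builds the wrapper from a shared reference.
    pub fn from_ref(r: &'a T) -> Self {
        RawRef { ptr: NonNull::from(r), _lifetime: PhantomData }
    }

    /// Recovers the shared reference.
    pub fn get(&self) -> &'a T {
        // SAFETY: `ptr` was created from a `&'a T` in `from_ref`, so it is
        // non-null, aligned, and valid for reads for `'a`; this type never
        // writes through it.
        unsafe { self.ptr.as_ref() }
    }
}

// Manual impls: no `T: Copy` or `T: Clone` bound is needed, because only the
// pointer is copied, never the pointee.
impl<'a, T: ?Sized> Copy for RawRef<'a, T> {}
impl<'a, T: ?Sized> Clone for RawRef<'a, T> {
    #[inline]
    fn clone(&self) -> Self {
        *self
    }
}

impl<'a, T: 'a + ?Sized> Debug for RawRef<'a, T> {
    #[inline]
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // Delegates to the `Debug` impl of the inner `NonNull<T>`.
        self.ptr.fmt(f)
    }
}
```

With these impls, a `RawRef<'_, str>` can be copied freely even though `str` is neither `Sized` nor `Copy`.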
