Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document unsafety in src/libcore/slice/mod.rs #73555

Closed
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
104 changes: 94 additions & 10 deletions src/libcore/slice/mod.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
// ignore-tidy-filelength
// ignore-tidy-undocumented-unsafe

//! Slice management and manipulation.
//!
Expand Down Expand Up @@ -70,6 +69,8 @@ impl<T> [T] {
#[allow(unused_attributes)]
#[allow_internal_unstable(const_fn_union)]
pub const fn len(&self) -> usize {
// SAFETY: this is safe because `&[T]` and `FatPtr<T>` have the same layout.
// Only `std` can make this guarantee.
unsafe { crate::ptr::Repr { rust: self }.raw.len }
}

Expand Down Expand Up @@ -437,7 +438,8 @@ impl<T> [T] {
#[unstable(feature = "slice_ptr_range", issue = "65807")]
#[inline]
pub fn as_ptr_range(&self) -> Range<*const T> {
// The `add` here is safe, because:
let start = self.as_ptr();
// SAFETY: The `add` here is safe, because:
//
// - Both pointers are part of the same object, as pointing directly
// past the object also counts.
Expand All @@ -454,7 +456,6 @@ impl<T> [T] {
// the end of the address space.
//
// See the documentation of pointer::add.
let start = self.as_ptr();
let end = unsafe { start.add(self.len()) };
start..end
}
Expand All @@ -478,8 +479,8 @@ impl<T> [T] {
#[unstable(feature = "slice_ptr_range", issue = "65807")]
#[inline]
pub fn as_mut_ptr_range(&mut self) -> Range<*mut T> {
// See as_ptr_range() above for why `add` here is safe.
let start = self.as_mut_ptr();
// SAFETY: See as_ptr_range() above for why `add` here is safe.
let end = unsafe { start.add(self.len()) };
start..end
}
Expand All @@ -505,6 +506,8 @@ impl<T> [T] {
#[stable(feature = "rust1", since = "1.0.0")]
#[inline]
pub fn swap(&mut self, a: usize, b: usize) {
// SAFETY: `pa` and `pb` have been created from safe mutable references and refer
// to elements in the slice and therefore are guaranteed to be valid and aligned.
unsafe {
// Can't take two mutable loans from one vector, so instead just cast
// them to their raw pointers to do the swap
Expand Down Expand Up @@ -548,6 +551,10 @@ impl<T> [T] {
// Use the llvm.bswap intrinsic to reverse u8s in a usize
let chunk = mem::size_of::<usize>();
while i + chunk - 1 < ln / 2 {
// SAFETY: the condition of the `while` guarantees that
// `i` and `ln - i - chunk` are inside the slice.
// The resulting pointers `pa` and `pb` are therefore valid,
// and can be read from and written to.
unsafe {
let pa: *mut T = self.get_unchecked_mut(i);
let pb: *mut T = self.get_unchecked_mut(ln - i - chunk);
Expand All @@ -564,6 +571,10 @@ impl<T> [T] {
// Use rotate-by-16 to reverse u16s in a u32
let chunk = mem::size_of::<u32>() / 2;
while i + chunk - 1 < ln / 2 {
// SAFETY: the condition of the `while` guarantees that
// `i` and `ln - i - chunk` are inside the slice.
// The resulting pointers `pa` and `pb` are therefore valid,
// and can be read from and written to.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments also need to touch on the larger-pieced bytes being still in-bounds based on the indexing. Personally I would like to see some short proof-y wording, maybe something like this.

Personally I feel like just saying "well the condition guarantees things" doesn't help much since it doesn't allow for modifying or understanding why this is the case.

An unaligned u32 can be read from i if i + 1 < ln (and obviously i < ln),
because each element is 2 bytes and we're reading 4.
i + chunk - 1 < ln / 2 # while condition
i + 2 - 1 < ln / 2
i + 1 < ln / 2
Since it's less than the length divided by 2, then it must be in bounds.

unsafe {
let pa: *mut T = self.get_unchecked_mut(i);
let pb: *mut T = self.get_unchecked_mut(ln - i - chunk);
Expand All @@ -577,8 +588,12 @@ impl<T> [T] {
}

while i < ln / 2 {
// Unsafe swap to avoid the bounds check in safe swap.
// SAFETY: the condition of the `while` guarantees that `i` and `ln - i - 1`
// are inside the slice and refer to an element inside the slice.
// The resulting pointers `pa` and `pb` are therefore valid and aligned,
// and can be read from and written to.
unsafe {
// Unsafe swap to avoid the bounds check in safe swap.
let pa: *mut T = self.get_unchecked_mut(i);
let pb: *mut T = self.get_unchecked_mut(ln - i - 1);
ptr::swap(pa, pb);
Expand All @@ -603,6 +618,9 @@ impl<T> [T] {
#[stable(feature = "rust1", since = "1.0.0")]
#[inline]
pub fn iter(&self) -> Iter<'_, T> {
// SAFETY: adding `self.len()` to the starting pointer gives a pointer
// at the end of `self`, which fulfills the expectations of `ptr.add()`
// and `NonNull::new_unchecked()`.
unsafe {
let ptr = self.as_ptr();
assume(!ptr.is_null());
Expand Down Expand Up @@ -631,6 +649,9 @@ impl<T> [T] {
#[stable(feature = "rust1", since = "1.0.0")]
#[inline]
pub fn iter_mut(&mut self) -> IterMut<'_, T> {
// SAFETY: adding `self.len()` to the starting pointer gives a pointer
// at the end of `self`, which fulfills the expectations of `ptr.add()`
// and `NonNull::new_unchecked()`.
unsafe {
let ptr = self.as_mut_ptr();
assume(!ptr.is_null());
Expand Down Expand Up @@ -1062,6 +1083,8 @@ impl<T> [T] {
let len = self.len();
let ptr = self.as_mut_ptr();

// SAFETY: `[ptr;mid]` and `[mid;len]` are inside `self`, which fulfills the
// requirements of `from_raw_parts_mut`.
unsafe {
assert!(mid <= len);

Expand Down Expand Up @@ -1548,14 +1571,14 @@ impl<T> [T] {
while size > 1 {
let half = size / 2;
let mid = base + half;
// mid is always in [0, size), that means mid is >= 0 and < size.
// SAFETY: mid is always in [0, size), that means mid is >= 0 and < size.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SAFETY: mid is always in [0, size), that means mid is >= 0 and < size.
// SAFETY:

Obviously pre-existing but this is not really saying anything and the next two lines are precisely what we need

// mid >= 0: by definition
// mid < size: mid = size / 2 + size / 4 + size / 8 ...
let cmp = f(unsafe { s.get_unchecked(mid) });
base = if cmp == Greater { base } else { mid };
size -= half;
}
// base is always in [0, size) because base <= mid.
// SAFETY: base is always in [0, size) because base <= mid.
let cmp = f(unsafe { s.get_unchecked(base) });
if cmp == Equal { Ok(base) } else { Err(base + (cmp == Less) as usize) }
}
Expand Down Expand Up @@ -2013,6 +2036,13 @@ impl<T> [T] {
let mut next_read: usize = 1;
let mut next_write: usize = 1;

// SAFETY: the `while` condition guarantees `next_read` and `next_write`
// are less than `len`, thus are inside `self`. `prev_ptr_write` points to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make a note that next_write is incremented at most once per loop somewhere in here.

// one element before `ptr_write`, but `next_write` starts at 1, so
// `prev_ptr_write` is never less than 0 and is inside the slice.
// This fulfils the requirements for dereferencing `ptr_read`, `prev_ptr_write`
// and `ptr_write`, and for using `ptr.add(next_read)`, `ptr.add(next_write - 1)`
// and `prev_ptr_write.offset(1)`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW some of these constructions seem pretty strange, I wouldn't want to edit the code in this PR but ptr.add(next_write - 1) seems like it should be next_write.sub(1).

unsafe {
// Avoid bounds checks by using raw pointers.
while next_read < len {
Expand Down Expand Up @@ -2097,6 +2127,8 @@ impl<T> [T] {
assert!(mid <= self.len());
let k = self.len() - mid;

// SAFETY: `[mid - mid;mid+k]` corresponds to the entire
// `self` slice, thus is valid for reads and writes.
unsafe {
let p = self.as_mut_ptr();
rotate::ptr_rotate(mid, p.add(mid), k);
Expand Down Expand Up @@ -2138,6 +2170,8 @@ impl<T> [T] {
assert!(k <= self.len());
let mid = self.len() - k;

// SAFETY: `[mid - mid;mid+k]` corresponds to the entire
// `self` slice, thus is valid for reads and writes.
unsafe {
let p = self.as_mut_ptr();
rotate::ptr_rotate(mid, p.add(mid), k);
Expand Down Expand Up @@ -2300,6 +2334,9 @@ impl<T> [T] {
T: Copy,
{
assert_eq!(self.len(), src.len(), "destination and source slices have different lengths");
// SAFETY: `self` is valid for `self.len()` bytes by definition, and `src` was
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// SAFETY: `self` is valid for `self.len()` bytes by definition, and `src` was
// SAFETY: `self` is valid for `self.len()` elements by definition, and `src` was

// checked to have the same length. Both slices cannot be overlapping because
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// checked to have the same length. Both slices cannot be overlapping because
// checked to have the same length. The slices cannot overlap because

// Rust's mutable references are exclusive.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Rust's mutable references are exclusive.
// mutable references are exclusive.

unsafe {
ptr::copy_nonoverlapping(src.as_ptr(), self.as_mut_ptr(), self.len());
}
Expand Down Expand Up @@ -2353,6 +2390,7 @@ impl<T> [T] {
assert!(src_end <= self.len(), "src is out of bounds");
let count = src_end - src_start;
assert!(dest <= self.len() - count, "dest is out of bounds");
// SAFETY: the conditions for `ptr::copy` have been checked above.
unsafe {
ptr::copy(self.as_ptr().add(src_start), self.as_mut_ptr().add(dest), count);
}
Expand Down Expand Up @@ -2408,6 +2446,9 @@ impl<T> [T] {
#[stable(feature = "swap_with_slice", since = "1.27.0")]
pub fn swap_with_slice(&mut self, other: &mut [T]) {
assert!(self.len() == other.len(), "destination and source slices have different lengths");
// SAFETY: `self` is valid for `self.len()` bytes by definition, and `src` was
// checked to have the same length. Both slices cannot be overlapping because
// Rust's mutable references are exclusive.
unsafe {
ptr::swap_nonoverlapping(self.as_mut_ptr(), other.as_mut_ptr(), self.len());
}
Expand Down Expand Up @@ -2439,6 +2480,8 @@ impl<T> [T] {
// iterative stein’s algorithm
// We should still make this `const fn` (and revert to recursive algorithm if we do)
// because relying on llvm to consteval all this is… well, it makes me uncomfortable.

// SAFETY: `a` and `b` are checked to be non-zero values.
let (ctz_a, mut ctz_b) = unsafe {
if a == 0 {
return b;
Expand All @@ -2458,6 +2501,7 @@ impl<T> [T] {
mem::swap(&mut a, &mut b);
}
b = b - a;
// SAFETY: `b` is checked to be non-zero.
unsafe {
if b == 0 {
break;
Expand Down Expand Up @@ -2848,11 +2892,13 @@ impl<T> SliceIndex<[T]> for usize {

#[inline]
fn get(self, slice: &[T]) -> Option<&T> {
// SAFETY: `self` is checked to be in bounds.
if self < slice.len() { unsafe { Some(self.get_unchecked(slice)) } } else { None }
}

#[inline]
fn get_mut(self, slice: &mut [T]) -> Option<&mut T> {
// SAFETY: `self` is checked to be in bounds.
if self < slice.len() { unsafe { Some(self.get_unchecked_mut(slice)) } } else { None }
}

Expand Down Expand Up @@ -2888,6 +2934,7 @@ impl<T> SliceIndex<[T]> for ops::Range<usize> {
if self.start > self.end || self.end > slice.len() {
None
} else {
// SAFETY: `self` is checked to be valid and in bounds above.
unsafe { Some(self.get_unchecked(slice)) }
}
}
Expand All @@ -2897,6 +2944,7 @@ impl<T> SliceIndex<[T]> for ops::Range<usize> {
if self.start > self.end || self.end > slice.len() {
None
} else {
// SAFETY: `self` is checked to be valid and in bounds above.
unsafe { Some(self.get_unchecked_mut(slice)) }
}
}
Expand All @@ -2918,6 +2966,7 @@ impl<T> SliceIndex<[T]> for ops::Range<usize> {
} else if self.end > slice.len() {
slice_index_len_fail(self.end, slice.len());
}
// SAFETY: `self` is checked to be valid and in bounds above.
unsafe { self.get_unchecked(slice) }
}

Expand All @@ -2928,6 +2977,7 @@ impl<T> SliceIndex<[T]> for ops::Range<usize> {
} else if self.end > slice.len() {
slice_index_len_fail(self.end, slice.len());
}
// SAFETY: `self` is checked to be valid and in bounds above.
unsafe { self.get_unchecked_mut(slice) }
}
}
Expand Down Expand Up @@ -3239,6 +3289,8 @@ macro_rules! iterator {
// Helper function for creating a slice from the iterator.
#[inline(always)]
fn make_slice(&self) -> &'a [T] {
// SAFETY: the iterator was created from a slice with pointer `self.ptr` and length `len!(self)`.
// This guarantees that all the prerequisites for `from_raw_parts` are fulfilled.
unsafe { from_raw_parts(self.ptr.as_ptr(), len!(self)) }
}

Expand Down Expand Up @@ -3292,6 +3344,10 @@ macro_rules! iterator {
#[inline]
fn next(&mut self) -> Option<$elem> {
// could be implemented with slices, but this avoids bounds checks

// SAFETY: `assume` calls are safe since a slice's start pointer must be non-null,
// and slices over non-ZSTs must also have a non-null end pointer.
// The call to `next_unchecked!` is safe since we check if the iterator is empty first.
unsafe {
assume(!self.ptr.as_ptr().is_null());
if mem::size_of::<T>() != 0 {
Expand Down Expand Up @@ -3325,14 +3381,14 @@ macro_rules! iterator {
// could be (due to wrapping).
self.end = self.ptr.as_ptr();
} else {
// SAFETY: end can't be 0 if T isn't ZST because ptr isn't 0 and end >= ptr
unsafe {
// End can't be 0 if T isn't ZST because ptr isn't 0 and end >= ptr
self.ptr = NonNull::new_unchecked(self.end as *mut T);
}
}
return None;
}
// We are in bounds. `post_inc_start` does the right thing even for ZSTs.
// SAFETY: we are in bounds. `post_inc_start` does the right thing even for ZSTs.
unsafe {
self.post_inc_start(n as isize);
Some(next_unchecked!(self))
Expand Down Expand Up @@ -3439,6 +3495,8 @@ macro_rules! iterator {
let mut i = 0;
while let Some(x) = self.next() {
if predicate(x) {
// SAFETY: we are guaranteed to be in bounds by the loop invariant:
// when `i >= n`, `self.next()` returns `None` and the loop breaks.
unsafe { assume(i < n) };
return Some(i);
}
Expand All @@ -3460,6 +3518,8 @@ macro_rules! iterator {
while let Some(x) = self.next_back() {
i -= 1;
if predicate(x) {
// SAFETY: `i` must be lower than `n` since it starts at `n`
// and is only decreasing.
unsafe { assume(i < n) };
return Some(i);
}
Expand All @@ -3475,6 +3535,10 @@ macro_rules! iterator {
#[inline]
fn next_back(&mut self) -> Option<$elem> {
// could be implemented with slices, but this avoids bounds checks

// SAFETY: `assume` calls are safe since a slice's start pointer must be non-null,
// and slices over non-ZSTs must also have a non-null end pointer.
// The call to `next_back_unchecked!` is safe since we check if the iterator is empty first.
unsafe {
assume(!self.ptr.as_ptr().is_null());
if mem::size_of::<T>() != 0 {
Expand All @@ -3495,7 +3559,7 @@ macro_rules! iterator {
self.end = self.ptr.as_ptr();
return None;
}
// We are in bounds. `pre_dec_end` does the right thing even for ZSTs.
// SAFETY: we are in bounds. `pre_dec_end` does the right thing even for ZSTs.
unsafe {
self.pre_dec_end(n as isize);
Some(next_back_unchecked!(self))
Expand Down Expand Up @@ -3690,6 +3754,8 @@ impl<'a, T> IterMut<'a, T> {
/// ```
#[stable(feature = "iter_to_slice", since = "1.4.0")]
pub fn into_slice(self) -> &'a mut [T] {
// SAFETY: the iterator was created from a mutable slice with pointer `self.ptr` and length `len!(self)`.
// This guarantees that all the prerequisites for `from_raw_parts_mut` are fulfilled.
unsafe { from_raw_parts_mut(self.ptr.as_ptr(), len!(self)) }
}

Expand Down Expand Up @@ -5855,12 +5921,20 @@ pub unsafe fn from_raw_parts_mut<'a, T>(data: *mut T, len: usize) -> &'a mut [T]
/// Converts a reference to T into a slice of length 1 (without copying).
#[stable(feature = "from_ref", since = "1.28.0")]
pub fn from_ref<T>(s: &T) -> &[T] {
// SAFETY: a reference is guaranteed to be valid for reads. The returned
// reference cannot be mutated as it is an immutable reference.
// `mem::size_of::<T>()` cannot be larger than `isize::MAX`.
// Thus the call to `from_raw_parts` is safe.
unsafe { from_raw_parts(s, 1) }
}

/// Converts a reference to T into a slice of length 1 (without copying).
#[stable(feature = "from_ref", since = "1.28.0")]
pub fn from_mut<T>(s: &mut T) -> &mut [T] {
// SAFETY: a mutable reference is guaranteed to be valid for writes.
// The reference cannot be accessed by another pointer as it is an mutable reference.
// `mem::size_of::<T>()` cannot be larger than `isize::MAX`.
// Thus the call to `from_raw_parts_mut` is safe.
unsafe { from_raw_parts_mut(s, 1) }
}

Expand Down Expand Up @@ -5993,6 +6067,9 @@ where
if self.as_ptr().guaranteed_eq(other.as_ptr()) {
return true;
}

// SAFETY: `self` and `other` are references and are thus guaranteed to be valid.
// The two slices have been checked to have the same size above.
unsafe {
let size = mem::size_of_val(self);
memcmp(self.as_ptr() as *const u8, other.as_ptr() as *const u8, size) == 0
Expand Down Expand Up @@ -6095,6 +6172,9 @@ impl SliceOrd for u8 {
#[inline]
fn compare(left: &[Self], right: &[Self]) -> Ordering {
let order =
// SAFETY: `left` and `right` are references and are thus guaranteed to be valid.
// We use the minimum of both lengths which guarantees that both regions are
// valid for reads in that interval.
unsafe { memcmp(left.as_ptr(), right.as_ptr(), cmp::min(left.len(), right.len())) };
if order == 0 {
left.len().cmp(&right.len())
Expand Down Expand Up @@ -6164,6 +6244,10 @@ impl SliceContains for u8 {
impl SliceContains for i8 {
fn slice_contains(&self, x: &[Self]) -> bool {
let byte = *self as u8;
// SAFETY: `i8` and `u8` have the same memory layout, thus casting `x.as_ptr()`
// as `*const u8` is safe. The `x.as_ptr()` comes from a reference and is thus guaranteed
// to be valid for reads for the length of the slice `x.len()`, which cannot be larger
// than `isize::MAX`. The returned slice is never mutated.
let bytes: &[u8] = unsafe { from_raw_parts(x.as_ptr() as *const u8, x.len()) };
memchr::memchr(byte, bytes).is_some()
}
Expand Down