Skip to content

Add [T]::as_ptr_range() and [T]::as_mut_ptr_range(). #65806

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 25, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
82 changes: 81 additions & 1 deletion src/libcore/slice/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ use crate::fmt;
use crate::intrinsics::{assume, exact_div, unchecked_sub, is_aligned_and_not_null};
use crate::isize;
use crate::iter::*;
use crate::ops::{FnMut, self};
use crate::ops::{FnMut, Range, self};
use crate::option::Option;
use crate::option::Option::{None, Some};
use crate::result::Result;
Expand Down Expand Up @@ -407,6 +407,86 @@ impl<T> [T] {
self as *mut [T] as *mut T
}

/// Returns the two raw pointers spanning the slice.
///
/// The returned range is half-open, which means that the end pointer
/// points *one past* the last element of the slice. This way, an empty
/// slice is represented by two equal pointers, and the difference between
/// the two pointers represents the size of the size.
///
/// See [`as_ptr`] for warnings on using these pointers. The end pointer
/// requires extra caution, as it does not point to a valid element in the
/// slice.
///
/// This function is useful for interacting with foreign interfaces which
/// use two pointers to refer to a range of elements in memory, as is
/// common in C++.
///
/// It can also be useful to check if a pointer to an element refers to an
/// element of this slice:
///
/// ```
/// #![feature(slice_ptr_range)]
///
/// let a = [1, 2, 3];
/// let x = &a[1] as *const _;
/// let y = &5 as *const _;
///
/// assert!(a.as_ptr_range().contains(&x));
/// assert!(!a.as_ptr_range().contains(&y));
/// ```
///
/// [`as_ptr`]: #method.as_ptr
#[unstable(feature = "slice_ptr_range", issue = "65807")]
#[inline]
pub fn as_ptr_range(&self) -> Range<*const T> {
// The `add` here is safe, because:
//
// - Both pointers are part of the same object, as pointing directly
// past the object also counts.
//
// - The size of the slice is never larger than isize::MAX bytes, as
// noted here:
// - https://github.com/rust-lang/unsafe-code-guidelines/issues/102#issuecomment-473340447
// - https://doc.rust-lang.org/reference/behavior-considered-undefined.html
// - https://doc.rust-lang.org/core/slice/fn.from_raw_parts.html#safety
// (This doesn't seem normative yet, but the very same assumption is
// made in many places, including the Index implementation of slices.)
//
// - There is no wrapping around involved, as slices do not wrap past
// the end of the address space.
//
// See the documentation of pointer::add.
let start = self.as_ptr();
let end = unsafe { start.add(self.len()) };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please provide a brief safety argument reasoning around the requirements .add states. (Notably, vec.as_ptr().add(vec.len()) is mentioned.) (Same below.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, now I'm starting to wonder if it actually is safe. A Vec is never larger than isize::MAX bytes, but that guarantee is not there for [T], is it?

Copy link
Member Author

@m-ou-se m-ou-se Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it to wrapping_add instead.

Copy link
Contributor

@CryZe CryZe Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmh, I'm wondering where this limitation with isize::MAX is coming from. The documentation doesn't seem to mention why that is, only that this is the case. Is this some LLVM specific problem or something? I don't think this is something Rust is inheriting from the C standard.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @rust-lang/lang @rust-lang/wg-unsafe-code-guidelines

Note that slices point to their entire range, so it is important that the length metadata is never too large. In particular, allocations and therefore slices cannot be bigger than isize::MAX bytes.

The total size of the slice must be no larger than isize::MAX bytes in memory. See the safety documentation of pointer::offset.

(Assuming these statements are normative.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first mentions allocations (does that mean heap allocations?),

I think it refers to any sort of allocation, including stack allocations (as your array example indicates).

and the second one seems specific to alloc.

That's a mistake on my part, it is defined in core as well.


So working on the assumption that this is unspecified / may become implementation-defined it seems to me that we have two choices:

  1. Use .wrapping_add(...) -- potentially a foot-gun if you use .contains tho not going to happen for real programs because of LLVM limitations and because of the sheer size of the allocation.

  2. Use assert!(...) as above. I'm not sure if LLVM would see the assertion and reason that it must always hold and so optimize it away.

We can probably switch between the two behaviors even after stabilizing because they are so pathological. However, it seems to me we should add a note to the documentation as well as the tracking issue and then the libs team can pick between one of them.


Changed it back to unsafe { add }, with a comment describing why it should be safe.

Haha; Seems we had the opposite take-aways.

Copy link
Member Author

@m-ou-se m-ou-se Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, the Index implementation of slices assumes the same thing: https://doc.rust-lang.org/nightly/src/core/slice/mod.rs.html#2678-2724

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to wade into the issue of what's normative, but ptr.offset and related methods with restrictions to isize::MAX bytes are used pervasively from practically all safe slice APIs (at least, anything that ever accesses or takes the address of a slice element). Therefore:

  • it is impossible in practice to have a safe slice (in the sense that you can do safe operations on it and rest assures they won't cause UB) that spans more than isize::MAX bytes
  • we would have to audit and change a whole bunch of code throughout the standard library (or remove the isize::MAX restriction from offset and friends) if we ever wanted to support larger slices

So I think the current implementation (unsafe { add } with comment) is perfectly fine.´

I would advise against wrapping_add because it can cause performance degradation in some situations and there's no reason to risk that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that seems like a good conclusion. Let's keep the PR as-is.

@m-ou-se Could you add a note to the second bullet re. this not being normative (at least not yet) as well as link to rust-lang/unsafe-code-guidelines#102 (comment). With that I think we should be good to merge the PR.

(Also let's keep this thread visible since it was an interesting conversation.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

start..end
}

/// Returns the two unsafe mutable pointers spanning the slice.
///
/// The returned range is half-open, which means that the end pointer
/// points *one past* the last element of the slice. This way, an empty
/// slice is represented by two equal pointers, and the difference between
/// the two pointers represents the size of the size.
///
/// See [`as_mut_ptr`] for warnings on using these pointers. The end
/// pointer requires extra caution, as it does not point to a valid element
/// in the slice.
///
/// This function is useful for interacting with foreign interfaces which
/// use two pointers to refer to a range of elements in memory, as is
/// common in C++.
///
/// [`as_mut_ptr`]: #method.as_mut_ptr
#[unstable(feature = "slice_ptr_range", issue = "65807")]
#[inline]
pub fn as_mut_ptr_range(&mut self) -> Range<*mut T> {
// See as_ptr_range() above for why `add` here is safe.
let start = self.as_mut_ptr();
let end = unsafe { start.add(self.len()) };
start..end
}

/// Swaps two elements in the slice.
///
/// # Arguments
Expand Down