-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make ptr range iterable #91000
Make ptr range iterable #91000
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add tests for start > end
(check is_empty()
is handling these cases) and the wrapping behavior (the case mentioned in #89900)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks quite branchy, do for in iter
and iter.for_each
loops produce reasonable assembly?
Still looking in to this. Would you consider this a blocker for this API if the assembly isn't the best? |
No, if this is primarily used for FFI then a few extra instructions won't be the bottleneck compared to the non-inlined calls. |
Hmm it turns out quite good actually (disclaimer: I am not an expert at this stuff). All the branching has been compiled down to conditional moves, aside from the backedge which is expected. Here is what I tried: extern "C" {
fn foreign(ptr: *const i32);
}
pub fn forloop(start: *const i32, end: *const i32) {
for ptr in start..end {
unsafe { foreign(ptr) }
}
}
pub fn for_each(start: *const i32, end: *const i32) {
(start..end).for_each(|ptr| unsafe { foreign(ptr) });
} and according to asm::forloop:
push r15
push r14
push rbx
cmp rdi, rsi
jae .LBB0_3
mov r15, rsi
mov r14, qword, ptr, [rip, +, foreign@GOTPCREL]
.LBB0_2:
lea rax, [rdi, +, 4]
cmp rax, r15
mov rbx, r15
cmovb rbx, rax
cmp rdi, rax
cmova rbx, r15
call r14
mov rdi, rbx
cmp rbx, r15
jb .LBB0_2
.LBB0_3:
pop rbx
pop r14
pop r15
ret |
Here is a comparison against the naive loop. In this case the cmov is making it so that this doesn't become an infinite loop or UB when
|
I don't know what the wisdom for current CPUs is. My (possibly outdated) understanding is that inside loop bodies branch instructions should be better as long as they're highly predictable because it shortens the effective instruction length of the loop because branches can be speculated away but cmovs can't.
So that's three
Yeah that's expected, the type sizes are constant and often a power of two so it turns them into shifts. |
Test code: extern "C" { fn foreign(ptr: *const i32); } pub fn forloop(start: *const i32, end: *const i32) { for ptr in start..end { unsafe { foreign(ptr) } } } Before: asm::forloop: push r15 push r14 push rbx cmp rdi, rsi jae .LBB0_3 mov r15, rsi mov r14, qword, ptr, [rip, +, foreign@GOTPCREL] .LBB0_2: lea rax, [rdi, +, 4] cmp rax, r15 mov rbx, r15 cmovb rbx, rax cmp rdi, rax cmova rbx, r15 call r14 mov rdi, rbx cmp rbx, r15 jb .LBB0_2 .LBB0_3: pop rbx pop r14 pop r15 ret After: asm::forloop: push r15 push r14 push rbx cmp rdi, rsi jae .LBB0_3 mov r15, rsi mov r14, qword, ptr, [rip, +, foreign@GOTPCREL] .LBB0_2: mov rax, r15 sub rax, rdi lea rbx, [rdi, +, 4] cmp rax, 4 cmovb rbx, r15 call r14 mov rdi, rbx cmp rbx, r15 jb .LBB0_2 .LBB0_3: pop rbx pop r14 pop r15 ret
Okay I fixed it to merge two of them.
|
Maybe throw a If not that's probably as good as it is gonna get. |
I get the same asm with the following diff. - self.start = if byte_offset >= mem::size_of::<T>() {
+ self.start = if intrinsics::likely(byte_offset >= mem::size_of::<T>()) { |
The code review isn't done yet but gonna check whether anyone else would be on board with having this impl. @rust-lang/libs-api — please see the PR description. @rfcbot fcp merge |
Team member @dtolnay has proposed to merge this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Yeah, It's worth noting that reverse iteration in C/C++ uses the alignment of the end pointer rather than the begin pointer. I can't believe I'm actually arguing this, but I think this is the more correct behavior for pointer range iteration. That is, It's also worth considering if we don't also want |
It seems very unfortunate that this has to use To me, I'd much rather see some sort of And in addition to optimizing better and avoiding the That type could also require that the pointer range be canonical (begin <= end), like C++ does, to further simplify what it needs to handle. |
The creation of We already have This cleanly sidesteps the issue of whether The one limitation is that it requires converting from So that'd be roughly: // mod ptr;
/// # Safety
/// - that of pointer::offset
/// - pointers into same allocated object
/// - distance is a multiple of size_of::<T>() bytes
unsafe fn slice_from_ptr_range_mut<T>(range: Range<*mut T>) -> *mut [T] {
slice_from_raw_parts_mut(range.start, range.start.offset(range.end))
}
fn slice_from_wrapping_ptr_range_mut(range: Range<*mut T> -> *mut [T] {
slice_from_raw_parts_mut(range.start, range.start.wrapping_offset(range.end))
}
impl<T> *mut [T] {
/// # Safety
/// Valid pointer & length, no assertions of memory content's validity
/// Key is that this can use inbounds pointer offsets, and
/// slice references can use this and just `&mut*` the pointer
unsafe fn iter(self) -> slice::RawIter {
// it's slice::Iter but yielding raw pointers
}
} |
Agreed. I only mention it here because the OP's example has a mention of So I think I was meaning something like
With the same Plus there might as well be some safe helpers like impl<T> [T] {
fn raw_iter(&self) -> RawIter<T>;
fn raw_iter_mut(&mut self) -> RawIterMut<T>;
} Since those are definitely in-bounds and canonical and such, so that you can easily get the pointers to deal in if you want to use them for implementing a slice method. |
It sounds like the two concerns so far are:
Regarding the machine code: I think my preference would be to have Regarding reverse iteration: my inclination would be to simply delete the DoubleEndedIterator impl at this point. I believe that reverse iterating this kind of range will be a highly uncommon use case, and having non-multiple-of-size separation between the start and end is going to be even more rare on top of that, so we simply don't need to figure out an answer. |
I think the separate iterator type decreases subtleties. From the
which means that any code that wants to read the pointers that come out from this thing need to meet the requirements of Thus having a separate I think this is particularly true under the "systems interacting with C++" rationale. If one uses this in a And as a bonus, the new type would be an obvious place to put a |
Correct.
For "pointers pointing to different allocations", I fully agree. |
@scottmcm's comment #91000 (comment) is compelling — previously I didn't fully appreciate the argument about having a clear place to put the proof obligation for the eventual dereferenceability of the computed pointers. An unsafe constructor where the caller is providing I've put up some new iterator types with an unsafe constructor in #91390. |
Since this has moved to a new PR should we close this one / end the FCP? |
I didn't want to immediately kill discussion here because this PR is still relevant if there turns out to be an intransitive preference: if we prefer #91390 over this PR, but we prefer doing nothing over #91390 (i.e. it isn't worth adding), but I prefer this PR over doing nothing. We can reopen if this turns out preferred after looking at #91390. @rfcbot fcp cancel |
@dtolnay proposal cancelled. |
This PR adds:
enabling iteration over a range of pointers, such as produced by
<[T]>::as_ptr_range
and<[T]>::as_mut_ptr_range
.Rationale
It is common for systems interacting with C++ to need to use pointer ranges for iteration, as
begin
/end
pointer pairs are a design pattern in the C++ standard library.In pure Rust code such a thing would be done using slices, but in FFI it's not necessarily good to try to turn the ptr pair into a Rust slice at the language boundary — materializing a reference (to a slice, or even to individual elements) is more prone to UB in Rust than directly working with the foreign pointers as pointers. One basically has to be an expert in Rust aliasing rules, which are not authoritatively written down at this point yet, in order to do it correctly. Whereas someone with basic C++ knowledge can write the pointer-based code and have it be correct.
Compare:
Step
Note that the new impls are independent of the previously existing Iterator impl on Range:
rust/library/core/src/iter/range.rs
Lines 706 to 707 in 003d8d3
That's because the current Step definition has several incompatibilities with the situation of
start
/end
which are not separated by a multiple ofsize_of::<T>()
bytes. For example consider this required invariant on Step::steps_between:rust/library/core/src/iter/range.rs
Lines 30 to 34 in 003d8d3
This says that for arbitrary pointers
a
andb
, which haven
range elements in between them, we need thatb
be losslessly reconstructible froma
andn
. That's not the case for pointers, where e.g.0x2 as *const u16..0x7 as *const u16
and0x2 as *const u16..0x8 as *const u16
contain the same sequence of items when iterated, but differentb
.Step
is unstable though (#42168). If it were to evolve in a way that accommodatesRange<*const T>
andRange<*mut T>
then we can switch all this over toStep
backward compatibly. I don't consider this a goal however.r? @kennytm