-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop re-implementing slice iterators in vec::IntoIter
#124421
base: master
Are you sure you want to change the base?
Conversation
|
||
/// Iterator over all the `NonNull<T>` pointers to the elements of a slice. | ||
#[must_use = "iterators are lazy and do nothing unless consumed"] | ||
pub struct NonNullIter<T> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment has been minimized.
This comment has been minimized.
6a434c9
to
0178215
Compare
This comment has been minimized.
This comment has been minimized.
16a13b6
to
01617bb
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
a396abc
to
dc7e707
Compare
// FIXME: for some reason, just `self.drain.__iterator_get_unchecked(i)` | ||
// never worked for me. If you know a way to fix that, please do. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Due to the Copy
bound? try_get_unchecked
would probably fix that... which is not public.
But reimplementing the pointer logic seems simple enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried impl TRANC for IntoIter where DrainRaw: TRANC
which I thought would have made it work, and saved the extra NonDrop
copy, but it didn't :(
I agree it's no big deal since the logic is simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd make sense to run the microbenchmarks we have for alloc and core here. They're quite noisy so you'll have to quiet your system (disabling turbo boost can help) but they cover more than the codegen tests.
And could DrainRaw
help with vec::Drain
and vec::Splice
too?
library/core/src/slice/drain.rs
Outdated
// SAFETY: If they're accessing things in random order we don't know when to drop | ||
// things, so only allow this for `Copy` things where that doesn't matter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not the random order that's causing it, it's that the accesses don't update the pointers, which the TRA contract does not require. Like the NonDrop
already says.
if len >= N { | ||
// SAFETY: If we have more elements than were requested, they can be | ||
// read directly because arrays need no extra alignment. | ||
Ok(unsafe { to_copy.cast::<[T; N]>().read() }) | ||
} else { | ||
let mut raw_ary = MaybeUninit::uninit_array(); | ||
// SAFETY: If we don't have enough elements left, then copy all the | ||
// ones we do have into the local array, which cannot overlap because | ||
// new locals are always distinct storage. | ||
Err(unsafe { | ||
MaybeUninit::<T>::slice_as_mut_ptr(&mut raw_ary) | ||
.copy_from_nonoverlapping(to_copy.as_mut_ptr(), len); | ||
array::IntoIter::new_unchecked(raw_ary, 0..len) | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you checked the codegen of this? When I wrote the vec::IntoIter one I found that copying into the same uninit array in both arms helped a little. But it's possible that it's no longer necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like both are codegen'ing way better than I remember, actually. I think the new dead_on_unwind
flag on the sret
is helping a ton for things like this. (It's in https://releases.llvm.org/18.1.0/docs/LangRef.html but not in https://releases.llvm.org/17.0.1/docs/LangRef.html, so new in nightly as of #120055 in Feb.)
I'll add a codegen test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, this PR produces the following:
; Function Attrs: mustprogress nofree norecurse nosync nounwind willreturn uwtable
define void @vec_iter_next_chunk_long(ptr dead_on_unwind noalias nocapture noundef writable writeonly sret([152 x i8]) align 8 dereferenceable(152) %_0, ptr noalias nocapture noundef align 8 dereferenceable(32) %it) unnamed_addr #3 personality ptr @__CxxFrameHandler3 {
start:
%_2 = getelementptr inbounds i8, ptr %it, i64 8
tail call void @llvm.experimental.noalias.scope.decl(metadata !54)
tail call void @llvm.experimental.noalias.scope.decl(metadata !57)
%self1.i.i = getelementptr inbounds i8, ptr %it, i64 16
%end.i.i = load ptr, ptr %self1.i.i, align 8, !alias.scope !59, !noalias !54, !nonnull !5, !noundef !5
%subtracted.i.i = load ptr, ptr %_2, align 8, !alias.scope !57, !noalias !54, !nonnull !5, !noundef !5
%0 = ptrtoint ptr %end.i.i to i64
%1 = ptrtoint ptr %subtracted.i.i to i64
%2 = sub nuw i64 %0, %1
%_0.sroa.0.0.sroa.speculated.i.i = tail call noundef i64 @llvm.umin.i64(i64 %2, i64 123)
%_13.i.i = getelementptr inbounds i8, ptr %subtracted.i.i, i64 %_0.sroa.0.0.sroa.speculated.i.i
store ptr %_13.i.i, ptr %_2, align 8, !alias.scope !62, !noalias !54
%_7.i = icmp ugt i64 %2, 122
br i1 %_7.i, label %bb2.i, label %bb3.i
bb3.i: ; preds = %start
%_11.sroa.5.0..sroa_idx.i = getelementptr inbounds i8, ptr %_0, i64 24
tail call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 8 %_11.sroa.5.0..sroa_idx.i, ptr nonnull align 1 %subtracted.i.i, i64 %2, i1 false), !noalias !57
%3 = getelementptr inbounds i8, ptr %_0, i64 8
store i64 0, ptr %3, align 8, !alias.scope !54, !noalias !57
%_11.sroa.4.0..sroa_idx.i = getelementptr inbounds i8, ptr %_0, i64 16
store i64 %2, ptr %_11.sroa.4.0..sroa_idx.i, align 8, !alias.scope !54, !noalias !57
br label %"_ZN96_$LT$core..slice..drain..DrainRaw$LT$T$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$10next_chunk17h4db823eaa1311d9bE.exit"
bb2.i: ; preds = %start
%4 = getelementptr inbounds i8, ptr %_0, i64 1
tail call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 1 dereferenceable(123) %4, ptr noundef nonnull align 1 dereferenceable(123) %subtracted.i.i, i64 123, i1 false), !noalias !57
br label %"_ZN96_$LT$core..slice..drain..DrainRaw$LT$T$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$10next_chunk17h4db823eaa1311d9bE.exit"
"_ZN96_$LT$core..slice..drain..DrainRaw$LT$T$GT$$u20$as$u20$core..iter..traits..iterator..Iterator$GT$10next_chunk17h4db823eaa1311d9bE.exit": ; preds = %bb3.i, %bb2.i
%.sink.i = phi i8 [ 0, %bb2.i ], [ 1, %bb3.i ]
store i8 %.sink.i, ptr %_0, align 8, !alias.scope !54, !noalias !57
ret void
}
which is nearly identical to https://rust.godbolt.org/z/zEddbzcYG.
The difference is that now LLVM has decided to hoist the "move the start pointer forward" into the header instead of having each arm do it. Seems at least plausibly better -- a cmov for something we already have a data dependency on anyway is probably better than each branch having a different store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, no, maybe it's way newer than I'd realized: #121298
#[inline] | ||
pub(crate) fn make_shortlived_slice<'b>(&'b self) -> &'b [T] { | ||
// SAFETY: Everything expanded with this macro is readable while | ||
// the iterator exists and is unchanged, so we can force a shorter | ||
// lifetime than `'a` to make this safe to call. | ||
unsafe { self.make_nonnull_slice().as_ref() } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get why the explicit lifetime here is necessary. &self
borrows usually already are shorter than Self<'a>
lifetimes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely not necessary. I just wanted to be explicit about it when as_ref
returns an unbounded lifetime and there's an 'a
in scope from the impl. I've updated the comment to hopefully communicate better.
dc7e707
to
dc17e00
Compare
Rather than encoding things as method calls or token expansions. This isn't directly useful for Iter/IterMut, but sets this up for the next commit that adds another invocation of the macro.
ffaafa4
to
f4fbb48
Compare
Good idea. Looking into that a bit I end up at // drop_ptr comes from a slice::Iter which only gives us a &[T] but for drop_in_place
// a pointer with mutable provenance is necessary. which looks highly promising that this will help simplify things. |
f4fbb48
to
c52540b
Compare
☔ The latest upstream changes (presumably #123878) made this pull request unmergeable. Please resolve the merge conflicts. |
Add `slice::DrainRaw` for internal use I'd like to separate out this hopefully-non-controversial part of rust-lang#124421 to have fewer commits to juggle there :) r? `@the8472`
Today,
vec::IntoIter
has essentially an inlined copy of an out-of-date version ofslice::Iter
, including all the distinctT::IS_ZST
checks and everything. That's because it can't actually useslice::Iter<'a, T>
due to the lifetime and the references.But it's silly that it needs to reimplement all that logic, and that we thus end up needing things like #114205 to redo various optimizations on it after they're done on slice iterators.
This PR addresses that as follows:
slice::NonNullIter<T>
implemented using the sameiterator!
macro that implementsIterator
forIter
andIterMut
, so that it'll have all the optimizations without needing to duplicate code, but without the problematic lifetime and SB/TB implications of references.slice::DrainRaw<T>
that moves/drops the items in the iterated range.array::Drain<'a, T>
andvec::IntoIter<T>
usingDrainRaw
, rather than them both needing to have the "read the pointer on next, drop_in_place on drop" logic.Thus
vec::IntoIter
's implementation gets a bunch shorter since it no longer has to worry about things like the best way to iterate a slice of ZSTs. It can just focus on ensuring that the deallocating happens in the correct place & order.(This intentionally leaves
slice::Iter
andslice::ItemMut
almost entirely alone, as the details there are very hot.)