-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add slice methods for indexing via an array of indices. #83608
Conversation
r? @sfackler (rust-highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
To reduce the API surface is it a good idea to offer only the mut versions? Getting multiple immutable references is already possible in simpler ways. And regarding the Presorted, in the D language stdlib, the sort functions return a SortedRange (https://dlang.org/phobos/std_algorithm_sorting.html#sort ), it's a type that can be converted to a regular slice, but it can also be used as is. A type that remembers the sortdness of a slice is handy, it performs a safer binary search (the compiler forbids you from performing a binary search on an unsorted range by mistake. You have assumeSorted function (https://dlang.org/phobos/std_range.html#.SortedRangeOptions ), but in debug mode it performs a statistically cheap test that the range is actually sorted) and you can use it in other cases like indexing for your functions. |
See also #78541 where I asked this for just two indices... that's my most common case. |
The std lib usually provides variants for both mutable and immutable slices the same way, even if the immutable one can be expressed differently already. See |
I give some related convenience methods around |
☔ The latest upstream changes (presumably #83530) made this pull request unmergeable. Please resolve the merge conflicts. |
Yeah, but increasing API surface has a cognitive and other kind of costs. So if a function isn't that useful, it could be better to break the symmetry. |
An immutable variant often helps whenever anyone writes another method with both mutable and immutable variants, because then their two methods read identically. Our eyes run What code replaces
or
|
On Nightly we can write shorter code now :-) #![feature(array_map)]
let [a, b, c] = [5, 7, 13].map(|idx| slice[idx]); |
Nice. :) Appears the
|
Shouldn't this panic if the indices are unsorted? Seems like a pitfall that it silently returns None (which is also used to indicate the indices being out of bounds). Also since the common case will be for these arrays to be relatively small (I would imagine 2 is the most common case), I wonder if it's even worth requiring the indices to be sorted? Or maybe there could be two separate methods? I'm thinking about other collections like HashMap where we will presumably want this operation as well, eventually. Inconsistent preconditions between HashMap and slice could easily lead to confusion/bugs. |
r? @m-ou-se |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making this!
Also since the common case will be for these arrays to be relatively small (I would imagine 2 is the most common case), I wonder if it's even worth requiring the indices to be sorted?
While discussing this in the libs team meeting recently, we were asking ourselves the same question. We imagine this will mostly be used with small values of N (e.g. 2 or 3), in which case checking for duplicates without requiring it to be sorted would be fine. (Or maybe it could first check if it's sorted to have a fast common case, but fall back to the O(N²) check otherwise?) (And since N will be low in most cases, maybe multiple
might be a better name than many
. But we can leave the bikeshedding for later. ^^)
That would solve the question of what to do for unsorted indices (panic vs returning None) by making the question irrelevant.
It'd also match better with the unchecked version. Otherwise we'd have cases where the unchecked version can safely be used, (e.g. [1, 0]
), but the checked version would panic or return None, which can be confusing.
And for that same reason, I think it'd be best to weaken the requirements on the non-mut version as much as possible too.
What do you think?
A also have a few comments on the implementation:
library/core/src/slice/mod.rs
Outdated
let mut arr: mem::MaybeUninit<[&T; N]> = mem::MaybeUninit::uninit(); | ||
let arr_ptr = arr.as_mut_ptr(); | ||
|
||
// SAFETY: We expect `indices` to contain disjunct values that are | ||
// in bounds of `self`. | ||
unsafe { | ||
for i in 0..N { | ||
let idx = *indices.get_unchecked(i); | ||
*(*arr_ptr).get_unchecked_mut(i) = &*slice.get_unchecked(idx); | ||
} | ||
arr.assume_init() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use array::map
to simplify this:
let mut arr: mem::MaybeUninit<[&T; N]> = mem::MaybeUninit::uninit(); | |
let arr_ptr = arr.as_mut_ptr(); | |
// SAFETY: We expect `indices` to contain disjunct values that are | |
// in bounds of `self`. | |
unsafe { | |
for i in 0..N { | |
let idx = *indices.get_unchecked(i); | |
*(*arr_ptr).get_unchecked_mut(i) = &*slice.get_unchecked(idx); | |
} | |
arr.assume_init() | |
} | |
indices.map(|i| unsafe { &*slice.get_unchecked(i) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_unchecked()
returns &T
, so the &*
would be unneeded. Apart from that - I started out with the .map
function, but it generated worse code when I tried it. Believe me, I did not arrive at this complicated piece of unsafe code as a first choice. 😄
let mut arr: mem::MaybeUninit<[&mut T; N]> = mem::MaybeUninit::uninit(); | ||
let arr_ptr = arr.as_mut_ptr(); | ||
|
||
// SAFETY: We expect `indices` to contain disjunct values that are | ||
// in bounds of `self`. | ||
unsafe { | ||
for i in 0..N { | ||
let idx = *indices.get_unchecked(i); | ||
*(*arr_ptr).get_unchecked_mut(i) = &mut *slice.get_unchecked_mut(idx); | ||
} | ||
arr.assume_init() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let mut arr: mem::MaybeUninit<[&mut T; N]> = mem::MaybeUninit::uninit(); | |
let arr_ptr = arr.as_mut_ptr(); | |
// SAFETY: We expect `indices` to contain disjunct values that are | |
// in bounds of `self`. | |
unsafe { | |
for i in 0..N { | |
let idx = *indices.get_unchecked(i); | |
*(*arr_ptr).get_unchecked_mut(i) = &mut *slice.get_unchecked_mut(idx); | |
} | |
arr.assume_init() | |
} | |
indices.map(|i| unsafe { &mut *slice.get_unchecked_mut(i) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_unchecked_mut()
returns&mut T
, so the&mut *
would not be needed.- This would not compile due to the static borrow checker for mutable borrows
i.map(|i| unsafe { &mut *(s.get_unchecked_mut(i) as *mut _) })
to make it raw pointers does not passmiri
checks.
☔ The latest upstream changes (presumably #91970) made this pull request unmergeable. Please resolve the merge conflicts. |
@Ten0 It was never abandoned from my side. Rather, the idea was that once we have this merged, we could have a follow-up PR for replacing the hardcoded types with generics that do what you propose. Of course, I did not expect this PR to remain open so long... |
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
|
@rustbot label +T-libs-api -T-libs |
☔ The latest upstream changes (presumably #100809) made this pull request unmergeable. Please resolve the merge conflicts. |
Triage: |
38d91fc
to
95e46a4
Compare
95e46a4
to
3fe37b8
Compare
Filed a tracking issue (#104642), rebased, and added the link to it from the feature gates here. I think this can go in as-is, but I noted the unresolved questions on the tracking issue in terms of refactoring/expanding the API to fit more use cases. Normally we'd push back for an ACP but this has received attention from a number of libs-api meetings (albeit a while back) and the desire for it seems clear, even if we wish to iterate on the precise API, which can happen in future PRs. @bors r+ |
…earth Rollup of 7 pull requests Successful merges: - rust-lang#83608 (Add slice methods for indexing via an array of indices.) - rust-lang#95583 (Deprecate the unstable `ptr_to_from_bits` feature) - rust-lang#101655 (Make the Box one-liner more descriptive) - rust-lang#102207 (Constify remaining `Layout` methods) - rust-lang#103193 (mark sys_common::once::generic::Once::new const-stable) - rust-lang#104622 (Use clang for the UEFI targets) - rust-lang#104638 (Move macro_rules diagnostics to diagnostics module) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
# Objective - std's new APIs do the same thing as `Query::get_multiple_mut`, but are called `get_many`: rust-lang/rust#83608 ## Solution - Find and replace `get_multiple` with `get_many`
Disclaimer: It's been a while since I contributed to the main Rust repo, apologies in advance if this is large enough already that it should've been an RFC.
Update:
&[T]
variant of this API, and removed the requirements for the indices to be sorted.Description
This adds the following slice methods to
core
:This allows creating multiple mutable references to disjunct positions in a slice, which previously required writing some awkward code with
split_at_mut()
oriter_mut()
. For the bound-checked variant, the indices are checked against each other and against the bounds of the slice, which requiresN * (N + 1) / 2
comparison operations.This has a proof-of-concept standalone implementation here: https://crates.io/crates/index_many
Care has been taken that the implementation passes miri borrow checks, and generates straight-forward assembly (though this was only checked on x86_64).
Example
Codegen Examples
Click to expand!
Disclaimer: Taken from local tests with the standalone implementation.
Unchecked Indexing:
Checked Indexing (Option):
Checked Indexing (Panic):
Extensions
There are multiple optional extensions to this.
Indexing With Ranges
This could easily be expanded to allow indexing with
[I; N]
whereI: SliceIndex<Self>
. I wanted to keep the initial implementation simple, so I didn't include it yet.Panicking Variant
We could also add this method:
This would work similar to the regular index operator and panic with out-of-bound indices. The advantage would be that we could more easily ensure good codegen with a useful panic message, which is non-trivial with the
Option
variant.This is implemented in the standalone implementation, and used as basis for the codegen examples here and there.