-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement Vec::splice and String::splice #32355
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1228,7 +1228,7 @@ impl String { | |
self.vec.clear() | ||
} | ||
|
||
/// Create a draining iterator that removes the specified range in the string | ||
/// Creates a draining iterator that removes the specified range in the string | ||
/// and yields the removed chars. | ||
/// | ||
/// Note: The element range is removed even if the iterator is not | ||
|
@@ -1286,6 +1286,63 @@ impl String { | |
} | ||
} | ||
|
||
/// Creates a splicing iterator that removes the specified range in the string, | ||
/// replaces with the given string, and yields the removed chars. | ||
/// The given string doesn’t need to be the same length as the range. | ||
/// | ||
/// Note: The element range is removed even if the iterator is not | ||
/// consumed until the end. | ||
/// | ||
/// # Panics | ||
/// | ||
/// Panics if the starting point or end point do not lie on a [`char`] | ||
/// boundary, or if they're out of bounds. | ||
/// | ||
/// [`char`]: ../../std/primitive.char.html | ||
/// | ||
/// # Examples | ||
/// | ||
/// Basic usage: | ||
/// | ||
/// ``` | ||
/// #![feature(splice)] | ||
/// let mut s = String::from("α is alpha, β is beta"); | ||
/// let beta_offset = s.find('β').unwrap_or(s.len()); | ||
/// | ||
/// // Replace the range up until the β from the string | ||
/// let t: String = s.splice(..beta_offset, "Α is capital alpha; ").collect(); | ||
/// assert_eq!(t, "α is alpha, "); | ||
/// assert_eq!(s, "Α is capital alpha; β is beta"); | ||
/// ``` | ||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
pub fn splice<'a, 'b, R>(&'a mut self, range: R, replace_with: &'b str) -> Splice<'a, 'b> | ||
where R: RangeArgument<usize> | ||
{ | ||
// Memory safety | ||
// | ||
// The String version of Splice does not have the memory safety issues | ||
// of the vector version. The data is just plain bytes. | ||
// Because the range removal happens in Drop, if the Splice iterator is leaked, | ||
// the removal will not happen. | ||
let len = self.len(); | ||
let start = *range.start().unwrap_or(&0); | ||
let end = *range.end().unwrap_or(&len); | ||
|
||
// Take out two simultaneous borrows. The &mut String won't be accessed | ||
// until iteration is over, in Drop. | ||
let self_ptr = self as *mut _; | ||
// slicing does the appropriate bounds checks | ||
let chars_iter = self[start..end].chars(); | ||
|
||
Splice { | ||
start: start, | ||
end: end, | ||
iter: chars_iter, | ||
string: self_ptr, | ||
replace_with: replace_with | ||
} | ||
} | ||
|
||
/// Converts this `String` into a `Box<str>`. | ||
/// | ||
/// This will drop any excess capacity. | ||
|
@@ -1884,3 +1941,74 @@ impl<'a> DoubleEndedIterator for Drain<'a> { | |
self.iter.next_back() | ||
} | ||
} | ||
|
||
/// A splicing iterator for `String`. | ||
/// | ||
/// This struct is created by the [`splice()`] method on [`String`]. See its | ||
/// documentation for more. | ||
/// | ||
/// [`splice()`]: struct.String.html#method.splice | ||
/// [`String`]: struct.String.html | ||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
pub struct Splice<'a, 'b> { | ||
/// Will be used as &'a mut String in the destructor | ||
string: *mut String, | ||
/// Start of part to remove | ||
start: usize, | ||
/// End of part to remove | ||
end: usize, | ||
/// Current remaining range to remove | ||
iter: Chars<'a>, | ||
replace_with: &'b str, | ||
} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
unsafe impl<'a, 'b> Sync for Splice<'a, 'b> {} | ||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
unsafe impl<'a, 'b> Send for Splice<'a, 'b> {} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, 'b> Drop for Splice<'a, 'b> { | ||
fn drop(&mut self) { | ||
unsafe { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This seems pretty similar to the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That’s what I wanted to do at first, but I saw that
So I kept the same pattern.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hm ok, I just found this pretty gnarly to read and it seems like something we may not want to jump to specializing just yet There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Another detail is that But that iterator is not as useful as on |
||
let vec = (*self.string).as_mut_vec(); | ||
let range_len = self.end - self.start; | ||
let replacement_len = self.replace_with.len(); | ||
let tail_len = vec.len() - self.end; | ||
if replacement_len > range_len { | ||
vec.reserve(replacement_len - range_len); | ||
} | ||
if replacement_len != range_len { | ||
let src = vec.as_ptr().offset(self.end as isize); | ||
let dst = vec.as_mut_ptr().offset((self.start + replacement_len) as isize); | ||
ptr::copy(src, dst, tail_len); | ||
} | ||
let src = self.replace_with.as_ptr(); | ||
let dst = vec.as_mut_ptr().offset(self.start as isize); | ||
ptr::copy(src, dst, replacement_len); | ||
vec.set_len(self.start + replacement_len + tail_len); | ||
} | ||
} | ||
} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, 'b> Iterator for Splice<'a, 'b> { | ||
type Item = char; | ||
|
||
#[inline] | ||
fn next(&mut self) -> Option<char> { | ||
self.iter.next() | ||
} | ||
|
||
fn size_hint(&self) -> (usize, Option<usize>) { | ||
self.iter.size_hint() | ||
} | ||
} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, 'b> DoubleEndedIterator for Splice<'a, 'b> { | ||
#[inline] | ||
fn next_back(&mut self) -> Option<char> { | ||
self.iter.next_back() | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -764,7 +764,7 @@ impl<T> Vec<T> { | |
} | ||
} | ||
|
||
/// Create a draining iterator that removes the specified range in the vector | ||
/// Creates a draining iterator that removes the specified range in the vector | ||
/// and yields the removed items. | ||
/// | ||
/// Note 1: The element range is removed even if the iterator is not | ||
|
@@ -826,6 +826,53 @@ impl<T> Vec<T> { | |
} | ||
} | ||
|
||
/// Creates a splicing iterator that replaces the specified range in the vector | ||
/// with the given `replace_with` iterator and yields the removed items. | ||
/// `replace_with` does not need to be the same length as `range`. | ||
/// | ||
/// Note 1: The element range is removed even if the iterator is not | ||
/// consumed until the end. | ||
/// | ||
/// Note 2: It is unspecified how many elements are removed from the vector, | ||
/// if the `Splice` value is leaked. | ||
/// | ||
/// Note 3: The input iterator `replace_with` is only consumed | ||
/// when the `Splice` value is dropped. | ||
/// | ||
/// Note 4: This is optimal if: | ||
/// | ||
/// * The tail (elements in the vector after `range`) is empty, | ||
/// * or `replace_with` yields fewer elements than `range`’s length | ||
/// * or the lower bound of its `size_hint()` is exact. | ||
/// | ||
/// Otherwise, a temporary vector is allocated and the tail is moved twice. | ||
/// | ||
/// # Panics | ||
/// | ||
/// Panics if the starting point is greater than the end point or if | ||
/// the end point is greater than the length of the vector. | ||
/// | ||
/// # Examples | ||
/// | ||
/// ``` | ||
/// #![feature(splice)] | ||
/// let mut v = vec![1, 2, 3]; | ||
/// let new = [7, 8]; | ||
/// let u: Vec<_> = v.splice(..2, new.iter().cloned()).collect(); | ||
/// assert_eq!(v, &[7, 8, 3]); | ||
/// assert_eq!(u, &[1, 2]); | ||
/// ``` | ||
#[inline] | ||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
pub fn splice<R, I>(&mut self, range: R, replace_with: I) -> Splice<I::IntoIter> | ||
where R: RangeArgument<usize>, I: IntoIterator<Item=T> | ||
{ | ||
Splice { | ||
drain: self.drain(range), | ||
replace_with: replace_with.into_iter(), | ||
} | ||
} | ||
|
||
/// Clears the vector, removing all values. | ||
/// | ||
/// # Examples | ||
|
@@ -1696,7 +1743,7 @@ impl<T> Drop for IntoIter<T> { | |
} | ||
} | ||
|
||
/// A draining iterator for `Vec<T>`. | ||
/// A draining iterator for `Vec<T>`. See the [`Vec::drain`](struct.Vec.html#method.drain) method. | ||
#[stable(feature = "drain", since = "1.6.0")] | ||
pub struct Drain<'a, T: 'a> { | ||
/// Index of tail to preserve | ||
|
@@ -1756,6 +1803,119 @@ impl<'a, T> Drop for Drain<'a, T> { | |
} | ||
} | ||
|
||
|
||
#[stable(feature = "rust1", since = "1.0.0")] | ||
impl<'a, T> ExactSizeIterator for Drain<'a, T> {} | ||
|
||
/// A splicing iterator for `Vec<T>`. See the [`Vec::splice`](struct.Vec.html#method.splice) method. | ||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
pub struct Splice<'a, I: Iterator + 'a> { | ||
drain: Drain<'a, I::Item>, | ||
replace_with: I, | ||
} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, I: Iterator> Iterator for Splice<'a, I> { | ||
type Item = I::Item; | ||
|
||
fn next(&mut self) -> Option<Self::Item> { | ||
self.drain.next() | ||
} | ||
|
||
fn size_hint(&self) -> (usize, Option<usize>) { | ||
self.drain.size_hint() | ||
} | ||
} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, I: Iterator> DoubleEndedIterator for Splice<'a, I> { | ||
fn next_back(&mut self) -> Option<Self::Item> { | ||
self.drain.next_back() | ||
} | ||
} | ||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, I: Iterator> ExactSizeIterator for Splice<'a, I> {} | ||
|
||
|
||
#[unstable(feature = "splice", reason = "recently added", issue = "32310")] | ||
impl<'a, I: Iterator> Drop for Splice<'a, I> { | ||
fn drop(&mut self) { | ||
// exhaust drain first | ||
while let Some(_) = self.drain.next() {} | ||
|
||
|
||
unsafe { | ||
if self.drain.tail_len == 0 { | ||
let vec = &mut *self.drain.vec; | ||
vec.extend(self.replace_with.by_ref()); | ||
return | ||
} | ||
|
||
// First fill the range left by drain(). | ||
if !self.drain.fill(&mut self.replace_with) { | ||
return | ||
} | ||
|
||
// There may be more elements. Use the lower bound as an estimate. | ||
// FIXME: Is the upper bound a better guess? Or something else? | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I’d like some other opinions here. Getting this guess wrong either way causes the tail to be moved twice.
Maybe over-estimating is better, and this should use the upper bound when it’s There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Maybe if the optimistic route is only used if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If |
||
let (lower_bound, _upper_bound) = self.replace_with.size_hint(); | ||
if lower_bound > 0 { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This condition There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Ok, I see. In that case we’d avoid moving "tail" elements twice, but we’d allocate a temporary |
||
self.drain.move_tail(lower_bound); | ||
if !self.drain.fill(&mut self.replace_with) { | ||
return | ||
} | ||
} | ||
|
||
// Collect any remaining elements. | ||
// This is a zero-length vector which does not allocate if `lower_bound` was exact. | ||
let mut collected = self.replace_with.by_ref().collect::<Vec<I::Item>>().into_iter(); | ||
// Now we have an exact count. | ||
if collected.len() > 0 { | ||
self.drain.move_tail(collected.len()); | ||
let filled = self.drain.fill(&mut collected); | ||
debug_assert!(filled); | ||
debug_assert_eq!(collected.len(), 0); | ||
} | ||
} | ||
// Let `Drain::drop` move the tail back if necessary and restore `vec.len`. | ||
} | ||
} | ||
|
||
/// Private helper methods for `Splice::drop` | ||
impl<'a, T> Drain<'a, T> { | ||
/// The range from `self.vec.len` to `self.tail_start` contains elements | ||
/// that have been moved out. | ||
/// Fill that range as much as possible with new elements from the `replace_with` iterator. | ||
/// Return whether we filled the entire range. (`replace_with.next()` didn’t return `None`.) | ||
unsafe fn fill<I: Iterator<Item=T>>(&mut self, replace_with: &mut I) -> bool { | ||
let vec = &mut *self.vec; | ||
let range_start = vec.len; | ||
let range_end = self.tail_start; | ||
let range_slice = slice::from_raw_parts_mut( | ||
vec.as_mut_ptr().offset(range_start as isize), | ||
range_end - range_start); | ||
|
||
for place in range_slice { | ||
if let Some(new_item) = replace_with.next() { | ||
ptr::write(place, new_item); | ||
vec.len += 1; | ||
} else { | ||
return false | ||
} | ||
} | ||
true | ||
} | ||
|
||
/// Make room for inserting more elements before the tail. | ||
unsafe fn move_tail(&mut self, extra_capacity: usize) { | ||
let vec = &mut *self.vec; | ||
let used_capacity = self.tail_start + self.tail_len; | ||
vec.buf.reserve(used_capacity, extra_capacity); | ||
|
||
let new_tail_start = self.tail_start + extra_capacity; | ||
let src = vec.as_ptr().offset(self.tail_start as isize); | ||
let dst = vec.as_mut_ptr().offset(new_tail_start as isize); | ||
ptr::copy(src, dst, self.tail_len); | ||
self.tail_start = new_tail_start; | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -438,6 +438,16 @@ fn test_drain_range() { | |
assert_eq!(v, &[(), ()]); | ||
} | ||
|
||
#[test] | ||
fn splice() { | ||
let mut v = vec![1, 2, 3, 4, 5]; | ||
let a = [10, 11, 12]; | ||
v.splice(2..4, a.iter().cloned()); | ||
assert_eq!(v, &[1, 2, 10, 11, 12, 5]); | ||
v.splice(1..3, Some(20)); | ||
assert_eq!(v, &[1, 20, 11, 12, 5]); | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These are some pretty small tests for a pretty meaty
etc There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I’ll add some later. |
||
|
||
#[test] | ||
fn test_into_boxed_slice() { | ||
let xs = vec![1, 2, 3]; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a
PhantomData
here with&'a mut String
as it's essentially whatstring
is a reference to?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can, but would it affect anything? If it’s necessary, isn’t it also necessary in
struct Drain
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm yes it might be, I believe it has implications on the variance of the lifetime parameters of the struct (e.g. invariant, not covariant or whichever it is)