Skip to content

Introduce String::into_chars #50845

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
52 changes: 52 additions & 0 deletions src/liballoc/string.rs
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,7 @@
use core::char::{decode_utf16, REPLACEMENT_CHARACTER};
use core::fmt;
use core::hash;
use core::mem;
use core::iter::{FromIterator, FusedIterator};
use core::ops::Bound::{Excluded, Included, Unbounded};
use core::ops::{self, Add, AddAssign, Index, IndexMut, RangeBounds};
Expand Down Expand Up @@ -1452,6 +1453,20 @@ impl String {
self.vec.clear()
}

/// Converts the `String` into an iterator of `char`s.
///
/// # Examples
///
/// ```
/// let s = String::from("easy as α β γ");
/// assert_eq!(s.into_chars().count(), 13);
/// ```
#[unstable(feature = "into_chars", reason = "new API", issue="0")]
pub fn into_chars(self) -> IntoChars {
let chars = unsafe { mem::transmute(self.chars()) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lying about lifetimes is not nice and there were some soundness issues related to that in the past. For example this can very silently become a read of freed memory if _s is freed and drop for chars happens to do something with the data it references.

If we’re doing this, can we do this properly, for example, by adding a new CommonChars into core that’s generic over the iterator and making the original Chars a type Chars = CommonChars<SliceIter> and the string’s type Chars = core::CommonChars<vec::IntoIter<u8>> or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example this can very silently become a read of freed memory if _s is freed and drop for chars happens to do something with the data it references.

Yeah, that's why I commented that Chars doesn't implement Drop 😜

adding a new CommonChars into core that’s generic over the iterator

Certainly! I'm very much in favor of removing the unsafe here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making the original Chars a type Chars =

I've done this, but it exposes the CommonChars in the documentation; is that acceptable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CommonChars should be private, or at least doc-hidden and perma-unstable. I think it follows that Chars should remain a struct with private fields, even if that involves adding trival forwarding impls.

(But regardless I remain skeptical that String::into_chars should exist in the first place, see other comment.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I figured we'd want to delegate all the parts, I just wanted to make sure others agreed before I put in that legwork.

IntoChars { _s: self, chars }
}

/// Creates a draining iterator that removes the specified range in the string
/// and yields the removed chars.
///
Expand Down Expand Up @@ -2360,3 +2375,40 @@ impl<'a> DoubleEndedIterator for Drain<'a> {

#[stable(feature = "fused", since = "1.26.0")]
impl<'a> FusedIterator for Drain<'a> {}

// This struct is safe because the `String` is heap-allocated (stable
// address) and will never be modified (stable address). `chars` will
// not outlive the struct, so lying about the lifetime should be
// fine. `Chars` does not have a destructor.
#[unstable(feature = "into_chars", reason = "new API", issue="0")]
pub struct IntoChars {
_s: String,
chars: Chars<'static>,
}

#[unstable(feature = "into_chars", reason = "new API", issue="0")]
impl fmt::Debug for IntoChars {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
self.chars.fmt(f)
}
}
#[unstable(feature = "into_chars", reason = "new API", issue="0")]
impl Iterator for IntoChars {
type Item = char;

#[inline]
fn next(&mut self) -> Option<Self::Item> {
self.chars.next()
}
}

#[unstable(feature = "into_chars", reason = "new API", issue="0")]
impl DoubleEndedIterator for IntoChars {
#[inline]
fn next_back(&mut self) -> Option<Self::Item> {
self.chars.next_back()
}
}

#[unstable(feature = "into_chars", reason = "new API", issue="0")]
impl FusedIterator for IntoChars {}