Skip to content

Introduce String::into_chars #50845

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

shepmaster
Copy link
Member

This has come up a few times on Stack Overflow, so might as well see if it can be upstreamed. It's also nice when combined with Iterator::flat_map on an iterator of Strings.

@shepmaster shepmaster added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels May 17, 2018
@kennytm kennytm added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 17, 2018
@kennytm
Copy link
Member

kennytm commented May 17, 2018

rust-highfive is sleeping.

r? @SimonSapin

/// ```
#[unstable(feature = "into_chars", reason = "new API", issue="0")]
pub fn into_chars(self) -> IntoChars {
let chars = unsafe { mem::transmute(self.chars()) };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lying about lifetimes is not nice and there were some soundness issues related to that in the past. For example this can very silently become a read of freed memory if _s is freed and drop for chars happens to do something with the data it references.

If we’re doing this, can we do this properly, for example, by adding a new CommonChars into core that’s generic over the iterator and making the original Chars a type Chars = CommonChars<SliceIter> and the string’s type Chars = core::CommonChars<vec::IntoIter<u8>> or something?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For example this can very silently become a read of freed memory if _s is freed and drop for chars happens to do something with the data it references.

Yeah, that's why I commented that Chars doesn't implement Drop 😜

adding a new CommonChars into core that’s generic over the iterator

Certainly! I'm very much in favor of removing the unsafe here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

making the original Chars a type Chars =

I've done this, but it exposes the CommonChars in the documentation; is that acceptable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CommonChars should be private, or at least doc-hidden and perma-unstable. I think it follows that Chars should remain a struct with private fields, even if that involves adding trival forwarding impls.

(But regardless I remain skeptical that String::into_chars should exist in the first place, see other comment.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I figured we'd want to delegate all the parts, I just wanted to make sure others agreed before I put in that legwork.

@hanna-kruppe
Copy link
Contributor

It seems like this is a one-off workaround for the lack of self-referential structs and could just as well be implemented atop the owning_ref crate? If so, I'd prefer people do that rather than adding a type specifically for this use case. Adding new types for every kind of self-reference does not scale (and I see no good reason to support only String::chars and not others).

@rust-highfive
Copy link
Contributor

The job x86_64-gnu-llvm-3.9 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
[00:53:03] ..................................i.................................................................
[00:53:28] .....................................................................................test [run-pass] run-pass/mir_heavy_promoted.rs has been running for over 60 seconds
[00:53:31] ...............
[00:53:44] ....................................................................................................
[00:54:24] .ii..............................................................i..................................
[00:54:49] ..................i.ii.....................................................................test [run-pass] run-pass/saturating-float-casts.rs has been running for over 60 seconds
[00:55:18] ..........................iiiiiii...................................................................
[00:55:37] ....................................................................................................
[00:55:50] ....................................................................................................
[00:56:06] .................................................................................................
---
[01:07:45] 
[01:07:45] running 396 tests
[01:08:04] ....................................................................................................
[01:08:23] .....................................i..............................................................
[01:08:40] ............................................................................F.......................
[01:08:58] failures:
[01:08:58] 
[01:08:58] ---- string.rs - string::String::into_chars (line 1460) stdout ----
[01:08:58] ---- string.rs - string::String::into_chars (line 1460) stdout ----
[01:08:58]  error[E0658]: use of unstable library feature 'into_chars': new API
[01:08:58]  --> string.rs:1462:14
[01:08:58]   |
[01:08:58] 5 | assert_eq!(s.into_chars().count(), 13);
[01:08:58]   |
[01:08:58]   |
[01:08:58]   = help: add #![feature(into_chars)] to the crate attributes to enable
[01:08:58] thread 'string.rs - string::String::into_chars (line 1460)' panicked at 'couldn't compile the test', librustdoc/test.rs:325:17
[01:08:58] note: Run with `RUST_BACKTRACE=1` for a backtrace.
[01:08:58] 
[01:08:58] 
---
[01:08:58] 
[01:08:58] error: test failed, to rerun pass '--doc'
[01:08:58] 
[01:08:58] 
[01:08:58] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "--release" "--locked" "--color" "always" "--features" "panic-unwind jemalloc backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "alloc" "--" "--quiet"
[01:08:58] 
[01:08:58] 
[01:08:58] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:08:58] Build completed unsuccessfully in 0:24:21
[01:08:58] Build completed unsuccessfully in 0:24:21
[01:08:58] make: *** [check] Error 1
[01:08:58] Makefile:58: recipe for target 'check' failed

The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:04766e80
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@shepmaster
Copy link
Member Author

It seems like this is a one-off workaround for the lack of self-referential structs and could just as well be implemented atop the owning_ref crate?

I don't see it like that, but I can see how people might.

Instead, I see it as the missing piece to this table:

method receiver trait
String::chars &String Iterator<Item = char>
String::drain &mut String Iterator<Item = char>
??? String Iterator<Item = char>

@hanna-kruppe
Copy link
Contributor

That was the perspective I initially had when I saw the title and signature, but that confused the hell out of me because into_chars doesn't add any value from that perspective: it doesn't do anything different from chars, since that one already gives you an iterator of owned chars (unlike e.g. Vec::iter vs Vec::into_iter). into_chars just needlessly1 consumes the argument. drain, in contrast, modifies the String so its existence is justified.

1 In this perspective, i.e., if we ignore owning_ref style situations

@pietroalbini
Copy link
Member

Ping from triage @SimonSapin! This PR needs your review.

@SimonSapin
Copy link
Contributor

I understand the flat_map use case but also agree with both concerns expressed so far:

  • This seems a very ad-hoc way to solve a general problem. Many iterators borrow some input. Do we expect to add new "input-owning" variations of each of them, one by one? Could something like the owning-ref be a generic solution instead?

  • If we do accept this API, I’ve very very uncomfortable with the current transmute-to-'static implementation and I’d much rather see the internals of str::Chars be refactored somehow to allow string::Chars to exist without "lying" about lifetimes.

(Re triage, I don’t know if there’s an appropriate label for "waiting on an alternative design".)

@SimonSapin SimonSapin removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 21, 2018
@pietroalbini pietroalbini added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label May 21, 2018
@shepmaster
Copy link
Member Author

I’d much rather see the internals of str::Chars be refactored somehow

I commented that I've already implemented that and was waiting on review feedback.

@shepmaster
Copy link
Member Author

be implemented atop the owning_ref crate

Could something like the owning-ref be a generic solution instead?

I like to consider myself a reasonably capable Rust programmer, but I have never successfully used owning-ref to solve a problem, FWIW. On the other hand, Rental has solved problems for me. Because of this, I would discourage people from suggesting owning-ref as a "use this instead" solution.

@shepmaster
Copy link
Member Author

Anyway, I understand the team does not want this addition. I hope self-referential structs are ergonomic to use someday.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-enhancement Category: An issue proposing an enhancement or a PR with one. S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants