Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added slice_chars_from and slice_chars_to methods to StrSlice #10544

Closed
wants to merge 1 commit into from

Conversation

Kimundi
Copy link
Member

@Kimundi Kimundi commented Nov 18, 2013

They are occasional useful, and not having them is inconsistent to the byte-indice slice methods.

slice_chars, slice_chars_from and slice_chars_to are now implemented in terms of a common slice_chars_common function.

I also made the section headers in str.rs more visible, as they are they where to small to be really useful.

@huonw
Copy link
Member

huonw commented Nov 18, 2013

Hm, do we want this extra non-DRY code? I was thinking about this a little while ago, and though maybe making either or both arguments to .slice_chars Option<uint>'s might be better since .slice_chars is not super useful itself (directly manipulating non-grapheme characters; is O(n); etc.) and so these specialised versions are even less so.

Furthermore, having it explicitly in the types hopefully makes someone less like to do s.slice_chars(10, Some(s.char_len())) rather than just s.slice_chars(10, None).

@huonw
Copy link
Member

huonw commented Nov 18, 2013

(At the very least, I think an internal helper function of that form would be an antidote to repeating similar code 3 times.)

@Kimundi
Copy link
Member Author

Kimundi commented Nov 18, 2013

Yeah, at the very least it could all live in a internal helper, that's true.

In a way, this is only a temporary API improvement anyway, as the string API still needs some major overhauls, and decisions (For example #9391 and #9440)

slice_from and slice_to (including these new char variants) are incomplete anyway in that they take an index starting from the front, while taking one from the back can be useful too, and at least for chars and grapheme clusters is faster than doing char_len() - n.

I think the most general solution would be to factor out the the ability to view a string as a sequence of bytes, a sequence of chars, and a sequence of grapheme clusters, and the provide an identical API for those three:

"äbc".as_char_str().slice_from(1) == "bc"
"äbc".as_char_str().slice_to(1) == "ä"
"äbc".as_char_str().reverse_slice_from(1) == "äb"
"äbc".as_char_str().reverse_slice_to(1) == "c"

"äbc".as_utf8_str().slice_from(2) == "bc"
"äbc".as_utf8_str().slice_to(2) == "ä"
"äbc".as_utf8_str().reverse_slice_from(1) == "äb"
"äbc".as_utf8_str().reverse_slice_to(1) == "c"

@Kimundi
Copy link
Member Author

Kimundi commented Nov 19, 2013

I pushed a commit that moves the common code into one function

// only finding the char boundaries
for (idx, _) in s.char_offset_iter() {
match (begin, begin_byte) {
(Some(begin), None) if count == begin => { begin_byte = Some(idx) }, _ => ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be clearer if the _ arm wasn't hiding at the end (also the , is unnecessary).

@Kimundi
Copy link
Member Author

Kimundi commented Nov 19, 2013

Addressed the stylistic issue

They use a `slice_char_common` function internally

Also made source code section markers more visible
@alexcrichton
Copy link
Member

Closing due to a lack of activity.

I'm worried that if we start adding methods for chars as well as bytes then are we also going to have to start adding methods for graphemes as well? I don't think that this PR should fundamentally not be merged, but we need to give the str API careful thought before adding functions like this.

flip1995 pushed a commit to flip1995/rust that referenced this pull request Apr 6, 2023
…=Jarcho

Ignore `file!()` macro in `print_literal`, `write_literal`

changelog: [`print_literal`], [`write_literal`]: Ignore the `file!()` macro

`file!()` expands to a string literal with its span set to that of the `file!()` callsite, but isn't marked as coming from an expansion. To fix this we make sure we actually find a string/char literal instead of assuming it's one and slicing

It would also ignore any other macros that result in the same situation, but that shouldn't be common as `proc_macro::Span::call_site()` returns a span that is marked as from expansion

Fixes rust-lang#10544
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants