Guide: strings #15593

Closed

steveklabnik
Member

I decided to change it up a little today and hack out the beginning of the String guide. Strings are different enough in Rust that I think they deserve a specific guide, especially for those who are used to managed languages.

I decided to start with Strings because they get asked about a lot in IRC, and also based on discussions like this one on reddit: http://www.reddit.com/r/rust/comments/2ac390/generic_string_literals/

I blatantly stole bits from our other documentation on Strings. It's a little sparse at the moment, but I wanted to start somewhere.

I am not exactly sure what should go in "Best Practices," and would like the feedback from the team on this. Specifically due to comments like this one: http://www.reddit.com/r/rust/comments/2ac390/generic_string_literals/citmxb5

details. Luckily, Rust has lots of tools to help us here.

A **string** is a sequence of unicode codepoints encoded as a stream of UTF-8
bytes. All safely-created strings are guaranteed to be validly encoded UTF-8
Contributor

Technically incorrect. It's a sequence of unicode scalar values. Notably, U+D800 is a codepoint, but it's not a scalar value.

For reference, we already talk about unicode scalar values in the documentation for std::char.
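
To make the distinction concrete, here is a minimal sketch in present-day Rust (not the 2014 API under review). `is_scalar_value` is a hypothetical helper name introduced only for illustration; the sketch relies on `char::from_u32` returning `None` for surrogates and on `std::str::from_utf8` rejecting their byte encoding.

```rust
/// Hypothetical helper: true if `cp` is a Unicode scalar value, i.e. a
/// code point that is not a surrogate (U+D800..=U+DFFF) and not above U+10FFFF.
fn is_scalar_value(cp: u32) -> bool {
    // `char` can only hold scalar values, so the conversion fails otherwise.
    char::from_u32(cp).is_some()
}

fn main() {
    // U+00E9 is both a code point and a scalar value.
    assert!(is_scalar_value(0x00E9));
    // U+D800 is a code point (a surrogate) but not a scalar value.
    assert!(!is_scalar_value(0xD800));
    // ED A0 80 is what a naive UTF-8 encoding of U+D800 would look like;
    // it is rejected, so it can never appear inside a safely created `&str`.
    assert!(std::str::from_utf8(&[0xED, 0xA0, 0x80]).is_err());
    println!("ok");
}
```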

Member Author

This needs to be changed in the std::str documentation, then.

Contributor

Probably true. I haven't actually read that in a while. But let's focus on making a good strings guide, then we can go back and fix up the API documentation as appropriate.

Member Author

Of course, I only make mention of it to document where I gained the wrong impression from, and to make sure that I change that as well.

@MatejLach
Contributor

@kud1ing

Seriously? It's an introductory sentence that's meant to get the reader into the text. It doesn't have to "add" anything really. It also takes nothing away.
I really hate how every single sentence in technical docs supposedly has to be "technical".
We humans are not robots. When one reads a Guide, it should read continuously and smoothly, whereas when I am reading a reference, I am expecting dry, technical information.
A Guide is meant to introduce you to a topic, not be super dry.
Really, check how many people actually read any exclusively-technical docs in any amount of detail.

There are many books on programming in various languages that cover topics one can get from the official docs, but people still buy the expensive books, because they're less dry. It's a good thing that @steveklabnik wants to make the Guides like a book; there's also the Manual for reference.

Or would we prefer something like the following as our only point of reference:
http://golang.org/ref/spec
Docs like the above link are great for getting a quick glimpse at how to use something, but not to really read about it.

@nielsle
Contributor

nielsle commented Jul 11, 2014

Perhaps the chapter could focus more on helping the reader choose the right string type in a given situation. That is the difficult part. (Perhaps add some text at the beginning of the chapter about slices.)

It could be nice to have text about concatenating strings, splitting strings and formatting strings.
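
As a rough illustration of the three operations mentioned here, the sketch below uses present-day Rust rather than the 2014 API. `join_words` is a hypothetical helper introduced only for this example, and the sample strings are made up.

```rust
/// Hypothetical helper: concatenate two slices into a new owned `String`.
fn join_words(a: &str, b: &str) -> String {
    let mut out = String::from(a);
    out.push(' ');
    out.push_str(b);
    out
}

fn main() {
    // Concatenating: build an owned String from borrowed pieces.
    let joined = join_words("hello", "world");
    assert_eq!(joined, "hello world");

    // Splitting: the parts are `&str` views into `joined`, no copies are made.
    let parts: Vec<&str> = joined.split(' ').collect();
    assert_eq!(parts, vec!["hello", "world"]);

    // Formatting: `format!` always produces a new `String`.
    let formatted = format!("{}-{}", parts[0], parts[1]);
    assert_eq!(formatted, "hello-world");
}
```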

[Try in-browser](http://is.gd/orj55o)


Like vector slices, string slices are simply a pointer plus a length. This
Contributor

Perhaps you should mention here that a slice is not growable.
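
A small sketch of what "not growable" means in practice, written against present-day Rust. `grow` is a hypothetical helper name; the point is only that appending requires an owned `String`, while the slice stays a fixed view.

```rust
/// Hypothetical helper: a `&str` cannot grow in place, so copy it into an
/// owned `String` and append there.
fn grow(view: &str, suffix: &str) -> String {
    let mut owned = String::from(view); // allocates a new buffer
    owned.push_str(suffix);
    owned
}

fn main() {
    let base = String::from("hello world");
    // A slice is a pointer plus a length into `base`; it has no `push_str`.
    let view: &str = &base[0..5];
    let longer = grow(view, "!");

    assert_eq!(view, "hello");       // the view is unchanged
    assert_eq!(longer, "hello!");    // the growth happened in a new String
    assert_eq!(base, "hello world"); // the original buffer is untouched
}
```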

@steveklabnik
Member Author

@nielsle that's my intention for the 'best practices' section.

@steveklabnik
Member Author

I've addressed the easy comments, gonna give some thought to the harder ones.

@kud1ing

kud1ing commented Jul 13, 2014

@MatejLach: I am a believer in "perfection is finally attained not when there is no longer anything to add, but when there is no longer anything to take away". To me this especially applies to documentation, when I am actually busy doing something other than reading documentation. I am not against a collegial writing style, but I always wonder "if this was missing, what would be the reasons to add it? Who would say: I want this because ... ?"

@steveklabnik
Member Author

I've squashed all those commits together, and added a best practice about comparing strings. What do we think?


Like any Rust type, string slices have an associated lifetime. A string literal
is a `&'static str`. A string slice can be taken as an argument to a function,
in which case, it has the usual associated lifetime:
Contributor

I'm still not happy with the phrasing here, "it has the usual associated lifetime". The goal of this sentence is to convey the idea that the lifetime can be omitted in a lot of cases, right? Omitting the lifetime is of course not limited to just function arguments, you can also say let x: &str = "foo".

Is there some other guide that uses a similar phrasing here that you're trying to reference? If not, I'd suggest perhaps something like

The &str type can be written without an explicit lifetime in many cases, such as in function arguments. In these cases the lifetime will be inferred:

although I feel like it would be good to explicitly correlate this with the inferred lifetimes of arbitrary &T refs.
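
To illustrate the point about omitted lifetimes, here is a sketch in present-day Rust (the elision rules at the time of this review may have differed). The function names are hypothetical; the first two functions have the same signature, once with the lifetime elided and once written out, and the third shows the same elision on an arbitrary `&T`.

```rust
// Lifetime elided: the compiler ties the returned slice to the argument.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out explicitly.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// The same elision applies to any reference type, not only `&str`.
fn first_item(v: &[i32]) -> Option<&i32> {
    v.first()
}

fn main() {
    // A string literal is a `&'static str`; the annotation is optional.
    let literal: &'static str = "foo bar";
    let x: &str = "foo";

    assert_eq!(first_word(literal), x);
    assert_eq!(first_word_explicit(literal), x);
    assert_eq!(first_item(&[1, 2, 3]), Some(&1));
}
```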

Member Author

I like that wording. No connection to anything else, just my own words. I'll change it to that after I'm done rebuilding...

@lilyball
Contributor

Looks good overall. I would not be unhappy if it were committed as-is (except for the one mistaken reference to to_slice()). But I've left a number of nitpicks in here. 👍 if they get fixed.

either kind of string into `foo` by using `.to_slice()` on any `String` you
need to pass in, so the `&str` version is more flexible.

### Comparions
Contributor

s/Comparions/Comparisons/?

@steveklabnik
Member Author

Thanks for the close review, @kballard . Fixed all of that, including the nits :)


Like vector slices, string slices are simply a pointer plus a length. This
means that they're a 'view' into an already-allocated string: either a
`&'static str` or a `String`.
Contributor

I mentioned before that a &str could actually have been created from something that isn't a &str or a String, e.g. str::from_utf8() creates it from &[u8]. At the time I said I wasn't sure if it was worth mentioning.

Well, this still bugs me. But I think we can still avoid complicating it merely by changing the word "either" to "such as" (and replacing the colon with a comma). This reads

This means that they're a 'view' into an already-allocated string, such as a &'static str or a String.
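
For the `str::from_utf8()` case mentioned above, a minimal sketch in present-day Rust may help. `view_bytes` is a hypothetical helper name; the example shows a `&str` that views a plain byte array, with no `String` or string literal behind it.

```rust
/// Hypothetical helper: view a byte slice as a `&str` if it is valid UTF-8.
fn view_bytes(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    // The backing storage is a stack array of bytes.
    let raw: [u8; 5] = [b'h', b'e', b'l', b'l', b'o'];
    let s = view_bytes(&raw).unwrap();
    assert_eq!(s, "hello");

    // The slice points at the array itself; nothing was copied.
    assert_eq!(s.as_ptr(), raw.as_ptr());

    // Invalid UTF-8 is rejected rather than turned into a `&str`.
    assert!(view_bytes(&[0xFF]).is_none());
}
```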

@lilyball
Contributor

Ok, I think those last two nitpicks are it. r=me

@steveklabnik
Member Author

@kballard both fixed.

bors added a commit that referenced this pull request Jul 18, 2014
@bors bors closed this Jul 18, 2014
UTF-8 bytes. All strings are guaranteed to be validly-encoded UTF-8 sequences.
Additionally, strings are not null-terminated and can contain null bytes.

Rust has two main types of strings: `&str` and `String`.
Contributor

I'd rather write "two string-related types" instead of "two main types of strings" because the way it is now suggests to me that one can use &str like any other "value type" (me with my C++ hat on). But maybe that's just me.

@l0kod
Contributor

l0kod commented Aug 18, 2014

The guide should maybe add a note to highlight the Str trait, which can be used as a generic parameter if the function doesn't care about owning (or not) the string. This way, it's possible to use &str or String, which might be convenient:

fn foo<T: Str>(msg: T) {
    std::io::stdio::print(msg.as_slice());
}
foo("hello");
foo(" world".to_string());
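
The snippet above uses 2014-era names (`Str`, `std::io::stdio::print`, `as_slice`) that, as far as I know, are not available in stable Rust today. A comparable sketch of the same idea in present-day Rust, assuming `AsRef<str>` as the stand-in bound, could look like this; `describe` is a hypothetical function name.

```rust
/// Hypothetical function: accepts anything that can be viewed as a `&str`,
/// whether it is borrowed (`&str`) or owned (`String`).
fn describe<T: AsRef<str>>(msg: T) -> String {
    let s: &str = msg.as_ref(); // borrow a slice without caring about ownership
    format!("{} ({} bytes)", s, s.len())
}

fn main() {
    println!("{}", describe("hello"));             // called with a &str
    println!("{}", describe(" world".to_string())); // called with a String
}
```

For a function that only reads the string, taking a plain `&str` parameter is usually simpler; the generic bound is mainly useful when callers should be able to pass owned values directly.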

@steveklabnik
Member Author

@l0kod could you please add this to the 'improving the strings guide' ticket? Thanks.

@l0kod
Contributor

l0kod commented Aug 18, 2014

cc #15994
