Guide: strings #15593
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,129 @@ | ||
% The Strings Guide | ||
|
||
# Strings | ||
|
||
Strings are an important concept to master in any programming language. If you | ||
come from a managed language background, you may be surprised at the complexity | ||
of string handling in a systems programming language. Efficient access and | ||
allocation of memory for a dynamically sized structure involves a lot of | ||
details. Luckily, Rust has lots of tools to help us here. | ||
|
||
A **string** is a sequence of Unicode scalar values encoded as a stream of | |
UTF-8 bytes. All strings are guaranteed to be validly-encoded UTF-8 sequences. | ||
Additionally, strings are not null-terminated and can contain null bytes. | ||
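
As a rough illustration of the definition above, the following sketch compares a string's byte length with its number of Unicode scalar values, and shows a NUL byte being treated as ordinary content. The helper name `byte_and_char_counts` is made up for this example, and the code is written against current stable Rust rather than the compiler this PR targeted.

```rust
// Hypothetical helper: returns (length in bytes, number of Unicode scalar values).
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // U+00E9 is a single scalar value but takes two bytes in UTF-8.
    let accented = "h\u{e9}llo";
    println!("{:?}", byte_and_char_counts(accented)); // (6, 5)

    // A NUL byte does not end the string; it is counted like any other byte.
    let with_nul = "a\0b";
    println!("{:?}", byte_and_char_counts(with_nul)); // (3, 3)
}
```

Because `len()` counts bytes, it only matches the character count for pure-ASCII text.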
|
||
Rust has two main types of strings: `&str` and `String`. | ||
I'd rather write "two string-related types" instead of "two main types of strings" because the way it is now suggests to me that one can use &str like any other "value type" (me with my C++ hat on). But maybe that's just me. |
||
|
||
## &str | ||
|
||
The first kind is a `&str`. This is pronounced a 'string slice.' String literals | ||
are of the type `&str`: | ||
|
||
```{rust} | ||
let string = "Hello there."; | ||
``` | ||
|
||
Like any Rust type, string slices have an associated lifetime. A string literal | ||
I think you're muddying two kinds of lifetimes together here. Surely, there are two lifetimes involved with a string slice (or any other type with ampersand at the front). The lifetime of the variable holding the pointer and length, and the lifetime of the memory pointed at by the slice. With "Like any Rust type" you seem to refer to the former lifetime while the lifetime you actually want to talk about is the "2nd kind" that only applies to references and alike. I'm not sure how much value there is in special casing &str as a funny string type in the introduction when it really is a reference to a slice of a string like the ampersand suggests. There is some consistency here. str is almost the same as [u8] except for the fact that it guarantees a valid UTF-8 encoding. |
||
is a `&'static str`. A string slice can be written without an explicit | ||
lifetime in many cases, such as in function arguments. In these cases the | ||
lifetime will be inferred: | ||
|
||
```{rust} | ||
fn takes_slice(slice: &str) { | ||
println!("Got: {}", slice); | ||
} | ||
``` | ||
|
||
Like vector slices, string slices are simply a pointer plus a length. This | ||
means that they're a 'view' into an already-allocated string, such as a | ||
`&'static str` or a `String`. | ||
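
A minimal sketch of the "pointer plus a length" idea, assuming current stable Rust: the hypothetical helper `prefix` borrows part of an existing `String` without copying, and the resulting slice points into the same buffer. The size check reflects how `&str` is laid out by today's compiler on common targets and is offered as an observation, not a documented guarantee.

```rust
use std::mem::size_of;

// Hypothetical helper: borrow the first `n` bytes of `s` as a slice.
// No bytes are copied. Panics if `n` is out of range or splits a character.
fn prefix(s: &str, n: usize) -> &str {
    &s[..n]
}

fn main() {
    let owned = String::from("Hello there.");
    let view = prefix(&owned, 5);
    println!("{}", view);

    // The slice starts at the same address as the String's buffer.
    assert_eq!(view.as_ptr(), owned.as_ptr());

    // A `&str` occupies two machine words: a pointer and a length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());
}
```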
|
||
## String | ||
|
||
A `String` is a heap-allocated string. This string is growable, and is also | ||
&str could just as well refer to a heap allocated string. The difference is in the ownership.
Saying String is heap allocated does not imply that &str is not. |
||
guaranteed to be UTF-8. | ||
|
||
```{rust} | ||
let mut s = "Hello".to_string(); | ||
println!("{}", s); | ||
|
||
s.push_str(", world."); | ||
println!("{}", s); | ||
``` | ||
|
||
You can coerce a `String` into a `&str` with the `as_slice()` method: | ||
I'm not sure but I think you just misused the word "coerce". I thought that it refers to a form of implicit conversion in the context of types and programming languages. But correct me if I'm wrong.
Note this was (somewhat) discussed previously: #15593 (comment) |
||
|
||
```{rust} | ||
fn takes_slice(slice: &str) { | ||
println!("Got: {}", slice); | ||
} | ||
|
||
fn main() { | ||
let s = "Hello".to_string(); | ||
takes_slice(s.as_slice()); | ||
} | ||
``` | ||
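
For readers trying this on a current toolchain: `String::as_slice()` reflects the pre-1.0 API this guide was written against and is not available on stable Rust today. The sketch below shows the spellings that work now. It also touches on the wording question raised in review, since only the second call is an implicit coercion. `takes_slice` is changed to return the length purely so the result can be checked.

```rust
// Same shape as the guide's function, returning the byte length for checking.
fn takes_slice(slice: &str) -> usize {
    println!("Got: {}", slice);
    slice.len()
}

fn main() {
    let s = "Hello".to_string();

    takes_slice(s.as_str()); // explicit conversion via a method call
    takes_slice(&s);         // implicit: `&String` is deref-coerced to `&str`
    takes_slice(&s[..]);     // explicit full-range slice
}
```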
|
||
You can also get a `&str` from a stack-allocated array of bytes: | ||
|
||
```{rust} | ||
use std::str; | ||
|
||
let x: &[u8] = &[b'a', b'b']; | ||
let stack_str: &str = str::from_utf8(x).unwrap(); | ||
``` | ||
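
The `unwrap()` above panics if the bytes are not valid UTF-8. As a hedged alternative, this sketch handles the `Err` case instead. The helper `decode_or` and its fallback parameter are invented for illustration.

```rust
use std::str;

// Hypothetical helper: return the text if `bytes` is valid UTF-8,
// otherwise return the caller-supplied fallback.
fn decode_or<'a>(bytes: &'a [u8], fallback: &'a str) -> &'a str {
    match str::from_utf8(bytes) {
        Ok(text) => text,
        Err(_) => fallback,
    }
}

fn main() {
    println!("{}", decode_or(&[b'a', b'b'], "<invalid>"));

    // The byte 0xFF never appears in well-formed UTF-8, so this falls back.
    println!("{}", decode_or(&[0xFF, b'b'], "<invalid>"));
}
```

No bytes are copied in the success case: the returned `&str` borrows the original byte slice.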
|
||
## Best Practices | ||
|
||
### `String` vs. `&str` | ||
|
||
In general, you should prefer `String` when you need ownership, and `&str` when | ||
This information is a little late for my taste.
Well if you're looking for best practices, I expect you to click on the best practices header. And it would be weird to put this before the explanation of what they are.
Why do you think String is for ownership and &str is not? It's because these types are defined like that. But you did not tell the reader about it so far. The first time "ownership" comes up is at this line. And that's what I was referring to. It's too late. Ownership is part of what defines the distinction between String and &str. The best practice pretty much falls out of that given that cloning Strings is much more expensive than lending them via slices (if cloning them would be as cheap as copying an int we would probably not bother with &str in lots of cases). Perhaps a good way of introducing the topic is to say something along the lines of "If you need a variable type for a variable to hold a string value just like an int variable holds an integer value, you probably want the String type. One big reason though to use the &str type in addition is that copying string values is expensive. In many cases one does not need to copy a string value. In those cases it suffices to just refer to string values, to borrow them. And that's what &str is for" or something like that. |
||
you just need to borrow a string. This is very similar to using `Vec<T>` vs. `&[T]`, | ||
and `T` vs `&T` in general. | ||
|
||
This means starting off with this: | ||
|
||
```{rust,ignore} | ||
fn foo(s: &str) { | ||
``` | ||
|
||
and only moving to this: | ||
|
||
```{rust,ignore} | ||
fn foo(s: String) { | ||
``` | ||
|
||
if you have good reason. It's not polite to hold on to ownership you don't | |
need, and it can make your lifetimes more complex. Furthermore, you can pass | ||
either kind of string into `foo` by using `.as_slice()` on any `String` you | ||
need to pass in, so the `&str` version is more flexible. | ||
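
To make the flexibility argument concrete, here is a small sketch in current stable Rust, where `&owned` takes the place of `.as_slice()`. The function names `shout` and `into_shout` are placeholders, not part of the guide.

```rust
// Borrows its argument: the caller keeps ownership.
fn shout(s: &str) -> String {
    s.to_uppercase()
}

// Takes ownership: the caller must give the String up (or clone it first).
fn into_shout(s: String) -> String {
    s.to_uppercase()
}

fn main() {
    let owned = String::from("hello");

    println!("{}", shout("literal"));  // a `&'static str` works
    println!("{}", shout(&owned));     // so does a borrowed `String`
    println!("{}", owned);             // `owned` is still usable afterwards

    println!("{}", into_shout(owned)); // `owned` is moved into the call
    // println!("{}", owned);          // would not compile: value was moved
}
```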
|
||
### Comparisons | ||
|
||
To compare a `String` to a constant string, prefer `as_slice()`... | |
|
||
```{rust} | ||
fn compare(string: String) { | ||
if string.as_slice() == "Hello" { | ||
println!("yes"); | ||
} | ||
} | ||
``` | ||
|
||
... over `to_string()`: | ||
|
||
```{rust} | ||
fn compare(string: String) { | ||
if string == "Hello".to_string() { | ||
println!("yes"); | ||
} | ||
} | ||
``` | ||
|
||
Converting a `String` to a `&str` is cheap, but converting the `&str` to a | ||
`String` involves an allocation. | ||
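
On a current toolchain the same allocation-free comparison can be written without any conversion, because the standard library implements `PartialEq<&str>` for `String`. This is a sketch for today's stable Rust, not a statement about the compiler this PR targeted. The function name `is_hello` is made up, and it returns a `bool` so the result can be checked.

```rust
// Hypothetical variant of `compare` that reports the result instead of printing.
fn is_hello(string: String) -> bool {
    // Compares the bytes in place; no temporary `String` is allocated.
    string == "Hello"
}

fn main() {
    if is_hello("Hello".to_string()) {
        println!("yes");
    }
}
```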
|
||
## Other Documentation | ||
|
||
* [the `&str` API documentation](/std/str/index.html) | ||
* [the `String` API documentation](/std/string/index.html) |
I'd leave the part "Strings are an important concept to master in any programming language." out, it's not adding much information.