RFC: new lifetime elision rules #141
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,279 @@ | ||
- Start Date: (2014-06-24) | ||
- RFC PR #: (leave this empty) | ||
- Rust Issue #: (leave this empty) | ||
|
||
# Summary | ||
|
||
This RFC proposes to | ||
|
||
1. Expand the rules for eliding lifetimes in `fn` definitions, and | ||
2. Follow the same rules in `impl` headers. | ||
|
||
By doing so, we can avoid writing lifetime annotations ~87% of the time that | ||
they are currently required, based on a survey of the standard library. | ||
|
||
# Motivation | ||
|
||
In today's Rust, lifetime annotations make code more verbose, both for methods | ||
|
||
```rust | ||
fn get_mut<'a>(&'a mut self) -> &'a mut T | ||
``` | ||
|
||
and for `impl` blocks: | ||
|
||
```rust | ||
impl<'a> Reader for BufReader<'a> { ... } | ||
``` | ||
|
||
In the vast majority of cases, however, the lifetimes follow a very simple | ||
pattern. | ||
|
||
By codifying this pattern into simple rules for filling in elided lifetimes, we | ||
can avoid writing any lifetimes in ~87% of the cases where they are currently | ||
required. | ||
|
||
Doing so is a clear ergonomic win. | ||
|
||
# Detailed design | ||
|
||
## Today's lifetime elision rules | ||
|
||
Rust currently supports eliding lifetimes in functions, so that | ||
|
||
```rust | ||
fn print(s: &str); | ||
fn get_str() -> &str; | ||
``` | ||
|
||
become | ||
becomes. and isn't this backwards? To elide is to remove, so the ones with the rules become the ones without the rules. |
||
|
||
```rust | ||
fn print<'a>(s: &'a str); | ||
fn get_str<'a>() -> &'a str; | ||
``` | ||
|
||
The ellision rules work well for functions that consume references, but not for | ||
s/ellision/elision/ |
||
functions that produce them. The `get_str` signature above, for example, | ||
promises to produce a string slice that lives arbitrarily long, and is | ||
either incorrect or should be replaced by | ||
|
||
```rust | ||
fn get_str() -> &'static str; | ||
``` | ||
|
||
Returning `'static` is relatively rare, and it has been proposed to make leaving | ||
off the lifetime in output position an error for this reason. | ||
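
As a minimal sketch of the `'static` case described above: the name `get_str` mirrors the signature in the text, while the body is a hypothetical stand-in, since the RFC only gives the signature.

```rust
/// Returns a string literal. Literals are stored in the program binary,
/// so the returned slice really can live for the `'static` lifetime.
/// (The body is an illustrative assumption, not taken from the RFC.)
fn get_str() -> &'static str {
    "hello"
}
```

A function that instead borrows from a local value could not be given this signature, which is why an unannotated output lifetime is so often a mistake.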
|
||
Moreover, lifetimes cannot be elided in `impl` headers. | ||
|
||
## The proposed rules | ||
|
||
### Overview | ||
|
||
This RFC proposes two changes to the lifetime elision rules: | ||
|
||
1. Since eliding a lifetime in output position is usually wrong or undesirable | ||
under today's elision rules, interpret it in a different and more useful way. | ||
|
||
2. Interpret elided lifetimes for `impl` headers analogously to `fn` definitions. | ||
|
||
### Lifetime positions | ||
|
||
A _lifetime position_ is anywhere you can write a lifetime in a type: | ||
|
||
```rust | ||
&'a T | ||
&'a mut T | ||
T<'a> | ||
``` | ||
|
||
As with today's Rust, the proposed elision rules do _not_ distinguish between | ||
different lifetime positions. For example, both `&str` and `Ref<uint>` have | ||
elided a single lifetime. | ||
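
To make the idea of a lifetime position concrete, here is a small sketch written with the lifetimes spelled out. The `Ref` struct below is a hypothetical stand-in for a library type with one lifetime parameter, and `u32` replaces the older `uint` so the code builds with a current compiler.

```rust
/// Hypothetical library type with a single lifetime parameter.
struct Ref<'a, T> {
    value: &'a T,
}

// `&str` has one lifetime position; written out it is `&'a str`.
fn len_of<'a>(s: &'a str) -> usize {
    s.len()
}

// `Ref<u32>` also has exactly one lifetime position; written out it is
// `Ref<'a, u32>`. The rules treat it the same way as `&'a str`.
fn read<'a>(r: Ref<'a, u32>) -> u32 {
    *r.value
}
```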
|
||
Lifetime positions can appear as either "input" or "output": | ||
|
||
* For `fn` definitions, input refers to argument types while output refers to | ||
result types. So `fn foo(s: &str) -> (&str, &str)` has elided one lifetime in | ||
input position and two lifetimes in output position. | ||
For an In other words, is a method considered to be in the scope of its (I will follow up to this comment with a concrete set of examples elaborating my question in a moment.)

Okay, here is a gist with my attempt to survey the space here: https://gist.github.com/pnkfelix/a4054e51400152c63714 It could well be that the intent is (and has always been) to not consider an impl header in scope for lifetime elision on methods. But if so, this needs to be spelled out explicitly in the RFC itself.

Hmm I guess since this was already merged I should instead open an issue against it. |
||
|
||
* For `impl` headers, input refers to the lifetimes appearing in the type | ||
Trait definitions themselves are also a form that offers lifetime positions. That may or may not be relevant (I'll be posting a question about that soon -- see a few lines up), but should probably be addressed explicitly. |
||
receiving the `impl`, while output refers to the trait, if any. So `impl<'a> | ||
Foo<'a>` has `'a` in input position, while `impl<'a> SomeTrait<'a> for Foo<'a>` | ||
has `'a` in both input and output positions. | ||
I think the word (As for the “the”, I think that should be there in all these cases, or “an” as the case may be in some places. This affects much of the document.) |
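
The input and output positions of an `impl` header can be sketched with fully explicit lifetimes, which is the form a current compiler accepts. The names `Label` and `Named` are hypothetical and exist only for this illustration.

```rust
/// Hypothetical type with one lifetime parameter.
struct Label<'a> {
    text: &'a str,
}

/// Hypothetical trait that itself takes a lifetime argument.
trait Named<'a> {
    fn name(&self) -> &'a str;
}

// Inherent impl: `'a` appears only in input position (the type `Label<'a>`).
impl<'a> Label<'a> {
    fn new(text: &'a str) -> Label<'a> {
        Label { text }
    }
}

// Trait impl: `'a` appears in input position (`Label<'a>`) and in
// output position (`Named<'a>`).
impl<'a> Named<'a> for Label<'a> {
    fn name(&self) -> &'a str {
        self.text
    }
}
```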
||
|
||
### The rules | ||
|
||
* Each elided lifetime in input position becomes a distinct lifetime | ||
parameter. This is the current behavior for `fn` definitions. | ||
|
||
* If there is exactly one input lifetime position (elided or not), that lifetime | ||
is assigned to _all_ elided output lifetimes. | ||
|
||
* If there are multiple input lifetime positions, but one of them is `&self` or | ||
I find this rule a bit surprising (the others make perfect sense). I can intuitively see the motivation that self ought to be privileged but, I can't really justify why that is so. Looking at the examples below, the ones using this rule took me a lot longer to grok.

The rationale is that in several usage surveys, this was essentially the only pattern we saw when I believe that the reason for this is that when you're borrowing something out of I suspect that in cases where this pattern could occur, people use standalone functions instead of methods.

@wycats, what proportion of the cited 87% would be lost if this rule were not accepted? I don't personally object to it, but I can see how it's a bit more flimsy than the others, and I would be willing to live without it if the statistics bore it out.

I think that it should use the lifetime of the first input parameter, regardless of whether it is self or not, and only if it is an elided lifetime. This avoids issues with UFC and makes method and non-method functions work the same. Supporting elision of lifetimes only in the return value when they are explicit on self seems a bad idea, since it is counterintuitive. Also, it doesn't work for multiple explicit lifetimes (e.g. &'a Block<'b>). |
||
`&mut self`, the lifetime of `self` is assigned to _all_ elided output lifetimes. | ||
|
||
* Otherwise, it is an error to elide an output lifetime. | ||
|
||
Notice that the _actual_ signature of a `fn` or `impl` is based on the expansion | ||
rules above; the elided form is just a shorthand. | ||
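
The rules above can be sketched with three small functions, assuming a current Rust compiler, which applies elision rules of this shape to `fn` signatures. All names here (`Stack`, `top_mut`, `split_first_word`, `pick`) are hypothetical and chosen only for illustration.

```rust
/// Hypothetical container used to show the `self` rule.
struct Stack<T> {
    items: Vec<T>,
}

impl<T> Stack<T> {
    // Several input lifetime positions, one of which is `&mut self`:
    // the elided output borrows from `self`. Expanded form:
    // fn top_mut<'a, 'b>(&'a mut self, _label: &'b str) -> Option<&'a mut T>
    fn top_mut(&mut self, _label: &str) -> Option<&mut T> {
        self.items.last_mut()
    }
}

// Exactly one input lifetime position: it is assigned to every elided
// output lifetime. Expanded form:
// fn split_first_word<'a>(s: &'a str) -> (&'a str, &'a str)
fn split_first_word(s: &str) -> (&str, &str) {
    match s.find(' ') {
        Some(i) => (&s[..i], &s[i + 1..]),
        None => (s, ""),
    }
}

// Two input lifetime positions and no `self`: eliding the output would be
// an error, so the lifetime is written explicitly.
fn pick<'a>(a: &'a str, b: &'a str, first: bool) -> &'a str {
    if first { a } else { b }
}
```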
|
||
### Examples | ||
|
||
```rust | ||
fn print(s: &str); // elided | ||
fn print<'a>(s: &'a str); // expanded | ||
|
||
fn get_str() -> &str; // ILLEGAL | ||
|
||
fn frob(s: &str, t: &str) -> &str; // ILLEGAL | ||
|
||
fn get_mut(&mut self) -> &mut T; // elided | ||
fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded | ||
|
||
fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command; // elided | ||
fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command; // expanded | ||
|
||
fn new(buf: &mut [u8]) -> BufWriter; // elided | ||
fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a>; // expanded | ||
|
||
impl Reader for BufReader { ... } // elided | ||
impl<'a> Reader for BufReader<'a> { ... } // expanded | ||
|
||
impl Reader for (&str, &str) { ... } // elided | ||
impl<'a, 'b> Reader for (&'a str, &'b str) { ... } // expanded | ||
|
||
impl StrSlice for &str { ... } // elided | ||
impl<'a> StrSlice<'a> for &'a str { ... } // expanded | ||
``` | ||
|
||
A by-value arg is not a lifetime position, so the following is legal?
That is, it would be possible to use multiple args and still have the lifetimes elided, right? I think the answer is yes but it's not shown in these examples.

@jfager Yes, that's right, and good point about the examples. Will update.

Please also add examples of methods within |
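
The by-value case raised in this thread can be sketched as follows. The function `take_prefix` is a hypothetical example, not one from the RFC, and it assumes ASCII input so that slicing at a byte index is safe.

```rust
// `n` is passed by value and contributes no lifetime position, so `s` is
// the only input lifetime and the elided output borrows from it.
// Expanded form: fn take_prefix<'a>(s: &'a str, n: usize) -> &'a str
fn take_prefix(s: &str, n: usize) -> &str {
    // Clamp the length so the slice never runs past the end of `s`.
    &s[..n.min(s.len())]
}
```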
||
## Error messages | ||
|
||
Since the shorthand described above should eliminate most uses of explicit | ||
lifetimes, there is a potential "cliff". When a programmer first encounters a | ||
situation that requires explicit annotations, it is important that the compiler | ||
gently guide them toward the concept of lifetimes. | ||
|
||
An error can arise with the above shorthand only when the program elides an | ||
output lifetime and neither of the rules can determine how to annotate it. | ||
|
||
### For `fn` | ||
|
||
The error message should guide the programmer toward the concept of lifetime by | ||
talking about borrowed values: | ||
|
||
> This function's return type contains a borrowed value, but the signature does | ||
> not say which parameter it is borrowed from. It could be one of a, b, or | ||
> c. Mark the input parameter it borrows from using lifetimes, | ||
> e.g. [generated example]. See [url] for an introduction to lifetimes. | ||
|
||
This message is slightly inaccurate, since the presence of a lifetime parameter | ||
does not necessarily imply the presence of a borrowed value, but there are no | ||
known use-cases of phantom lifetime parameters. | ||
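
As a sketch of the kind of fix such a message would point toward, the `frob` signature from the examples can be annotated to say which parameter the result borrows from. The body is a hypothetical illustration; the RFC gives only the signature.

```rust
// fn frob(s: &str, t: &str) -> &str;  // rejected: which input is borrowed?
//
// Marking `s` and the output with the same lifetime answers the question.
// `t` keeps its own, independent (elided) lifetime.
fn frob<'a>(s: &'a str, t: &str) -> &'a str {
    // Illustrative body: drop the prefix `t` from `s` when it is present.
    s.strip_prefix(t).unwrap_or(s)
}
```

Because only `s` is tied to the output, the result may outlive the value passed as `t`.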
|
||
### For `impl` | ||
|
||
The error case on `impl` is exceedingly rare: it requires (1) that the `impl` is | ||
for a trait with a lifetime argument, which is uncommon, and (2) that the `Self` | ||
type has multiple lifetime arguments. | ||
Does this example arise today in any known Rust codebase?

@bstrie I don't know of any cases offhand, which is why the error message here is probably not so important. |
||
|
||
Since there are no clear "borrowed values" for an `impl`, this error message | ||
speaks directly in terms of lifetimes. This choice seems warranted given that a | ||
programmer implementing a trait with lifetime parameters will almost certainly | ||
already understand lifetimes. | ||
|
||
> TraitName requires lifetime arguments, and the impl does not say which | ||
> lifetime parameters of TypeName to use. Mark the parameters explicitly, | ||
> e.g. [generated example]. See [url] for an introduction to lifetimes. | ||
|
||
## The impact | ||
|
||
To assess the value of the proposed rules, we conducted a survey of the code | ||
defined _in_ `libstd` (as opposed to the code it reexports). This corpus is | ||
large and central enough to be representative, but small enough to easily | ||
analyze. | ||
|
||
We found that of the 169 lifetimes that currently require annotation for | ||
`libstd`, 147 would be elidable under the new rules, or 87%. | ||
|
||
_Note: this percentage does not include the large number of lifetimes that are | ||
already elided with today's rules._ | ||
|
||
The detailed data is available at: | ||
https://gist.github.com/aturon/da49a6d00099fdb0e861 | ||
Of the 13% of functions which still require explicit lifetimes, do any seem particularly notable for their nonconformity to the usual patterns? It would also be really great if you could select one of these real-world functions and use it in the example error message above.

Almost all of the remaining cases are situations like:

    impl<'a> AsciiCast<&'a[Ascii]> for &'a [u8] {
        unsafe fn to_ascii_nocheck(&self) -> &'a[Ascii] { ... }
        ...
    }

where the

Note that this kind of example does not require an annotation according to the rules (so you wouldn't get an annotation error if you elided the lifetime). Rather, the annotation is needed to go beyond the patterns provided by the rule.

@bstrie The other predominant case is: `fn difference<'a>(&'a self, other: &'a HashSet<T, H>) -> SetAlgebraItems<'a, T, H>;` where the two input lifetimes are required to match. @glaebhoerl Take note -- this is a case where even the rules for input positions don't give you what you want. |
||
|
||
# Drawbacks | ||
|
||
Another drawback: I find full specification of lifetime parameters makes it easier to understand what is going on. Even today, I often write the lifetimes where they could be elided because I think it makes code easier to reason about if you can name things. If I have a lifetime error, the first thing I do is add explicit lifetimes wherever they are missing. I get the impression I'm in the minority with this though. To me, these extra rules trade off easier reading (and writing) when you don't need to think about lifetimes too much against greater cognitive overhead when you do have to think about them. I guess that since reading code is more common than debugging lifetime errors, this trade off is worthwhile. I certainly like the idea of reducing lifetime noise.

👍 to full specs making things clearer.

@nick29581 It might be worth considering having the compiler optionally show you all of the inferred lifetimes when there are error messages that involve lifetimes: That said, I think the error message improvements in this proposal go a long way to making it obvious what has happened when you inappropriately elided a lifetime. Similar error message work around other lifetime errors would go a long way to improving the general ergonomics of explicit lifetimes as well, and we should work on that!

@nick29581, that same argument can be made for type inference. Just like type inference, nothing is stopping you from being fully explicit with lifetimes if you deem it's better for readability. |
||
## Learning lifetimes | ||
|
||
The main drawback of this change is pedagogical. If lifetime annotations are | ||
rarely used, newcomers may encounter error messages about lifetimes long before | ||
encountering lifetimes in signatures, which may be confusing. Counterpoints: | ||
|
||
* This is already the case, to some extent, with the current elision rules. | ||
|
||
* Most existing error messages are geared to talk about specific borrows not | ||
living long enough, pinpointing their _locations_ in the source, rather than | ||
talking in terms of lifetime annotations. When the errors do mention | ||
annotations, it is usually to suggest specific ones. | ||
|
||
* The proposed error messages above will help programmers transition out of the | ||
fully elided regime when they first encounter a signature requiring it. | ||
|
||
* When combined with a good tutorial on the borrow/lifetime system (which should | ||
be introduced early in the documentation), the above should provide a | ||
reasonably gentle path toward using and understanding explicit lifetimes. | ||
Yup, I care about this. |
||
|
||
Programmers learn lifetimes once, but will use them many times. Better to favor | ||
long-term ergonomics, if a simple elision rule can cover 87% of current lifetime | ||
uses (let alone the currently elided cases). | ||
|
||
## Subtlety for non-`&` types | ||
|
||
While the rules are quite simple and regular, they can be subtle when applied to | ||
types with lifetime positions. To determine whether the signature | ||
|
||
```rust | ||
fn foo(r: Bar) -> Bar | ||
``` | ||
|
||
is actually using lifetimes via the elision rules, you have to know whether | ||
`Bar` has a lifetime parameter. But this subtlety already exists with the | ||
current elision rules. The benefit is that library types like `Ref<'a, T>` get | ||
the same status and ergonomics as built-ins like `&'a T`. | ||
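
A minimal sketch of this subtlety, assuming a hypothetical definition of `Bar` that carries one lifetime parameter (the RFC does not define it). Current Rust also accepts `Bar<'_>` as a way to make the hidden lifetime visible without naming it.

```rust
/// Hypothetical definition: `Bar` borrows a byte slice.
struct Bar<'a> {
    data: &'a [u8],
}

// The signature looks lifetime-free, but `Bar` hides a lifetime parameter.
// Expanded form: fn foo<'a>(r: Bar<'a>) -> Bar<'a>
fn foo(r: Bar) -> Bar {
    // Illustrative body: keep the first half of the borrowed data.
    Bar { data: &r.data[..r.data.len() / 2] }
}
```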
|
||
# Alternatives | ||
|
||
* Do not include _output_ lifetime elision for `impl`. Since traits with lifetime | ||
parameters are quite rare, this would not be a great loss, and would simplify | ||
the rules somewhat. | ||
|
||
* Only add elision rules for `fn`, in keeping with current practice. | ||
|
||
* Only add elision for explicit `&` pointers, eliminating one of the drawbacks | ||
mentioned above. Doing so would impose an ergonomic penalty on abstractions, | ||
though: `Ref` would be more painful to use than `&`. | ||
|
||
# Unresolved questions | ||
|
||
The `fn` and `impl` cases tackled above offer the biggest bang for the buck for | ||
lifetime elision. But we may eventually want to consider other opportunities. | ||
|
||
## Double lifetimes | ||
|
||
Another pattern that sometimes arises is types like `&'a Foo<'a>`. We could | ||
consider an additional elision rule that expands `&Foo` to `&'a Foo<'a>`. | ||
|
||
However, such a rule could be easily added later, and it is unclear how common | ||
the pattern is, so it seems best to leave that for a later RFC. | ||
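
For reference, the double-lifetime pattern looks like this when written out by hand. `Foo` and `name_of` are hypothetical names, and nothing here implements the possible future rule; it only shows the explicit form that such a rule would abbreviate.

```rust
/// Hypothetical type with one lifetime parameter.
struct Foo<'a> {
    name: &'a str,
}

// The reference and the type argument share one lifetime, `'a`.
// Under the rules proposed in this RFC, a plain `&Foo` in input position
// would instead get two distinct lifetimes.
fn name_of<'a>(foo: &'a Foo<'a>) -> &'a str {
    foo.name
}
```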
|
||
## Lifetime elision in `struct`s | ||
|
||
We may want to allow lifetime elision in `struct`s, but the cost/benefit | ||
analysis is much less clear. In particular, it could require chasing an | ||
arbitrary number of (potentially private) `struct` fields to discover the source | ||
of a lifetime parameter for a `struct`. There are also some good reasons to | ||
treat elided lifetimes in `struct`s as `'static`. | ||
|
||
Again, since shorthand can be added backwards-compatibly, it seems best to wait. | ||
Agreed, I'm fine with leaving structs as they are. |
This is the biggest part of this proposal for me. (well, combined with the data that shows that it is)