From 68572cc0fd7c60cf04426e19df91f4fee16b29c9 Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Sat, 1 Apr 2023 16:58:25 +0200 Subject: [PATCH 01/92] [WIP] start writing an initial draft --- text/0000-guaranteed-tco.md | 159 ++++++++++++++++++++++++++++++++++++ 1 file changed, 159 insertions(+) create mode 100644 text/0000-guaranteed-tco.md diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md new file mode 100644 index 00000000000..c3d61e0eb9e --- /dev/null +++ b/text/0000-guaranteed-tco.md @@ -0,0 +1,159 @@ +- Feature Name: guaranteed_tco +- Start Date: 2023-04-01 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +This feature allows guaranteeing that function calls are tail-call optimized (TCO) via the `become` keyword. If this guarantee can not be provided by the compiler an error is generated instead. The check for the guarantee is done by verifying that the candidate function call follows several restrictions such as tail position and a function signature that exactly matches the calling function (it might be possible to loosen the function signature restriction in the future). + +# Motivation +[motivation]: #motivation + +While opportunistic TCO is already supported there currently is no way to natively guarantee TCO. This optimization is interesting for two general goals. One goal is to do function calls without adding a new stack frame to the stack, this mainly has semantic implications as for example recursive algorithms can overflow the stack without this optimization. The other goal is to, in simple words, replace `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. 
+ +Note that workarounds for the first goal exist by using so called trampolining which limits the stack depth. However, while this functionality is provided by several crates, a inclusion in the language can provide greater adoption of a more functional programming style. + +For the second goal no guaranteed method exists, so if TCO is performed depends on the specific structure of the code and the compiler version. This can result in TCO no longer being performed if non-semantic changes to the code are done or the compiler version changes. + +Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, allowing code to be written in a continuation-passing style, recursive algorithms to be guaranteed TCO, and faster interpreters. One common example for the usefulness of tail-calls in C is improving performance of Protobuf parsing [blog](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), which would then also be possible in Rust. + + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +The `become` keyword can be used at the same locations as the `return` keyword, however, only a *simple* function call can take the place of the argument. That is supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match that of the calling function (a restriction that might be loosened in the future). + +TODO explain in terms of examples + +Now on to some examples. 
+ +### The difference between `return` and `become` +One essential difference to `return` is that `become` drops function **local** variables before the function call instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): +```rust +fn x() { + let a = Box::new(()); + let b = Box::new(()); + become y(a) +} +``` + +Will be desugared in the following way: +```rust +fn x() { + let a = Box::new(()); + let b = Box::new(()); + let _tmp = a; + drop(b); + become y(_tmp) +} +``` + +This early dropping allows us to avoid many complexities associated with deciding if a call can be TCO, instead the heavy lifting is done by the borrow checker and a lifetime error will be produced if references to local variables are passed to the called function. To be clear a reference to a local variable could be passed if instead of `become` the call would be done with `return y(a);` (or equivalently `y(a)`), indeed this difference between the handling of local variables is also the main difference between `return` and `become`. + + + + +TODO +```rust +fn sum_list(data: Vec, mut offset: usize, mut accum: u64) -> u64 { + if offset < data.len() { + accum += data[offset]; + offset += 1; + become sum_list(data, offset, accum) + } else { + accum + } +} +``` + +TODO as specific as possible .. +So how should a Rust programmer *think* about this feature. This feature is useful only for some specific coding styles, though it might make a function programming style more popular. In general this feature is only of interest for programmers that want to program in a more functional style than was previously possible with rust, or for programmers that want to achieve the best possible performance for + +TODO should sample error messages be provided? migration guidance? 
+ +For new Rust programmers this feature should probably be introduced late into the learning process, it is not a required feature and only useful for niche problems. So it should be taught similarly as to programmers that already know Rust. It is likely enough to provide a description of the feature, compare the differences to `return`, and give examples of possible use-cases and mistakes. + +As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. For code that does use this feature, it is required that a programmer understands the differences between `become` and `return`, it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging more difficult. That is, elided parent functions as well as their variable values are not available during debugging. (Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic guarantee of creating further stack frames. This is likely an issue that needs some investigation after creating an initial implementation.) + + +Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means: + +- Introducing new named concepts. +- Explaining the feature largely in terms of examples. +- Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible. +- If applicable, provide sample error messages, deprecation warnings, or migration guidance. +- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers. +- Discuss how this impacts the ability to read, understand, and maintain Rust code. 
Code is read and modified far more often than written; will the proposed feature make code easier to maintain? + +For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +This is the technical portion of the RFC. Explain the design in sufficient detail that: + +- Its interaction with other features is clear. +- It is reasonably clear how the feature would be implemented. +- Corner cases are dissected by example. + +The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. + +# Drawbacks +[drawbacks]: #drawbacks + +Why should we *not* do this? + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +- Why is this design the best in the space of possible designs? +- What other designs have been considered and what is the rationale for not choosing them? +- What is the impact of not doing this? +- If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain? + +# Prior art +[prior-art]: #prior-art + +Discuss prior art, both the good and the bad, in relation to this proposal. +A few examples of what this can include are: + +- For language, library, cargo, tools, and compiler proposals: Does this feature exist in other programming languages and what experience have their community had? +- For community proposals: Is this done by some other community and what were their experiences with it? 
+- For other teams: What lessons can we learn from what other communities have done here? +- Papers: Are there any published papers or great posts that discuss this? If you have some relevant papers to refer to, this can serve as a more detailed theoretical background. + +This section is intended to encourage you as an author to think about the lessons from other languages, provide readers of your RFC with a fuller picture. +If there is no prior art, that is fine - your ideas are interesting to us whether they are brand new or if it is an adaptation from other languages. + +Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC. +Please also take into consideration that rust sometimes intentionally diverges from common language features. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +- What parts of the design do you expect to resolve through the RFC process before this gets merged? +- What parts of the design do you expect to resolve through the implementation of this feature before stabilization? +- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? + +# Future possibilities +[future-possibilities]: #future-possibilities + +Think about what the natural extension and evolution of your proposal would +be and how it would affect the language and project as a whole in a holistic +way. Try to use this section as a tool to more fully consider all possible +interactions with the project and language in your proposal. +Also consider how this all fits into the roadmap for the project +and of the relevant sub-team. + +This is also a good place to "dump ideas", if they are out of scope for the +RFC you are writing but otherwise related. + +If you have tried and cannot think of any future possibilities, +you may simply state that you cannot think of anything. 
+ +Note that having something written down in the future-possibilities section +is not a reason to accept the current or a future RFC; such notes should be +in the section on motivation or rationale in this or subsequent RFCs. +The section merely provides additional information. From 59345b2f89ab9a3d3a37508b1f70184b26da5c57 Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Sun, 2 Apr 2023 21:03:17 +0200 Subject: [PATCH 02/92] [WIP] keep writing on RFC --- text/0000-guaranteed-tco.md | 114 ++++++++++++++++++++++++++++-------- 1 file changed, 90 insertions(+), 24 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index c3d61e0eb9e..dac4881ae00 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -8,6 +8,8 @@ This feature allows guaranteeing that function calls are tail-call optimized (TCO) via the `become` keyword. If this guarantee can not be provided by the compiler an error is generated instead. The check for the guarantee is done by verifying that the candidate function call follows several restrictions such as tail position and a function signature that exactly matches the calling function (it might be possible to loosen the function signature restriction in the future). +This RFC discusses a minimal version that restricts function signatures to be exactly matching the calling function. It is possible that some restrictions can be removed with more experience of the implementation and usage of this feature. Also note that the current proposed version does not support general tail call optimization, this likely requires some more changes in Rust and the backends. + # Motivation [motivation]: #motivation @@ -23,14 +25,15 @@ Some specific use cases that are supported by this feature are new ways to encod # Guide-level explanation [guide-level-explanation]: #guide-level-explanation +## Introducing new named concepts. 
The `become` keyword can be used at the same locations as the `return` keyword, however, only a *simple* function call can take the place of the argument. That is supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match that of the calling function (a restriction that might be loosened in the future). -TODO explain in terms of examples - -Now on to some examples. +## Explaining the feature largely in terms of examples. +Now on to some examples. Starting with how `return` and `become` differ, and some potential pitfalls. +TODO add usecases ### The difference between `return` and `become` -One essential difference to `return` is that `become` drops function **local** variables before the function call instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): +One essential difference to `return` is that `become` drops function local variables **before** the function call instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): ```rust fn x() { let a = Box::new(()); @@ -50,12 +53,45 @@ fn x() { } ``` -This early dropping allows us to avoid many complexities associated with deciding if a call can be TCO, instead the heavy lifting is done by the borrow checker and a lifetime error will be produced if references to local variables are passed to the called function. 
To be clear a reference to a local variable could be passed if instead of `become` the call would be done with `return y(a);` (or equivalently `y(a)`), indeed this difference between the handling of local variables is also the main difference between `return` and `become`. +This early dropping makes it possible to avoid many complexities associated with deciding if a call can be TCO; instead, the heavy lifting is done by the borrow checker, and a lifetime error will be produced if references to local variables are passed to the called function. To be clear, a reference to a local variable could be passed if the call were made with `return y(a);` (or equivalently `y(a)`) instead of `become`; indeed, this difference in the handling of local variables is the main difference between `return` and `become`. + +### Omission of the `become` keyword causes the call to be `return` instead. +([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-278988088)) + +```rust +fn foo(x: i32) -> i32 { + if x % 2 != 0 { + let x = x / 2; + // one branch uses `become` + become foo(x); + } else { + let x = x + 3; + // the other does not + foo(x) // == return foo(x); + } +} +``` + +This is a potential source of confusion; indeed, in a functional language where every call is expected to be TCO this would be quite unexpected. (Maybe in functions that use `become` a lint should be applied that enforces explicit usage of either `return` or `become`.) +### Alternating `become` and `return` calls +([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-279062656)) +```rust +fn foo(n: i32) { + // oops! we forgot become! + return bar(n); // or alternatively: `bar(n)` +} -TODO +fn bar(n: i32) { + become foo(n); +} +``` + +Here one function uses `become` and the other `return`, which is another potential source of confusion. This mutual recursion would eventually overflow the stack. 
Because mutual recursion can also span more than two functions, `become` needs to be used consistently in all of them if TCO is to be guaranteed. (It may be possible to create a lint for these cases as well.) + + -TODO as specific as possible .. -So how should a Rust programmer *think* about this feature. This feature is useful only for some specific coding styles, though it might make a function programming style more popular. In general this feature is only of interest for programmers that want to program in a more functional style than was previously possible with rust, or for programmers that want to achieve the best possible performance for -TODO should sample error messages be provided? migration guidance? +## Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible. +This feature is only useful for some specific algorithms, where it can be essential, though it might also create a push towards a more functional programming style in Rust. In general, most Rust programmers probably do not need this feature; Rust has been getting on fine without it for most applications. As a result, it affects only those few Rust programmers who require the TCO guarantee it provides. -For new Rust programmers this feature should probably be introduced late into the learning process, it is not a required feature and only useful for niche problems. So it should be taught similarly as to programmers that already know Rust. It is likely enough to provide a description of the feature, compare the differences to `return`, and give examples of possible use-cases and mistakes. -As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. 
For code that does use this feature, it is required that a programmer understands the differences between `become` and `return`, it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging more difficult. That is, elided parent functions as well as their variable values are not available during debugging. (Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic guarantee of creating further stack frames. This is likely an issue that needs some investigation after creating an initial implementation.) +## If applicable, provide sample error messages, deprecation warnings, or migration guidance. +TODO Error messages +As this is an independent new feature there should be no need for deprecation warnings. -Explain the proposal as if it was already included in the language and you were teaching it to another Rust programmer. That generally means: +Regarding migration guidance, it might be interesting to provide a lint that indicates where a trivial transformation from `return` to `become` can be done, that is, for function calls where the requirements are already fulfilled. However, this lint might be confusing and noisy without much benefit, especially if TCO is already done without `become`. -- Introducing new named concepts. -- Explaining the feature largely in terms of examples. -- Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible. -- If applicable, provide sample error messages, deprecation warnings, or migration guidance. -- If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers. -- Discuss how this impacts the ability to read, understand, and maintain Rust code. 
Code is read and modified far more often than written; will the proposed feature make code easier to maintain? -For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms. +## If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers. +For new Rust programmers this feature should probably be introduced late into the learning process, it is not a required feature and only useful for niche problems. So it should be taught similarly as to programmers that already know Rust. It is likely enough to provide a description of the feature, explain TCO, compare the differences to `return`, and give examples of possible use-cases and mistakes. + + +## Discuss how this impacts the ability to read, understand, and maintain Rust code. Code is read and modified far more often than written; will the proposed feature make code easier to maintain? +As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. For code that does use this feature, it is required that a programmer understands the differences between `become` and `return`, it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging more difficult. That is, elided parent functions as well as their variable values are not available during debugging. (Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic guarantee of creating further stack frames. 
This is likely an issue that needs some investigation after creating an initial implementation.) + # Reference-level explanation [reference-level-explanation]: #reference-level-explanation @@ -105,13 +141,43 @@ The section should return to the examples given in the previous section, and exp Why should we *not* do this? +As this feature should be mostly independent from other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. The main effort, however, lies in supporting this feature in the backends: +- LLVM supports a `musttail` marker to indicate that TCO should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature, seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). +- GCC does not support a equivalent `musttail` marker. +- WebAssembly accepted tail-calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. + +Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version could be used for a more comprehensive version. + +There is also a unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) 
+ + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives -- Why is this design the best in the space of possible designs? -- What other designs have been considered and what is the rationale for not choosing them? -- What is the impact of not doing this? -- If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain? +## Why is this design the best in the space of possible designs? +This design is the best tradeoff between implementation effort and provided functionality. Though it + +## What other designs have been considered and what is the rationale for not choosing them? + +### Loop based approach + +### Attribute on return + +### Attribute on tail-callable functions + +### Using `become` and a marker for tail-callable functions + +### Custom compiler or MIR passes + + +## What is the impact of not doing this? +- https://github.com/rust-lang/rust/issues/102952 +- Clang has support, this feature would restore this deficit parity +- + +## If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain? +While there exist libraries for a trampoline based method to avoid growing the stack, this is not enough to achieve the possible performance of real TCO, so this feature requires support by the compiler itself. 
+ # Prior art [prior-art]: #prior-art From 6d24e8889945b1cd8bfaddf7757424013366abea Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Sun, 2 Apr 2023 21:12:59 +0200 Subject: [PATCH 03/92] add some more TODOs --- text/0000-guaranteed-tco.md | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index dac4881ae00..fdbd1b97de0 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -22,7 +22,7 @@ For the second goal no guaranteed method exists, so if TCO is performed depends Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, allowing code to be written in a continuation-passing style, recursive algorithms to be guaranteed TCO, and faster interpreters. One common example for the usefulness of tail-calls in C is improving performance of Protobuf parsing [blog](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), which would then also be possible in Rust. -# Guide-level explanation +# TODO Guide-level explanation [guide-level-explanation]: #guide-level-explanation ## Introducing new named concepts. @@ -125,7 +125,7 @@ For new Rust programmers this feature should probably be introduced late into th As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. For code that does use this feature, it is required that a programmer understands the differences between `become` and `return`, it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging more difficult. That is, elided parent functions as well as their variable values are not available during debugging. 
(Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic guarantee of creating further stack frames. This is likely an issue that needs some investigation after creating an initial implementation.) -# Reference-level explanation +# TODO Reference-level explanation [reference-level-explanation]: #reference-level-explanation This is the technical portion of the RFC. Explain the design in sufficient detail that: @@ -151,11 +151,11 @@ Additionally, this proposal is limited to exactly matching function signatures w There is also a unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) -# Rationale and alternatives +# TODO Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives ## Why is this design the best in the space of possible designs? -This design is the best tradeoff between implementation effort and provided functionality. Though it +TODO This design is the best tradeoff between implementation effort and provided functionality. ## What other designs have been considered and what is the rationale for not choosing them? @@ -179,7 +179,7 @@ This design is the best tradeoff between implementation effort and provided func While there exist libraries for a trampoline based method to avoid growing the stack, this is not enough to achieve the possible performance of real TCO, so this feature requires support by the compiler itself. -# Prior art +# TODO Prior art [prior-art]: #prior-art Discuss prior art, both the good and the bad, in relation to this proposal. 
@@ -196,14 +196,14 @@ If there is no prior art, that is fine - your ideas are interesting to us whethe Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC. Please also take into consideration that rust sometimes intentionally diverges from common language features. -# Unresolved questions +# TODO Unresolved questions [unresolved-questions]: #unresolved-questions - What parts of the design do you expect to resolve through the RFC process before this gets merged? - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? -# Future possibilities +# TODO Future possibilities [future-possibilities]: #future-possibilities Think about what the natural extension and evolution of your proposal would From 72209f81c3671ba42dc0b1ee4248486b735fc6c4 Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Mon, 3 Apr 2023 17:01:20 +0200 Subject: [PATCH 04/92] rationale and alternatives --- text/0000-guaranteed-tco.md | 39 ++++++++++++++++++++++++++++++++----- 1 file changed, 34 insertions(+), 5 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index fdbd1b97de0..4aa5c8dbb57 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -155,19 +155,45 @@ There is also a unwanted interaction between TCO and debugging. As TCO by design [rationale-and-alternatives]: #rationale-and-alternatives ## Why is this design the best in the space of possible designs? -TODO This design is the best tradeoff between implementation effort and provided functionality. +This design is the best tradeoff between implementation effort and provided functionality, while also offering a good starting point towards exploration of a more general implementation. 
To expand on this, compared to other options creating a function local scope with the use of `become` greatly reduces implementation effort. Additionally, limiting tail-callable functions to those with an exactly matching function signature enforces a common stack layout across all functions. This should in theory, depending on the backend, allow tail-calls to be performed without any stack shuffling, indeed it might even be possible to do so for indirect calls or external functions. ## What other designs have been considered and what is the rationale for not choosing them? +There are some designs that can not achieve the same performance or functionality as the chosen approach. Most other designs, though, revolve around how to mark what should be a tail-call or how to mark which functions can be tail-called. There is also the possibility of providing support for a custom backend (e.g. LLVM) or MIR pass. -### Loop based approach +There might also be some variation on the current design, which can be explored after the chosen design has been implemented, see [unresolved questions](#unresolved-questions) for some possibilities. -### Attribute on return +### Trampoline based Approach +There could be a trampoline based approach ([comment](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-326952763)) that can fulfill the semantic guarantee of using constant stack space, though it can not achieve the performance that the chosen design is capable of. Additionally, functions need to be known at compile time for this approach to work. -### Attribute on tail-callable functions +### Principled Local Goto +One alternative would be to support some kind of local goto natively, indeed there exists a +[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). 
This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seems to be as flexible as the chosen design (regarding indirect calls / external functions). -### Using `become` and a marker for tail-callable functions +### Attribute on tail-callable Functions +One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). + +The goal behind this design is to TCO functions other than exactly matching function signatures, in theory this just requires that tail-called functions are callee cleanup, which is a mismatch to the default calling convention used by Rust. To limit the impact of this change all functions that should be TCO-able should be marked with a attribute. + +While quite noisy it is also less flexible than the chosen approach. Indeed TCO is a property of the call and not a function, sometimes a call should be guaranteed to be TCO and sometimes not, marking a function would be less flexible. + +### Attribute on Return +One alternative could be to use a attribute instead of the `become` keyword for function calls. To my knowledge this would be the first time a attribute would be allowed for a call. Example: + +```rust +fn a() { + become b(); + // or + #[become] + return b(); +} +``` + +This alternative mostly comes to taste (or bikeshedding) and `become` was chosen as it is shorter to write. ### Custom compiler or MIR passes +One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally. 
While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). + +This would be a error prone and unergonomic approach to solving this problem. ## What is the impact of not doing this? @@ -199,6 +225,9 @@ Please also take into consideration that rust sometimes intentionally diverges f # TODO Unresolved questions [unresolved-questions]: #unresolved-questions +- should the performance be guaranteed? that is turning a `call` is transformed into a `jmp` +- how general do signatures for tail-callable functions need to be? would it be enough to create some padding arguments to allow "general" tail-calls across functions with same sized arguments, maybe only the sum of argument sizes need to match? + - What parts of the design do you expect to resolve through the RFC process before this gets merged? - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? From 4beeedd780fe6acfd120c78d0ac073a198a305e1 Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Mon, 3 Apr 2023 17:25:53 +0200 Subject: [PATCH 05/92] finish up rationale and alternatives --- text/0000-guaranteed-tco.md | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index 4aa5c8dbb57..b341128d73a 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -141,17 +141,17 @@ The section should return to the examples given in the previous section, and exp Why should we *not* do this? 
-As this feature should be mostly independent from other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. The main effort, however, lies in supporting this feature in the backends: +As this feature should be mostly independent from other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. The primary effort, however, lies in supporting this feature in the backends: - LLVM supports a `musttail` marker to indicate that TCO should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature, seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). - GCC does not support a equivalent `musttail` marker. - WebAssembly accepted tail-calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. -Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version could be used for a more comprehensive version. +Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version is likely to be useful for a more comprehensive version. There is also a unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. 
As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) -# TODO Rationale and alternatives +# Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives ## Why is this design the best in the space of possible designs? @@ -197,9 +197,12 @@ This would be a error prone and unergonomic approach to solving this problem. ## What is the impact of not doing this? -- https://github.com/rust-lang/rust/issues/102952 -- Clang has support, this feature would restore this deficit parity -- +One goal of Rust is to ([source](https://blog.rust-lang.org/inside-rust/2022/04/04/lang-roadmap-2024.html)): +> Rust's goal is to empower everyone to build reliable and efficient software. +This feature provides a crucial optimization for some low level code. It seems that without this feature there is a [big incentive](https://github.com/rust-lang/rust/issues/102952) to use other system level languages that can perform TCO. + +Additionally, this feature enables recursive algorithms that require TCO, which would provide better support for functional programming in Rust. + ## If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain? While there exist libraries for a trampoline based method to avoid growing the stack, this is not enough to achieve the possible performance of real TCO, so this feature requires support by the compiler itself. 
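The trampoline-based library approach mentioned above can be sketched roughly as follows. This is a minimal illustration and not the API of any particular crate; the names `Step`, `run`, and `count_down` are hypothetical. Each logical tail call is returned to a driver loop as a boxed closure instead of being called directly, which keeps the stack depth constant but costs an allocation and an indirect call per step.

```rust
// Hypothetical trampoline sketch; `Step`, `run`, and `count_down` are
// illustrative names and not part of the RFC or of any particular crate.

/// One step of a trampolined computation: either a final value or a
/// deferred call that the driver loop should invoke next.
enum Step<T> {
    Done(T),
    Call(Box<dyn FnOnce() -> Step<T>>),
}

/// Driver loop: invokes deferred calls one at a time, so the stack depth
/// stays constant no matter how many logical "tail calls" are made.
fn run<T>(mut step: Step<T>) -> T {
    loop {
        match step {
            Step::Done(value) => return value,
            Step::Call(next) => step = next(),
        }
    }
}

/// Counts down to zero while accumulating the number of steps taken.
/// Instead of calling itself, it hands the next call back to the trampoline.
fn count_down(n: u64, acc: u64) -> Step<u64> {
    if n == 0 {
        Step::Done(acc)
    } else {
        Step::Call(Box::new(move || count_down(n - 1, acc + 1)))
    }
}

fn main() {
    // One million logical tail calls without growing the stack.
    let steps = run(count_down(1_000_000, 0));
    println!("{steps}");
}
```

The per-step heap allocation and dynamic dispatch in this sketch are the kind of overhead that a guaranteed tail call is meant to avoid.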
From 60c0242260380fcbbe0976652e1495a2e9c07270 Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Mon, 3 Apr 2023 17:28:56 +0200 Subject: [PATCH 06/92] fix some formatting issues --- text/0000-guaranteed-tco.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index b341128d73a..8bd01d09e55 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -176,7 +176,7 @@ The goal behind this design is to TCO functions other than exactly matching func While quite noisy it is also less flexible than the chosen approach. Indeed TCO is a property of the call and not a function, sometimes a call should be guaranteed to be TCO and sometimes not, marking a function would be less flexible. -### Attribute on Return +### Attribute on `return` One alternative could be to use a attribute instead of the `become` keyword for function calls. To my knowledge this would be the first time a attribute would be allowed for a call. Example: ```rust @@ -199,6 +199,7 @@ This would be a error prone and unergonomic approach to solving this problem. ## What is the impact of not doing this? One goal of Rust is to ([source](https://blog.rust-lang.org/inside-rust/2022/04/04/lang-roadmap-2024.html)): > Rust's goal is to empower everyone to build reliable and efficient software. + This feature provides a crucial optimization for some low level code. It seems that without this feature there is a [big incentive](https://github.com/rust-lang/rust/issues/102952) to use other system level languages that can perform TCO. Additionally, this feature enables recursive algorithms that require TCO, which would provide better support for functional programming in Rust. 
From 7fecbad1af39594708c0683e4d06d840666d0b9c Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Tue, 4 Apr 2023 20:51:47 +0200 Subject: [PATCH 07/92] prior art add links --- text/0000-guaranteed-tco.md | 97 +++++++++++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index 8bd01d09e55..2ec3b1fb8ce 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -75,6 +75,9 @@ fn foo(x: i32) -> i32 { This is a potential source of confusion, indeed in a function language where every call is expected to be TCO this would be quite unexpected. (Maybe in functions that use `become` a lint should be applied that enforces usage of either `return` or `become`.) +### TODO none returns + + ### Alternating `become` and `return` calls ([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-279062656)) @@ -212,6 +215,8 @@ While there exist libraries for a trampoline based method to avoid growing the s # TODO Prior art [prior-art]: #prior-art +TODO remove +---- Discuss prior art, both the good and the bad, in relation to this proposal. A few examples of what this can include are: @@ -225,6 +230,98 @@ If there is no prior art, that is fine - your ideas are interesting to us whethe Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC. Please also take into consideration that rust sometimes intentionally diverges from common language features. +---- + + +## Clang +Clang, as of April 2021, does offer support for a musttail attribute on `return` statements in both C and C++. This functionality is enabled by the support in LLVM, which should also be the first backend an initial implementation in Rust. 
+ +It seems this feature is received with "excitement" by those that can make use of it, a popular example is its usage to improve [Protobuf parsing speed](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html). However, one issue is that it is not very portable and there still seem to be some problem with it's [implementation](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983). + + +For a more detailed description see this excerpt from the description of the [implementation](https://reviews.llvm.org/rG834467590842): + +> Guaranteed tail calls are now supported with statement attributes +> ``[[clang::musttail]]`` in C++ and ``__attribute__((musttail))`` in C. The +> attribute is applied to a return statement (not a function declaration), +> and an error is emitted if a tail call cannot be guaranteed, for example if +> the function signatures of caller and callee are not compatible. Guaranteed +> tail calls enable a class of algorithms that would otherwise use an +> arbitrary amount of stack space. +> +> If a ``return`` statement is marked ``musttail``, this indicates that the +> compiler must generate a tail call for the program to be correct, even when +> optimizations are disabled. This guarantees that the call will not cause +> unbounded stack growth if it is part of a recursive cycle in the call graph. +> +> If the callee is a virtual function that is implemented by a thunk, there is +> no guarantee in general that the thunk tail-calls the implementation of the +> virtual function, so such a call in a recursive cycle can still result in +> unbounded stack growth. +> +> ``clang::musttail`` can only be applied to a ``return`` statement whose value +> is the result of a function call (even functions returning void must use +> ``return``, although no value is returned). The target function must have the +> same number of arguments as the caller. 
The types of the return value and all +> arguments must be similar according to C++ rules (differing only in cv +> qualifiers or array size), including the implicit "this" argument, if any. +> Any variables in scope, including all arguments to the function and the +> return value must be trivially destructible. The calling convention of the +> caller and callee must match, and they must not be variadic functions or have +> old style K&R C function declarations. + +There is also a proposal (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920.pdf) for the C Standard (https://www.open-std.org/JTC1/SC22/WG14/), outlining some limitations for Clang. +> Clang requires the argument types, argument number, and return type to be the same between the +> caller and the callee, as well as out-of-scope considerations such as C++ features and the calling +> convention. Implementor experience with Clang shows that the ABI of the caller and callee must be +> identical for the feature to work; otherwise, replacement may be impossible for some targets and +> conventions (replacing a differing argument list is non-trivial on some platforms). + + +## GCC +GCC does not support a feature equivalent to Clang's `musttail`, there also does not seem to be push to implement it ([pipermail](https://gcc.gnu.org/pipermail/gcc/2021-April/235882.html)). However, there also exists a experimental [plugin](https://github.com/pietro/gcc-musttail-plugin) for gcc last updated in 2021. + + +## dotnet +[Pull Request](https://github.com/dotnet/runtime/pull/341) ([Issue](https://github.com/dotnet/runtime/issues/2191)) +> This implements tailcall-via-help support for all platforms supported by +> the runtime. In this new mechanism the JIT asks the runtime for help +> whenever it realizes it will need a helper to perform a tailcall, i.e. +> when it sees an explicit tail. prefixed call that it cannot make into a +> fast jump-based tailcall. 
+ + +## Zig +Zig provides separate syntax to allow more flexibility than normal function calls. There are options for async calls, inlining, compile time evaluation of the called function, and to enforce TCO on the call. +([source](https://ziglang.org/documentation/master/#call)) +```zig +const expect = @import("std").testing.expect; + +test "noinline function call" { + try expect(@call(.auto, add, .{3, 9}) == 12); +} + +fn add(a: i32, b: i32) i32 { + return a + b; +} +``` + +(TODO what is the communities reception of this feature?) + + +## JS +https://github.com/rust-lang/rfcs/pull/1888#issuecomment-368204577 (Feb, 2018) +> Technically the ES6 spec mandates tail-calls, but the situation in reality is more complicated than that. +> +> The only browser that actually supports tail calls is Safari (and Webkit). And the Edge team has said that it's unlikely that they will implement tail calls (for similar reasons as Rust: they currently use the Windows ABI calling convention, which doesn't work well with tail calls). +> +> Therefore, tail calls in JS is a very controversial thing, even to this day +> +> Just to be clear, the Edge team is against implicit tail-calls for all functions, but they're in favor of tail-calls-with-an-explicit-keyword (similar to this RFC). 
+ + +A unofficial summary of the ECMA Script/ Javascript proposal for tail call/return +https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-1198672079 (Jul, 2022) # TODO Unresolved questions [unresolved-questions]: #unresolved-questions From ff899fd39341459483e5a1f0dde2ac3fafadb56d Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Wed, 5 Apr 2023 10:56:53 +0200 Subject: [PATCH 08/92] finish up prior-art --- text/0000-guaranteed-tco.md | 44 +++++++++++++++++++++---------------- 1 file changed, 25 insertions(+), 19 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index 2ec3b1fb8ce..a24b1705cbb 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -212,11 +212,9 @@ Additionally, this feature enables recursive algorithms that require TCO, which While there exist libraries for a trampoline based method to avoid growing the stack, this is not enough to achieve the possible performance of real TCO, so this feature requires support by the compiler itself. -# TODO Prior art +# Prior art [prior-art]: #prior-art - -TODO remove ----- + +Functional languages usually depend on proper tail calls as a language feature, which requires guaranteed TCO. For system level languages the topic of guaranteed TCO is usually wanted but implementation effort is the common reason this is not yet done. Even languages with managed code such as .Net or ECMAScript (though implementation is lagging behind) also support guaranteed TCO, again performance and resource usage were the main motivators for their implementation. + +See below for a more detailed description for compilers and languages. + ## Clang Clang, as of April 2021, does offer support for a musttail attribute on `return` statements in both C and C++. This functionality is enabled by the support in LLVM, which should also be the first backend an initial implementation in Rust. 
-It seems this feature is received with "excitement" by those that can make use of it, a popular example is its usage to improve [Protobuf parsing speed](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html). However, one issue is that it is not very portable and there still seem to be some problem with it's [implementation](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983). +It seems this feature is received with "excitement" by those that can make use of it, a popular example of its usage is to improve [Protobuf parsing speed](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html). However, one issue is that it is not very portable and there still seem to be some problem with it's [implementation](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983). -For a more detailed description see this excerpt from the description of the [implementation](https://reviews.llvm.org/rG834467590842): +For a more detailed description see this excerpt from the description of the feature, taken from the [implementation](https://reviews.llvm.org/rG834467590842): > Guaranteed tail calls are now supported with statement attributes > ``[[clang::musttail]]`` in C++ and ``__attribute__((musttail))`` in C. The @@ -279,16 +281,7 @@ There is also a proposal (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920 ## GCC -GCC does not support a feature equivalent to Clang's `musttail`, there also does not seem to be push to implement it ([pipermail](https://gcc.gnu.org/pipermail/gcc/2021-April/235882.html)). However, there also exists a experimental [plugin](https://github.com/pietro/gcc-musttail-plugin) for gcc last updated in 2021. - - -## dotnet -[Pull Request](https://github.com/dotnet/runtime/pull/341) ([Issue](https://github.com/dotnet/runtime/issues/2191)) -> This implements tailcall-via-help support for all platforms supported by -> the runtime. 
In this new mechanism the JIT asks the runtime for help -> whenever it realizes it will need a helper to perform a tailcall, i.e. -> when it sees an explicit tail. prefixed call that it cannot make into a -> fast jump-based tailcall. +GCC does not support a feature equivalent to Clang's `musttail`, there also does not seem to be push to implement it ([pipermail](https://gcc.gnu.org/pipermail/gcc/2021-April/235882.html)) (as of 2021). However, there also exists a experimental [plugin](https://github.com/pietro/gcc-musttail-plugin) for GCC last updated in 2021. ## Zig @@ -306,10 +299,23 @@ fn add(a: i32, b: i32) i32 { } ``` -(TODO what is the communities reception of this feature?) +(TODO: What is the community sentiment regarding this feature? Except for some bug reports I did not find anything.) + +## Carbon +As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCO is of interest even if the implementation is difficult + + +## .Net +The .Net JIT does support TCO as of 2020, a main motivator for this feature was improving performance. +[Pull Request](https://github.com/dotnet/runtime/pull/341) ([Issue](https://github.com/dotnet/runtime/issues/2191)) +> This implements tailcall-via-help support for all platforms supported by +> the runtime. In this new mechanism the JIT asks the runtime for help +> whenever it realizes it will need a helper to perform a tailcall, i.e. +> when it sees an explicit tail. prefixed call that it cannot make into a +> fast jump-based tailcall. -## JS +## ECMA Script / JS https://github.com/rust-lang/rfcs/pull/1888#issuecomment-368204577 (Feb, 2018) > Technically the ES6 spec mandates tail-calls, but the situation in reality is more complicated than that. 
> From 98dfe995d2d667a91c6e25ea63363c841eb2ec0a Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Wed, 5 Apr 2023 15:41:31 +0200 Subject: [PATCH 09/92] add reference-level explanation --- text/0000-guaranteed-tco.md | 122 ++++++++++++++++++++++++++++++------ 1 file changed, 104 insertions(+), 18 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index a24b1705cbb..bdea230b6b7 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -26,7 +26,7 @@ Some specific use cases that are supported by this feature are new ways to encod [guide-level-explanation]: #guide-level-explanation ## Introducing new named concepts. -The `become` keyword can be used at the same locations as the `return` keyword, however, only a *simple* function call can take the place of the argument. That is supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match that of the calling function (a restriction that might be loosened in the future). +The `become` keyword can be used at the same locations as the `return` keyword, however, only a plain function or method call can take the place of the argument. That is supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match that of the calling function (a restriction that might be loosened in the future). 
## Explaining the feature largely in terms of examples. Now on to some examples. Starting with how `return` and `become` differ, and some potential pitfalls. @@ -128,22 +128,103 @@ For new Rust programmers this feature should probably be introduced late into th As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. For code that does use this feature, it is required that a programmer understands the differences between `become` and `return`, it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging more difficult. That is, elided parent functions as well as their variable values are not available during debugging. (Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic guarantee of creating further stack frames. This is likely an issue that needs some investigation after creating an initial implementation.) -# TODO Reference-level explanation +# Reference-level explanation [reference-level-explanation]: #reference-level-explanation - -This is the technical portion of the RFC. Explain the design in sufficient detail that: + +This explanation is mostly based on a [previous RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md#detailed-design) though is more restricted as the current RFC does not target general tail calls anymore. -# Drawbacks -[drawbacks]: #drawbacks +The goal of this RFC is to create a first implementation that is already useful, while providing a basis to explore possible ways to relax the requirements when TCO can be guaranteed. + +## Syntax +[syntax]: #syntax + +A guaranteed TCO is indicated by using the `become` keyword in place of `return`. The `become` keyword is already +reserved, so there is no backwards-compatibility break. 
The `become` keyword must be followed by a plain function call +or method call, that is, supported are calls like: `become foo()`, `become foo(a)`, `become foo(a, b)`, and so on, or +`become foo.bar()` with plain arguments. Neither the function call nor any of its arguments can be part of a larger expression +such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())`. Additionally, there is a further restriction on +the tail-callable functions: the function signature must exactly match that of the calling function. + +Invocations of overloaded operators with at least one non-primitive argument were considered as valid targets, but were +rejected on grounds of being too error-prone. In any case, these can still be called as methods. + +## Type checking +[typechecking]: #typechecking +A `become` statement is type-checked like a `return` statement, with the added restriction that the function +signatures of caller and callee must match exactly. Additionally, the caller and callee **must** use the same calling +convention. -Why should we *not* do this? +## Borrowchecking and Runtime Semantics +[semantics]: #semantics +A `become` expression acts as if the following events occurred in order: +1. All variables that are being passed by-value are moved to temporary storage. +2. All local variables in the caller are destroyed according to usual Rust semantics. Destructors are called where + necessary. Note that values moved from in step 1 are _not_ dropped. +3. The caller's stack frame is removed from the stack. +4. Control is transferred to the callee's entry point. + +This implies that it is invalid for any references into the caller's stack frame to outlive the call. The borrow checker ensures that none of the above steps will result in the use of a value that has gone out of scope. + +As `become` is always in a tail position (due to being used in place of `return`), this requirement for TCO is already +fulfilled. 
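The ordering of steps 1 and 2 above can be observed on stable Rust by emulating it with an ordinary call. The following is only a sketch under that assumption: the type `Noisy`, the helper functions `record` and `events`, and the bodies of `x` and `y` are hypothetical, and the frame removal and jump of steps 3 and 4 are not modelled, since a plain `return` keeps the frame of `x` alive.

```rust
use std::cell::RefCell;

thread_local! {
    // Records events in the order they happen so the sequence can be inspected.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn record(event: &str) {
    LOG.with(|log| log.borrow_mut().push(event.to_string()));
}

fn events() -> Vec<String> {
    LOG.with(|log| log.borrow().clone())
}

/// Hypothetical helper type that logs when it is dropped.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        record(&format!("drop {}", self.0));
    }
}

/// Stand-in for the callee; it takes ownership of its argument.
fn y(arg: Noisy) {
    record(&format!("enter y with {}", arg.0));
    // `arg` is dropped here, when the callee returns.
}

/// Emulates `become y(a)` with an ordinary call.
fn x() {
    let a = Noisy("a");
    let b = Noisy("b");
    let tmp = a; // step 1: the by-value argument is moved to temporary storage
    drop(b); // step 2: the remaining locals are destroyed before the call
    // Steps 3 and 4 (frame removal and jump) cannot be expressed here.
    return y(tmp);
}

fn main() {
    x();
    for event in events() {
        println!("{event}");
    }
}
```

With a plain `return y(a)` and no explicit `drop`, `b` would instead be dropped only after `y` has returned, which is the observable difference the steps above describe.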
+ +Example: +```rust +fn x() { + let a = Box::new(()); + let b = Box::new(()); + become y(a) +} +``` + +Will be desugared in the following way: +```rust +fn x() { + let a = Box::new(()); + let b = Box::new(()); + let _tmp = a; + drop(b); + become y(_tmp) +} +``` + +## Implementation +[implementation]: #implementation + +A now six years old implementation for the earlier mentioned +[RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md) can be found at +[DemiMarie/rust/tree/explicit-tailcalls](https://github.com/DemiMarie/rust/tree/explicit-tailcalls). A current implementation is planned as part of this RFC. + +The parser parses `become` exactly how it parses the `return` keyword. The difference in semantics is handled later. + +During type checking, the following are checked: + +1. The target of the tail call is, in fact, a simple call. +2. The target of the tail call has the proper ABI. + +Later phases in the compiler assert that these requirements are met. + +New nodes are added in HIR and THIR to correspond to `become`. In MIR, the function call is checked that: +1. The returned value is directly returned. +2. There are no cleanups. +3. The basic block being branched into has length zero. +4. The basic block being branched into terminates with a return. + +If these conditions are fulfilled the function call following `become` is flagged to indicate the TCO requirement. This flag is then propagated to the corresponding backend. In the backend, there is an additional check if TCO can be performed. + +Should any check during compilation not pass a compiler error should be issued. + + +# TODO Drawbacks +[drawbacks]: #drawbacks + As this feature should be mostly independent from other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. 
The primary effort, however, lies in supporting this feature in the backends: - LLVM supports a `musttail` marker to indicate that TCO should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature, seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). - GCC does not support a equivalent `musttail` marker. @@ -153,6 +234,8 @@ Additionally, this proposal is limited to exactly matching function signatures w There is also a unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) +TODO portability + # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -172,7 +255,7 @@ There could be a trampoline based approach ([comment](https://github.com/rust-la One alternative would be to support some kind of local goto natively, indeed there exists a [pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seems to be as flexible as the chosen design (regarding indirect calls / external functions). 
-### Attribute on tail-callable Functions +### Attribute on Function Declaration One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). The goal behind this design is to TCO functions other than exactly matching function signatures, in theory this just requires that tail-called functions are callee cleanup, which is a mismatch to the default calling convention used by Rust. To limit the impact of this change all functions that should be TCO-able should be marked with a attribute. @@ -229,12 +312,11 @@ If there is no prior art, that is fine - your ideas are interesting to us whethe Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC. Please also take into consideration that rust sometimes intentionally diverges from common language features. --> -Functional languages usually depend on proper tail calls as a language feature, which requires guaranteed TCO. For system level languages the topic of guaranteed TCO is usually wanted but implementation effort is the common reason this is not yet done. Even languages with managed code such as .Net or ECMAScript (though implementation is lagging behind) also support guaranteed TCO, again performance and resource usage were the main motivators for their implementation. +Functional languages (such as OCaml, SML, Haskell, Scheme, and F#) usually depend on proper tail calls as a language feature, which requires guaranteed TCO. For system level languages the guaranteed TCO is usually wanted but implementation effort is a common reason this is not yet done. 
Even languages with managed code such as .Net or ECMAScript (though implementation is lagging behind) also support guaranteed TCO, again performance and resource usage were the main motivators for their implementation. See below for a more detailed description for compilers and languages. - ## Clang Clang, as of April 2021, does offer support for a musttail attribute on `return` statements in both C and C++. This functionality is enabled by the support in LLVM, which should also be the first backend an initial implementation in Rust. @@ -331,18 +413,22 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 # TODO Unresolved questions [unresolved-questions]: #unresolved-questions + - should the performance be guaranteed? that is turning a `call` is transformed into a `jmp` - how general do signatures for tail-callable functions need to be? would it be enough to create some padding arguments to allow "general" tail-calls across functions with same sized arguments, maybe only the sum of argument sizes need to match? - -- What parts of the design do you expect to resolve through the RFC process before this gets merged? -- What parts of the design do you expect to resolve through the implementation of this feature before stabilization? -- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? +- method calls? +- generics? +- async? +- closures? 
+- debugging +- calling-convention # TODO Future possibilities [future-possibilities]: #future-possibilities - -Think about what the natural extension and evolution of your proposal would + From 6233cc33a01ca39bd06c20018f2d86776c6558bc Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Wed, 5 Apr 2023 16:37:21 +0200 Subject: [PATCH 10/92] add unresolved questions and future possibilities --- text/0000-guaranteed-tco.md | 56 +++++++++++++++++++++++++++---------- 1 file changed, 41 insertions(+), 15 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index bdea230b6b7..cf8a63d0293 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -113,7 +113,7 @@ This feature is only useful for some specific algorithms, where it can be essent ## If applicable, provide sample error messages, deprecation warnings, or migration guidance. -TODO Error messages +(TODO Error messages once an initial implementation exists) As this is a independent new feature there should be no need for deprecation warnings. @@ -222,7 +222,7 @@ If these conditions are fulfilled the function call following `become` is flagge Should any check during compilation not pass a compiler error should be issued. -# TODO Drawbacks +# Drawbacks [drawbacks]: #drawbacks As this feature should be mostly independent from other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. The primary effort, however, lies in supporting this feature in the backends: @@ -234,8 +234,6 @@ Additionally, this proposal is limited to exactly matching function signatures w There is also a unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. 
As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) -TODO portability - # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -411,22 +409,29 @@ https://github.com/rust-lang/rfcs/pull/1888#issuecomment-368204577 (Feb, 2018) A unofficial summary of the ECMA Script/ Javascript proposal for tail call/return https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-1198672079 (Jul, 2022) -# TODO Unresolved questions +# Unresolved questions [unresolved-questions]: #unresolved-questions - -- should the performance be guaranteed? that is turning a `call` is transformed into a `jmp` -- how general do signatures for tail-callable functions need to be? would it be enough to create some padding arguments to allow "general" tail-calls across functions with same sized arguments, maybe only the sum of argument sizes need to match? -- method calls? -- generics? -- async? -- closures? -- debugging -- calling-convention +- What parts of the design do you expect to resolve through the RFC process before this gets merged? + - The main uncertainties are regarding the exact restrictions on when backends can guarantee TCO, this RFC is intentionally strict to try and require as little as possible from the backends. + - One point that needs to be decided is if TCO should be a feature that needs to be required from all backends or if it can be optional. + - Another point that needs to be decided is, if TCO is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? 
Note that a backend that guarantees performance should do so **always**, otherwise the main intent of this RFC seems to be lost. +- What parts of the design do you expect to resolve through the implementation of this feature before stabilization? + - Are all calling conventions used by Rust available for TCO with the proposed restrictions on function signatures? + - Can the restrictions on function signatures be relaxed? + - Can generic functions be supported? + - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) + - Can closures be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) + - Can dynamic function calls be supported? + - Can functions outside the current crate be supported, functions from dynamically loaded libraries? + - Is there some way to reduce the impact on debugging? + -# TODO Future possibilities +# Future possibilities [future-possibilities]: #future-possibilities +## Helpers +It seems possible to keep the restriction on exactly matching function signatures by offering some kind of placeholder arguments to pad out the differences. For example: +```rust +fn foo(a: u32, b: u32) { + // ... +} + +fn bar(a: u32, _b: u32) { + // ... +} +``` +Maybe it is useful to provide a macro or attribute that inserts missing arguments. +```rust +#[pad_args(foo)] +fn bar(a: u32) { + // ... +} +``` + +## Functional Programming +It might be possible to allow even more functional programming paradigms based on TCO, the examples in this RFC still seem quite far from typical functional programming.
\ No newline at end of file From dde8305a25efc4c2c13b8f0c6c9de9a8a447f928 Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Thu, 6 Apr 2023 14:20:02 +0200 Subject: [PATCH 11/92] final pass --- text/0000-guaranteed-tco.md | 293 ++++++++++++++++++++++++++---------- 1 file changed, 214 insertions(+), 79 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index cf8a63d0293..b22537e24c4 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -6,39 +6,80 @@ # Summary [summary]: #summary -This feature allows guaranteeing that function calls are tail-call optimized (TCO) via the `become` keyword. If this guarantee can not be provided by the compiler an error is generated instead. The check for the guarantee is done by verifying that the candidate function call follows several restrictions such as tail position and a function signature that exactly matches the calling function (it might be possible to loosen the function signature restriction in the future). - -This RFC discusses a minimal version that restricts function signatures to be exactly matching the calling function. It is possible that some restrictions can be removed with more experience of the implementation and usage of this feature. Also note that the current proposed version does not support general tail call optimization, this likely requires some more changes in Rust and the backends. +This feature provides a guarantee that function calls are tail-call optimized via the `become` keyword. If this +guarantee can not be provided by the compiler an error is generated instead. # Motivation [motivation]: #motivation -While opportunistic TCO is already supported there currently is no way to natively guarantee TCO. This optimization is interesting for two general goals. 
One goal is to do function calls without adding a new stack frame to the stack, this mainly has semantic implications as for example recursive algorithms can overflow the stack without this optimization. The other goal is to, in simple words, replace `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. +While opportunistic tail-call optimization (TCO) is already supported there currently is no way to guarantee TCO. This +guarantee is interesting for two general goals. One goal is to do function calls without growing the stack, this mainly +has semantic implications as recursive algorithms can overflow the stack without this optimization. The other goal is +to, in simple words, replace `call` instructions by `jmp` instructions, this optimization has performance implications +and can provide massive speedups for algorithms that have a high density of function calls. -Note that workarounds for the first goal exist by using so called trampolining which limits the stack depth. However, while this functionality is provided by several crates, a inclusion in the language can provide greater adoption of a more functional programming style. +Note that workarounds for the first goal exist by using trampolining which limits the stack depth. However, while this +functionality can be provided as a library, inclusion in the language can provide greater adoption of a more functional +programming style. -For the second goal no guaranteed method exists, so if TCO is performed depends on the specific structure of the code and the compiler version. This can result in TCO no longer being performed if non-semantic changes to the code are done or the compiler version changes. +For the second goal no guaranteed method exists. The decision if TCO is performed depends on the specific code and the +compiler version. 
This can result in TCO surprisingly no longer being performed due to small changes to the code or a +change of the compiler version, see this [issue](https://github.com/rust-lang/rust/issues/102952) for an example. -Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, allowing code to be written in a continuation-passing style, recursive algorithms to be guaranteed TCO, and faster interpreters. One common example for the usefulness of tail-calls in C is improving performance of Protobuf parsing [blog](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), which would then also be possible in Rust. +Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, +allowing code to be written in a continuation-passing style, recursive algorithms to be guaranteed TCO, or guaranteeing +significantly faster interpreters / emulators. One common example of the usefulness of tail calls in C is improving +performance of Protobuf parsing as described in this [blog post](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), this approach would then also be possible in Rust. # TODO Guide-level explanation [guide-level-explanation]: #guide-level-explanation - -## Introducing new named concepts. -The `become` keyword can be used at the same locations as the `return` keyword, however, only a plain function or method call can take the place of the argument. That is supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). 
Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match that of the calling function (a restriction that might be loosened in the future). - -## Explaining the feature largely in terms of examples. -Now on to some examples. Starting with how `return` and `become` differ, and some potential pitfalls. -TODO add usecases + +Pretending this RFC has already been accepted into Rust, it could be explained to another Rust programmer as follows. + +## Introducing new named concepts. +Rust now supports a way to guarantee tail call optimization (TCO), this is interesting for two groups of programmers +those that want to use recursive algorithms and those that want to create highly optimized code. Note that using this +feature can have some difficulties as there are several requirements on functions where TCO can be performed. + +TCO provides a way to call functions without creating a new stack frame, instead, the stack frame of the calling +function is reused. This is only possible if the functions have a similar enough stack layout in the first place, this +layout is based on the calling convention, and arguments as well as return types (the function signature in short). +Currently, all of these need to match exactly otherwise an error will be thrown during compilation. + +Reusing the stack frame has two effects: One is that the stack will no longer grow, allowing unlimited nested function +calls, if all are TCO'ed. The other is that creating a new stack frame is actually quite expensive, especially for code +with a high density of function calls, so reusing the stack frame can lead to massive performance improvements. + +To guarantee TCO the `become` keyword can be used instead of the `return` keyword (and only there). However, only a +"plain" function or method call can take the place of the argument. 
That is supported are calls such as `become foo()`, +`become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger +expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). +Additionally, as already said the function signature must exactly match that of the calling function (a restriction that +might also be loosened a bit in the future). + +## Examples +Now on to some examples. Starting with how `return` and `become` differ, two example use cases, and some potential +pitfalls. ### The difference between `return` and `become` -One essential difference to `return` is that `become` drops function local variables **before** the function call instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): +The essential difference to `return` is that `become` drops function local variables **before** the function call +instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): ```rust fn x() { let a = Box::new(()); let b = Box::new(()); - become y(a) + become y(a); } ``` @@ -49,15 +90,79 @@ fn x() { let b = Box::new(()); let _tmp = a; drop(b); - become y(_tmp) + become y(_tmp); } ``` -This early dropping allows to avoid many complexities associated with deciding if a call can be TCO, instead the heavy lifting is done by the borrow checker and a lifetime error will be produced if references to local variables are passed to the called function. To be clear a reference to a local variable could be passed if instead of `become` the call would be done with `return y(a);` (or equivalently `y(a)`), indeed this difference between the handling of local variables is also the main difference between `return` and `become`. 
+ +This early dropping allows the compiler to avoid many complexities associated with deciding if a call can be TCO, +instead the heavy lifting is done by the borrow checker and a lifetime error will be produced if references to local +variables are passed to the called function. To be clear a reference to a local variable could be passed if instead of +`become` the call would be done with `return y(a);` (or equivalently `y(a)`), indeed this difference between the +handling of local variables is also the main difference between `return` and `become`. + +### Use Case 1: Recursive Algorithm +As a possible use case let us take a look at creating the sum over a `Vec<u64>`. Admittedly an unusual example for Rust as +this is usually done with iteration. Though, this is kind of the point, without TCO this example can overflow the stack. + +```rust +fn sum_list(data: Vec<u64>, mut offset: usize, mut accum: u64) -> u64 { + if offset < data.len() { + accum += data[offset]; + offset += 1; + become sum_list(data, offset, accum); // <- become here + } else { + // Note that this would be a `return accum;` + accum + } +} +``` + + +### Use Case 2: Interpreter +In an interpreter the usual loop is to get an instruction, match on that instruction to find the corresponding function, **call** that function, and finally return to the loop to get the next instruction. (This is a simplified example.) + +```rust +fn exec_instruction(mut self) { + loop { + let next_instruction = self.read_instr(); // this call can be inlined + match next_instruction { + Instruction::Foo => self.execute_instruction_foo(), + Instruction::Bar => self.execute_instruction_bar(), + } + } +} +``` + +This example can be turned into the following code, which no longer does any calls and instead just uses jump instructions. (Note that this example might not be the optimal way to use `become`.) + +```rust +fn execute_instruction_foo(mut self) { + // foo things ...
+ + become self.next_instruction(); +} + +fn execute_instruction_bar(mut self) { + // bar things ... + + become self.next_instruction(); +} + +fn next_instruction(mut self) { + let next_instruction = self.read_instr(); // this call can be inlined + match next_instruction { + Instruction::Foo => become self.execute_instruction_foo(), + Instruction::Bar => become self.execute_instruction_bar(), + } +} +``` ### Omission of the `become` keyword causes the call to be `return` instead. ([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-278988088)) +This is a potential source of confusion, indeed in a functional language where every call is expected to be TCO this would be quite unexpected. (Maybe a lint should be applied that enforces explicit usage of either `return` or `become` in functions where at least one `become` is used.) + ```rust fn foo(x: i32) -> i32 { if x % 2 { @@ -72,18 +177,17 @@ fn foo(x: i32) -> i32 { } ``` -This is a potential source of confusion, indeed in a function language where every call is expected to be TCO this would be quite unexpected. (Maybe in functions that use `become` a lint should be applied that enforces usage of either `return` or `become`.) - - -### TODO none returns - - ### Alternating `become` and `return` calls ([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-279062656)) +Here one function uses `become` the other `return`, this is another potential source of confusion. This mutual recursion +would eventually overflow the stack. As mutual recursion can also happen across more functions, `become` needs to be used +consistently in all functions if TCO should be guaranteed. (Maybe it is also possible to create a lint for these +use cases as well.) + ```rust fn foo(n: i32) { - // ups! we forgot become! + // oops, we forgot become ..
return bar(n); // or alternatively: `bar(n)` } @@ -92,40 +196,40 @@ fn bar(n: i32) { } ``` -Here one function uses `become` the other `return`, this is another potential source of confusion. This mutual recursion would eventual overflow the stack. As mutual recursion can also happen across more functions, `become` needs to be used consistently in all functions if TCO should be guaranteed. (Maybe it is also possible to create a lint for these use-cases as well.) - - - ## Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible. -This feature is only useful for some specific algorithms, where it can be essential, though it might also create a push towards a more functional programming style in Rust. In general this feature is probably unneeded for most Rust programmers, Rust has been getting on fine without this feature for most applications. As a result it impacts only those few Rust programmers that require TCO provided by this feature. +This feature is only useful for some specific algorithms, where it can be essential, though it might also create a push +towards a more functional programming style in Rust. In general this feature is probably unneeded for most Rust +programmers, Rust has been getting on fine without this feature for most applications. As a result it impacts only those +few Rust programmers that require TCO provided by this feature. ## If applicable, provide sample error messages, deprecation warnings, or migration guidance. (TODO Error messages once an initial implementation exists) -As this is a independent new feature there should be no need for deprecation warnings. +There should be no need for deprecation warnings. -Regarding migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where requisites are already fulfilled. 
However, this lint might be confusing and noisy without too much of a benefit, especially if TCO is already done without `become`. +Regarding migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation +from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint +might be confusing and noisy. ## If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers. -For new Rust programmers this feature should probably be introduced late into the learning process, it is not a required feature and only useful for niche problems. So it should be taught similarly as to programmers that already know Rust. It is likely enough to provide a description of the feature, explain TCO, compare the differences to `return`, and give examples of possible use-cases and mistakes. +For new Rust programmers this feature should probably be introduced late into the learning process, it requires +understanding some advanced concepts and the current use cases are likely to be niche. So it should be taught similarly +as to programmers that already know Rust. It is likely enough to describe the feature, explain TCO, compare the +differences to `return`, and give examples of possible use cases and mistakes. ## Discuss how this impacts the ability to read, understand, and maintain Rust code. Code is read and modified far more often than written; will the proposed feature make code easier to maintain? -As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. For code that does use this feature, it is required that a programmer understands the differences between `become` and `return`, it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in debugging code that uses `become`.
As the stack is not preserved, debugging context is lost which likely makes debugging more difficult. That is, elided parent functions as well as their variable values are not available during debugging. (Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic guarantee of creating further stack frames. This is likely an issue that needs some investigation after creating an initial implementation.) +As this feature introduces a new keyword and is independent of existing code it has no impact on existing code. For code +that does use this feature, it is required that a programmer understands the differences between `become` and `return`, +it is difficult to judge how big this impact is without an initial implementation. One difference, however, is in +debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging +more difficult. That is, elided parent functions as well as their variable values are not available during debugging. +(Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic +guarantee of not creating stack frames. This is likely an issue that needs some investigation after creating an initial +implementation.) # Reference-level explanation @@ -137,9 +241,11 @@ As this feature introduces a new keyword and is independent of existing code it - Corner cases are dissected by example. The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. --> -This explanation is mostly based on a [previous RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md#detailed-design) though is more restricted as the current RFC does not target general tail calls anymore. 
+This explanation is mostly based on a [previous RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md#detailed-design) +though is more restricted as the current RFC does not target general tail calls anymore. -The goal of this RFC is to create a first implementation that is already useful, while providing a basis to explore possible ways to relax the requirements when TCO can be guaranteed. +The goal of this RFC is to describe a first implementation that is already useful while providing a basis to explore +possible ways to relax the requirements when TCO can be guaranteed. ## Syntax [syntax]: #syntax @@ -147,7 +253,7 @@ The goal of this RFC is to create a first implementation that is already useful, A guaranteed TCO is indicated by using the `become` keyword in place of `return`. The `become` keyword is already reserved, so there is no backwards-compatibility break. The `become` keyword must be followed by a plain function call or method calls, that is supported are calls like: `become foo()`, `become foo(a)`, `become foo(a, b)`, and so on, or -`become foo.bar()` with plain arguments. Neither the function call or any arguments can be part of a larger expression +`become foo.bar()` with plain arguments. Neither the function call nor any arguments can be part of a larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())`. Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match that of the calling function. @@ -200,7 +306,8 @@ fn x() { A now six years old implementation for the earlier mentioned [RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md) can be found at -[DemiMarie/rust/tree/explicit-tailcalls](https://github.com/DemiMarie/rust/tree/explicit-tailcalls). A current implementation is planned as part of this RFC. +[DemiMarie/rust/tree/explicit-tailcalls](https://github.com/DemiMarie/rust/tree/explicit-tailcalls). 
+A new implementation is planned as part of this RFC. The parser parses `become` exactly how it parses the `return` keyword. The difference in semantics is handled later. @@ -217,7 +324,9 @@ New nodes are added in HIR and THIR to correspond to `become`. In MIR, the funct 3. The basic block being branched into has length zero. 4. The basic block being branched into terminates with a return. -If these conditions are fulfilled the function call following `become` is flagged to indicate the TCO requirement. This flag is then propagated to the corresponding backend. In the backend, there is an additional check if TCO can be performed. +If these conditions are fulfilled the function call following `become` is flagged to indicate the TCO requirement. This +flag is then propagated to the corresponding backend. In the backend, there is an additional check if TCO can be +performed. Should any check during compilation not pass a compiler error should be issued. @@ -225,43 +334,55 @@ Should any check during compilation not pass a compiler error should be issued. # Drawbacks [drawbacks]: #drawbacks -As this feature should be mostly independent from other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. The primary effort, however, lies in supporting this feature in the backends: +As this feature should be mostly independent of other features the main drawback lies in the implementation and +maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other +tooling. The primary effort, however, lies in supporting this feature in the backends: - LLVM supports a `musttail` marker to indicate that TCO should be performed [docs](https://llvm.org/docs/LangRef.html#id327). 
Clang which already depends on this feature, seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). -- GCC does not support a equivalent `musttail` marker. +- GCC does not support an equivalent `musttail` marker. - WebAssembly accepted tail-calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version is likely to be useful for a more comprehensive version. -There is also a unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) +There is also an unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives ## Why is this design the best in the space of possible designs? 
-This design is the best tradeoff between implementation effort and provided functionality, while also offering a good starting point towards exploration of a more general implementation. To expand on this, compared to other options creating a function local scope with the use of `become` greatly reduces implementation effort. Additionally, limiting tail-callable functions to those with an exactly matching function signatures enforces a common stack layout across all functions. This should in theory, depending on the backend, allow tail-calls to be performed without any stack shuffling, indeed it might even be possible to do so for indirect calls or external functions. +This design is the best tradeoff between implementation effort and functionality, while also offering a good starting +point toward further exploration of a more general implementation. To expand on this, compared to other options +creating a function local scope with the use of `become` greatly reduces implementation effort. Additionally, limiting +tail-callable functions to those with exactly matching function signatures enforces a common stack layout across all +functions. This should in theory, depending on the backend, allow tail calls to be performed without any stack +shuffling, indeed it might even be possible to do so for indirect calls or external functions. ## What other designs have been considered and what is the rationale for not choosing them? There are some designs that either can not achieve the same performance or functionality as the chosen approach. Though most other designs evolve around how to mark what should be a tail-call or marking what functions can be tail called. There is also the possibility of providing support for a custom backend (e.g. LLVM) or MIR pass. -There might also be some variation on the current design, which can be explore after the chosen design has been implemented see [unresolved questions](#unresolved) for some possibilities. 
- ### Trampoline based Approach -There could be a trampoline based approach ([comment](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-326952763)) that can fulfill the semantic guarantee of using constant stack space, though they can not be used to achieve the performance that the chosen design is capable of. Additionally, functions need to be known during compile time for these approaches to work. +There could be a trampoline-based approach +([comment](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-326952763)) that can fulfill the semantic guarantee +of using constant stack space, though they can not be used to achieve the performance that the chosen design is capable +of. Additionally, functions need to be known during compile time for these approaches to work. ### Principled Local Goto One alternative would be to support some kind of local goto natively, indeed there exists a -[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seems to be as flexible as the chosen design (regarding indirect calls / external functions). +[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design (especially regarding indirect calls / external functions). 
### Attribute on Function Declaration One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). -The goal behind this design is to TCO functions other than exactly matching function signatures, in theory this just requires that tail-called functions are callee cleanup, which is a mismatch to the default calling convention used by Rust. To limit the impact of this change all functions that should be TCO-able should be marked with a attribute. +The goal behind this design is to allow TCO of functions that do not have exactly matching function signatures, in +theory, this just requires that tail-called functions are callee cleanup, which is a mismatch to the default calling +convention used by Rust. To limit the impact of this change all functions that should be TCO-able should be marked with +an attribute. -While quite noisy it is also less flexible than the chosen approach. Indeed TCO is a property of the call and not a function, sometimes a call should be guaranteed to be TCO and sometimes not, marking a function would be less flexible. +While quite noisy it is also less flexible than the chosen approach. Indeed TCO is a property of the call and not a +function, sometimes a call should be guaranteed to be TCO and sometimes not, marking a function would be less flexible. ### Attribute on `return` -One alternative could be to use a attribute instead of the `become` keyword for function calls. To my knowledge this would be the first time a attribute would be allowed for a call. Example: +One alternative could be to use an attribute instead of the `become` keyword for function calls. To my knowledge, this would be the first time an attribute would be allowed for a call. 
Example: ```rust fn a() { @@ -272,25 +393,29 @@ fn a() { } ``` -This alternative mostly comes to taste (or bikeshedding) and `become` was chosen as it is shorter to write. +This alternative mostly comes down to taste (or bikeshedding) and `become` was chosen as it is already reserved and +shorter to write. ### Custom compiler or MIR passes One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally. While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). -This would be a error prone and unergonomic approach to solving this problem. +This would be an error-prone and unergonomic approach to solving this problem. ## What is the impact of not doing this? -One goal of Rust is to ([source](https://blog.rust-lang.org/inside-rust/2022/04/04/lang-roadmap-2024.html)): > Rust's goal is to empower everyone to build reliable and efficient software. +([source](https://blog.rust-lang.org/inside-rust/2022/04/04/lang-roadmap-2024.html)) -This feature provides a crucial optimization for some low level code. It seems that without this feature there is a [big incentive](https://github.com/rust-lang/rust/issues/102952) to use other system level languages that can perform TCO. +This feature provides a crucial optimization for some low-level code. It seems that without this feature there is a big +incentive for developers of those specific applications to use other system-level languages that can perform TCO. -Additionally, this feature enables recursive algorithms that require TCO, which would provide better support for functional programming in Rust. 
+Additionally, this feature enables recursive algorithms that require TCO, which would provide better support for +functional programming in Rust. ## If this is a language proposal, could this be done in a library or macro instead? Does the proposed change make Rust code easier or harder to read, understand, and maintain? -While there exist libraries for a trampoline based method to avoid growing the stack, this is not enough to achieve the possible performance of real TCO, so this feature requires support by the compiler itself. +While there exist libraries for a trampoline-based method to avoid growing the stack, this is not enough to achieve the +possible performance of real TCO, so this feature requires support from the compiler itself. # Prior art @@ -310,15 +435,21 @@ If there is no prior art, that is fine - your ideas are interesting to us whethe Note that while precedent set by other languages is some motivation, it does not on its own motivate an RFC. Please also take into consideration that rust sometimes intentionally diverges from common language features. --> -Functional languages (such as OCaml, SML, Haskell, Scheme, and F#) usually depend on proper tail calls as a language feature, which requires guaranteed TCO. For system level languages the guaranteed TCO is usually wanted but implementation effort is a common reason this is not yet done. Even languages with managed code such as .Net or ECMAScript (though implementation is lagging behind) also support guaranteed TCO, again performance and resource usage were the main motivators for their implementation. +Functional languages (such as OCaml, SML, Haskell, Scheme, and F#) usually depend on proper tail calls as a language +feature, which requires guaranteed TCO. For system-level languages, guaranteed TCO is usually wanted but implementation +effort is a common reason this is not yet done. 
Even languages with managed code such as .Net or ECMAScript (as per the +standard) also support guaranteed TCO, again performance and resource usage were the main motivators for their +implementation. -See below for a more detailed description for compilers and languages. +See below for a more detailed description on select compilers and languages. ## Clang -Clang, as of April 2021, does offer support for a musttail attribute on `return` statements in both C and C++. This functionality is enabled by the support in LLVM, which should also be the first backend an initial implementation in Rust. +Clang, as of April 2021, does offer support for a `musttail` attribute on `return` statements in both C and C++. This +functionality is enabled by the support in LLVM, which should also be the first backend for an initial implementation in +Rust. -It seems this feature is received with "excitement" by those that can make use of it, a popular example of its usage is to improve [Protobuf parsing speed](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html). However, one issue is that it is not very portable and there still seem to be some problem with it's [implementation](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983). +It seems this feature is received with "excitement" by those that can make use of it, a popular example of its usage is to improve [Protobuf parsing speed](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html). However, one issue is that it is not very portable and there still seems to be some problem with the [implementation](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983). 
For a more detailed description see this excerpt from the description of the feature, taken from the [implementation](https://reviews.llvm.org/rG834467590842): @@ -352,7 +483,7 @@ For a more detailed description see this excerpt from the description of the fea > caller and callee must match, and they must not be variadic functions or have > old style K&R C function declarations. -There is also a proposal (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920.pdf) for the C Standard (https://www.open-std.org/JTC1/SC22/WG14/), outlining some limitations for Clang. +There is also a [proposal](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2920.pdf) for the [C Standard](https://www.open-std.org/JTC1/SC22/WG14/) outlining some limitations for Clang. > Clang requires the argument types, argument number, and return type to be the same between the > caller and the callee, as well as out-of-scope considerations such as C++ features and the calling > convention. Implementor experience with Clang shows that the ABI of the caller and callee must be @@ -406,7 +537,7 @@ https://github.com/rust-lang/rfcs/pull/1888#issuecomment-368204577 (Feb, 2018) > Just to be clear, the Edge team is against implicit tail-calls for all functions, but they're in favor of tail-calls-with-an-explicit-keyword (similar to this RFC). -A unofficial summary of the ECMA Script/ Javascript proposal for tail call/return +An unofficial summary of the ECMA Script/ Javascript proposal for tail call/return https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-1198672079 (Jul, 2022) # Unresolved questions @@ -419,15 +550,17 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What parts of the design do you expect to resolve through the RFC process before this gets merged? 
- The main uncertainties are regarding the exact restrictions on when backends can guarantee TCO, this RFC is intentionally strict to try and require as little as possible from the backends. - One point that needs to be decided is if TCO should be a feature that needs to be required from all backends or if it can be optional. - - Another point that needs to be decided is, if TCO is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intend of this RFC seems to be lost. + - Another point that needs to be decided is if TCO is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCO with the proposed restrictions on function signatures? - Can the restrictions on function signatures be relaxed? - Can generic functions be supported? - - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assesment) - - Can closures be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assesment) + - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) + - Can closures be supported? 
(see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can dynamic function calls be supported? - Can functions outside the current crate be supported, functions from dynamically loaded libraries? + - Can functions that abort be supported? + - Can functions that return a result be supported? As in: `become foo()?;` - Is there some way to reduce the impact on debugging? @@ -454,11 +587,11 @@ The section merely provides additional information. --> It seems possible to keep the restriction on exactly matching function signatures by offering some kind of placeholder arguments to pad out the differences. For example: ```rust foo(a: u32, b: u32) { - // ... + // uses `a` and `b` } bar(a: u32, _b: u32) { - // ... + // only uses `a` } ``` Maybe it is useful to provide a macro or attribute that inserts missing arguments. @@ -470,4 +603,6 @@ bar(a: u32) { ``` ## Function Programming -It might be possible to allow even more functional programming paradigms based on TCO, the examples in this RFC still seem quite far from typical functional programming. \ No newline at end of file +This might be a silly idea but if guaranteed TCO is supported there could be further language extensions to make Rust +more attractive for functional programming paradigms. Though it is unclear to me how far this should be taken or what +changes exactly would be a benefit. From 03bdd9f8cc39850b8e0bbb901a4e750680b97f6c Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 6 Apr 2023 16:48:54 +0200 Subject: [PATCH 12/92] update based on reviews --- text/0000-guaranteed-tco.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index b22537e24c4..67ae18c2348 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -7,7 +7,7 @@ [summary]: #summary This feature provides a guarantee that function calls are tail-call optimized via the `become` keyword. 
If this -guarantee can not be provided by the compiler an error is generated instead. +guarantee can not be provided by the compiler a compile time error is generated instead. # Motivation [motivation]: #motivation @@ -343,7 +343,7 @@ tooling. The primary effort, however, lies in supporting this feature in the bac Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version is likely to be useful for a more comprehensive version. -There is also an unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds.) +There is also an unwanted interaction between TCO and debugging. As TCO by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCO provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCO for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCO for debugging builds. As suggested [here](https://github.com/rust-lang/rfcs/pull/3407/files#r1159817279), another option would be special support for `become` by a debugger. With this support the debugger would keep track of the N most recent calls providing at least some context to the bug.) # Rationale and alternatives @@ -560,7 +560,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Can dynamic function calls be supported? 
- Can functions outside the current crate be supported, functions from dynamically loaded libraries? - Can functions that abort be supported? - - Can functions that return a result be supported? As in: `become foo()?;` - Is there some way to reduce the impact on debugging? @@ -602,7 +601,7 @@ bar(a: u32) { } ``` -## Function Programming +## Functional Programming This might be a silly idea but if guaranteed TCO is supported there could be further language extensions to make Rust more attractive for functional programming paradigms. Though it is unclear to me how far this should be taken or what changes exactly would be a benefit. From 30967a741b873d9014dd72825d0d6a82d808aed6 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 7 Apr 2023 07:50:12 +0200 Subject: [PATCH 13/92] remove stray TODO --- text/0000-guaranteed-tco.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index 67ae18c2348..05dafbaa69d 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -32,7 +32,7 @@ significantly faster interpreters / emulators. One common example of the usefuln performance of Protobuf parsing as described in this [blog post](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), this approach would then also be possible in Rust. -# TODO Guide-level explanation +# Guide-level explanation [guide-level-explanation]: #guide-level-explanation -This explanation is mostly based on a [previous RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md#detailed-design) +This explanation is mostly based on the [previous RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md#detailed-design) though is more restricted as the current RFC does not target general tail calls anymore. 
The goal of this RFC is to describe a first implementation that is already useful while providing a basis to explore From fd35d0f4428f8e7cde72fa78052572ae554de629 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sun, 9 Apr 2023 09:07:18 +0200 Subject: [PATCH 28/92] change example in semantics section to ref --- text/0000-guaranteed-tco.md | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index 19163288a16..1781f4ba116 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -71,6 +71,7 @@ Now on to some examples. Starting with how `return` and `become` differ, two exa pitfalls. ### The difference between `return` and `become` +[difference]: #difference The essential difference to `return` is that `become` drops function local variables **before** the function call instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): ```rust @@ -283,25 +284,7 @@ This implies that it is invalid for any references into the caller's stack frame As `become` is always in a tail position (due to being used in place of `return`), this requirement for TCO is already fulfilled. -Example: -```rust -fn x() { - let a = Box::new(()); - let b = Box::new(()); - become y(a) -} -``` - -Will be desugared in the following way: -```rust -fn x() { - let a = Box::new(()); - let b = Box::new(()); - let _tmp = a; - drop(b); - become y(_tmp) -} -``` +See this earlier [example](#the-difference-between-return-and-become) on how become causes drops to be elaborated. 
## Implementation [implementation]: #implementation From bef3933b609ea3cef87b0720aec334121ec28fde Mon Sep 17 00:00:00 2001 From: Philipp Goerz Date: Tue, 11 Apr 2023 14:39:30 +0200 Subject: [PATCH 29/92] rewrite to use TCE --- text/0000-guaranteed-tco.md | 206 +++++++++++++++++++----------------- 1 file changed, 109 insertions(+), 97 deletions(-) diff --git a/text/0000-guaranteed-tco.md b/text/0000-guaranteed-tco.md index 1781f4ba116..eedce357f88 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-guaranteed-tco.md @@ -6,13 +6,14 @@ # Summary [summary]: #summary -This feature provides a guarantee that function calls are tail-call optimized via the `become` keyword. If this -guarantee can not be provided by the compiler a compile time error is generated instead. +While tail call optimization (TCO) is already supported by Rust, there is no way to specify when +it should be guaranteed that a stack frame should be reused. +This RFC describes a language feature providing tail call elimination via the `become` keyword providing this guarantee. +If this guarantee can not be provided by the compiler a compile time error is generated instead. # Motivation [motivation]: #motivation - -While opportunistic tail-call optimization (TCO) is already supported there currently is no way to guarantee TCO. This +While tail-call optimization (TCO) is already supported there currently is no way to guarantee stack frame reuse. This guarantee is interesting for two general goals. One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this optimization. 
The other goal is to, in simple words, replace `call` instructions by `jmp` instructions, this optimization has performance implications @@ -22,14 +23,16 @@ Note that workarounds for the first goal exist by using trampolining which limit functionality can be provided as a library, inclusion in the language can provide greater adoption of a more functional programming style. -For the second goal no guaranteed method exists. The decision if TCO is performed depends on the specific code and the -compiler version. This can result in TCO surprisingly no longer being performed due to small changes to the code or a -change of the compiler version, see this [issue](https://github.com/rust-lang/rust/issues/102952) for an example. +For the second goal no guaranteed method exists. While TCO can have the intended effect, if it is performed depends on +the specific code and the compiler version. This can result in unexpected slow-downs after small changes to the code or +a change of the compiler version, see this [issue](https://github.com/rust-lang/rust/issues/102952) for an example. Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, -allowing code to be written in a continuation-passing style, recursive algorithms to be guaranteed TCO, or guaranteeing -significantly faster interpreters / emulators. One common example of the usefulness of tail calls in C is improving -performance of Protobuf parsing as described in this [blog post](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), this approach would then also be possible in Rust. +allowing code to be written in a continuation-passing style, using recursive algorithms without the danger of +overflowing the stack, or guaranteeing significantly faster interpreters / emulators. 
One common example of the +usefulness of tail calls in C is improving performance of Protobuf parsing as described in this +[blog post](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), +this approach would then also be possible in Rust. # Guide-level explanation @@ -47,27 +50,65 @@ Explain the proposal as if it was already included in the language and you were For implementation-oriented RFCs (e.g. for compiler internals), this section should focus on how compiler contributors should think about the change, and give examples of its concrete impact. For policy RFCs, this section should provide an example-driven introduction to the policy, and explain its impact in concrete terms. --> Pretending this RFC has already been accepted into Rust, it could be explained to another Rust programmer as follows. -## Introducing new named concepts. -Rust now supports a way to guarantee tail call optimization (TCO). This is interesting for two groups of programmers: those that want to use recursive algorithms and those that want to create highly optimized code. Note that using this feature can have some difficulties, as there are several requirements on functions where TCO can be performed. +## Tail Call Elimination +[tail-call-elimination]: #tail-call-elimination + +Rust supports a way to specify tail call elimination (TCE) for function calls. +If TCE is requested for a call the called function will reuse the stack frame of the calling function, +assuming all requirements are fulfilled. +The optimization of reusing the stack frame is also known as tail call optimization (TCO) which Rust already supports. +The difference between TCE and TCO is that TCE guarantees that the stack frame is reused, while +with TCO the stack frame is only reused if the compiler expects doing so will be faster (or smaller +if optimizing for space). 
+ +TCE is interesting for two groups of programmers: Those that want to use recursive algorithms, +which can overflow the stack if the stack frame is not reused; and those that want to create highly optimized code, +as creating new stack frames can be expensive. + +To request TCE the `become` keyword can be used instead of `return`, and only there. +However, it is not quite so simple. +Several requirements need to be fulfilled for TCE (and TCO) to work. + +The main restriction is that the argument to `become` can be simplified to a tail call, +the call is the last action that happens in the function. +Supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, `become foo(1 + 1)`, +`become foo(bar())`, `become foo.method()`, or `become function_table[idx](arg)`. +Calls that are not in the tail position can **not** be used for example `become foo() + 1` is not allowed. +The function would need to be evaluated and then the addition would need to take place. + +A further restriction is on the function signature of the caller and callee. +As the stack frame should be reused it needs to be similar for both functions. +The stack frame layout is based on the calling convention, arguments, as well as return types (the function signature in +short). +Currently, all of these need to match exactly. + +There is a further restriction on the arguments. +As the stack frame of the calling function is replaced it is not possible to pass references to local variables. +This is the same reason why returning references to local variables is not possible. + +If any of these restrictions are not met when using `become` a compilation error is thrown. + +Note that using this feature can make debugging difficult. +As `become` causes the stack frame to be replaced, debugging context is lost. +Expect to no longer see any parent functions that used `become` in the stack trace, +or have access to their variable values while debugging. 
+ + +As this feature is strictly opt-in and the `become` keyword is already reserved, this has no impact on existing code. + + +(TODO Error messages once an initial implementation exists) -TCO provides a way to call functions without creating a new stack frame. Instead, the stack frame of the calling -function is reused. This is only possible if the functions have a similar enough stack layout. This -layout is based on the calling convention, arguments, as well as return types (the function signature in short). -Currently, all of these need to match exactly; otherwise, an error will be thrown during compilation. +(TODO migration guidance) -Reusing the stack frame has two effects: One is that the stack will no longer grow, allowing unlimited nested function -calls, if all are TCO'ed. The other is that creating a new stack frame is actually quite expensive, especially for code -with a high density of function calls, so reusing the stack frame can lead to massive performance improvements. -To guarantee TCO the `become` keyword can be used instead of the `return` keyword (and only there). However, only a -"plain" function or method call can take the place of the argument. That is supported are calls such as `become foo()`, -`become foo(a)`, `become foo(a, b)`, however, **not** supported are calls that contain or are part of a larger -expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())` (though this may be subject to change). -Additionally, as already said the function signature must exactly match that of the calling function (a restriction that -might also be loosened a bit in the future). +## Teaching +For new Rust programmers this feature should probably be introduced late into the learning process, it requires +understanding some advanced concepts and the current use cases are likely to be niche. So it should be taught similarly +as to programmers that already know Rust. ## Examples -Now on to some examples. 
Starting with how `return` and `become` differ, two example use cases, and some potential +On to some examples. Starting with how `return` and `become` differ, two example use cases, and some potential pitfalls. ### The difference between `return` and `become` @@ -104,11 +145,14 @@ fn x() { ``` -This early dropping allows the compiler to avoid many complexities associated with deciding if a call can be TCO. Instead, the heavy lifting is done by the borrow checker, which will produce a lifetime error if references to local -variables are passed to the called function. This is distinct from `return`, which _does_ allow references to local variables to be passed. Indeed, this difference in the handling of local variables is also the main difference between `return` and `become`. +This early dropping allows the compiler to avoid many complexities associated with deciding if the stack frame can be +reused. Instead, the heavy lifting is done by the borrow checker, which will produce a lifetime error if references to +local variables are passed to the called function. This is distinct from `return`, which _does_ allow references to +local variables to be passed. Indeed, this difference in the handling of local variables is also the main difference +between `return` and `become`. ### Use Case 1: Recursive Algorithm -A simple example is the following algorithm for summing the elemnts of a `Vec`. While this would usually be done with iteration in Rust, this example illustrates a simple use of `become`. Without TCO, this example could overflow the stack. +A simple example is the following algorithm for summing the elements of a `Vec`. While this would usually be done with iteration in Rust, this example illustrates a simple use of `become`. Without TCE, this example could overflow the stack. 
```rust fn sum_list(data: Vec, mut offset: usize, mut accum: u64) -> u64 { @@ -165,7 +209,7 @@ fn next_instruction(mut self) { ### Omission of the `become` keyword causes the call to be `return` instead. ([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-278988088)) -This is a potential source of confusion, indeed in a function language where every call is expected to be TCO this would be quite unexpected. (Maybe in functions that use `become` a lint should be applied that enforces usage of either `return` or `become` in functions where at least one `become` is used.) +This is a potential source of confusion, indeed in a functional language where calls are expected to be TCE this would be quite unexpected. (Maybe in functions that use `become` a lint should be applied that enforces usage of either `return` or `become` in functions where at least one `become` is used.) ```rust fn foo(x: i32) -> i32 { @@ -185,8 +229,8 @@ fn foo(x: i32) -> i32 { ([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-279062656)) Here one function uses `become` the other `return`, this is another potential source of confusion. This mutual recursion -would eventual overflow the stack. As mutual recursion can also happen across more functions, `become` needs to be used -consistently in all functions if TCO should be guaranteed. (Maybe it is also possible to create a lint for these +would eventually overflow the stack. As mutual recursion can also happen across more functions, `become` needs to be +used consistently in all functions if TCO should be guaranteed. (Maybe it is also possible to create a lint for these use cases as well.) ```rust @@ -201,41 +245,6 @@ fn bar(n: i32) { ``` -## Explaining how Rust programmers should *think* about the feature, and how it should impact the way they use Rust. It should explain the impact as concretely as possible. 
-This feature is only useful for some specific algorithms, where it can be essential, though it might also create a push -towards a more functional programming style in Rust. In general this feature is probably unneeded for most Rust -programmers, Rust has been getting on fine without this feature for most applications. As a result it impacts only those -few Rust programmers that require TCO provided by this feature. - - -## If applicable, provide sample error messages, deprecation warnings, or migration guidance. -(TODO Error messages once an initial implementation exists) - -There should be no need for deprecation warnings. - -Regarding migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation -from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint -might be confusing and noisy. - - -## If applicable, describe the differences between teaching this to existing Rust programmers and new Rust programmers. -For new Rust programmers this feature should probably be introduced late into the learning process, it requires -understanding some advanced concepts and the current use cases are likely to be niche. So it should be taught similarly -as to programmers that already know Rust. It is likely enough to description the feature, explain TCO, compare the -differences to `return`, and give examples of possible use cases and mistakes. - - -## Discuss how this impacts the ability to read, understand, and maintain Rust code. Code is read and modified far more often than written; will the proposed feature make code easier to maintain? -As the `become` keyword is already reserved, this has no impact on existing code. For code -that does use this feature, it is required that a programmer understands the differences between `become` and `return`, -it is difficult to judge how big this impact is without an initial implementation. 
One difference, however, is in -debugging code that uses `become`. As the stack is not preserved, debugging context is lost which likely makes debugging -more difficult. That is, elided parent functions as well as their variable values are not available during debugging. -(Though this issue might be lessened by providing a flag to opt out of TCO, which would, however, break the semantic -guarantee of not creating stack frames. This is likely an issue that needs some investigation after creating an initial -implementation.) - - # Reference-level explanation [reference-level-explanation]: #reference-level-explanation Functional languages (such as OCaml, SML, Haskell, Scheme, and F#) usually depend on proper tail calls as a language -feature, which requires guaranteed TCO. For system-level languages, guaranteed TCO is usually wanted but implementation +feature (TCE for general calls). For system-level languages TCE is usually wanted but implementation effort is a common reason this is not yet done. Even languages with managed code such as .Net or ECMAScript (as per the -standard) also support guaranteed TCO, again performance and resource usage were the main motivators for their +standard) also support TCE, again performance and resource usage were the main motivators for their implementation. See below for a more detailed description on select compilers and languages. @@ -483,7 +492,7 @@ GCC does not support a feature equivalent to Clang's `musttail`, there also does ## Zig -Zig provides separate syntax to allow more flexibility than normal function calls. There are options for async calls, inlining, compile time evaluation of the called function, and to enforce TCO on the call. +Zig provides separate syntax to allow more flexibility than normal function calls. There are options for async calls, inlining, compile time evaluation of the called function, or specifying TCE on the call. 
([source](https://ziglang.org/documentation/master/#call)) ```zig const expect = @import("std").testing.expect; @@ -500,11 +509,11 @@ fn add(a: i32, b: i32) i32 { (TODO: What is the community sentiment regarding this feature? Except for some bug reports I did not find anything.) ## Carbon -As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCO is of interest even if the implementation is difficult +As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCE is of interest even if the implementation is difficult ## .Net -The .Net JIT does support TCO as of 2020, a main motivator for this feature was improving performance. +The .Net JIT does support TCE as of 2020, a main motivator for this feature was improving performance. [Pull Request](https://github.com/dotnet/runtime/pull/341) ([Issue](https://github.com/dotnet/runtime/issues/2191)) > This implements tailcall-via-help support for all platforms supported by > the runtime. In this new mechanism the JIT asks the runtime for help @@ -535,11 +544,14 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> - What parts of the design do you expect to resolve through the RFC process before this gets merged? - - The main uncertainties are regarding the exact restrictions on when backends can guarantee TCO, this RFC is intentionally strict to try and require as little as possible from the backends. - - One point that needs to be decided is if TCO should be a feature that needs to be required from all backends or if it can be optional. - - Another point that needs to be decided is if TCO is supported by a backend what exactly should be guaranteed? 
While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. + - The main uncertainties are regarding the exact restrictions on when backends can offer TCE, this RFC is intentionally strict to try and require as little as possible from the backends. + - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. + - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. + - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation +from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint +might be confusing and noisy. Decide on if this lint or others should be added. - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - - Are all calling-convention used by Rust available for TCO with the proposed restrictions on function signatures? + - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - Can the restrictions on function signatures be relaxed? - Can generic functions be supported? - Can async functions be supported? 
(see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) @@ -589,6 +601,6 @@ bar(a: u32) { ``` ## Functional Programming -This might be a silly idea but if guaranteed TCO is supported there could be further language extensions to make Rust +This might be a silly idea but if TCE is supported there could be further language extensions to make Rust more attractive for functional programming paradigms. Though it is unclear to me how far this should be taken or what changes exactly would be a benefit. From e03c86908ae78b8c2969d931fab1c7e8051c656b Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 17 Apr 2023 11:11:58 +0200 Subject: [PATCH 30/92] change feature name --- text/{0000-guaranteed-tco.md => 0000-explicit-tail-calls.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-guaranteed-tco.md => 0000-explicit-tail-calls.md} (99%) diff --git a/text/0000-guaranteed-tco.md b/text/0000-explicit-tail-calls.md similarity index 99% rename from text/0000-guaranteed-tco.md rename to text/0000-explicit-tail-calls.md index eedce357f88..d1c8d0f9fec 100644 --- a/text/0000-guaranteed-tco.md +++ b/text/0000-explicit-tail-calls.md @@ -1,4 +1,4 @@ -- Feature Name: guaranteed_tco +- Feature Name: explicit_tail_calls - Start Date: 2023-04-01 - RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) From 86ad37bd65ad6f31f6c4c6e44d310d9030360ebb Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 17 Apr 2023 11:58:57 +0200 Subject: [PATCH 31/92] use the term TCE correctly --- text/0000-explicit-tail-calls.md | 25 ++++++++++--------------- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index d1c8d0f9fec..5a44a0a5385 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -5,19 +5,18 @@ # Summary [summary]: 
#summary - -While tail call optimization (TCO) is already supported by Rust, there is no way to specify when -it should be guaranteed that a stack frame should be reused. +While tail call elimination (TCE) is already possible via tail call optimization (TCO) in Rust, there is no way to guarantee that a stack frame should be reused. This RFC describes a language feature providing tail call elimination via the `become` keyword providing this guarantee. If this guarantee can not be provided by the compiler a compile time error is generated instead. # Motivation [motivation]: #motivation -While tail-call optimization (TCO) is already supported there currently is no way to guarantee stack frame reuse. This -guarantee is interesting for two general goals. One goal is to do function calls without growing the stack, this mainly -has semantic implications as recursive algorithms can overflow the stack without this optimization. The other goal is -to, in simple words, replace `call` instructions by `jmp` instructions, this optimization has performance implications -and can provide massive speedups for algorithms that have a high density of function calls. +Tail call elimination (TCE) allows stack frames to be reused. +While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations TCE will only be applied if the compiler expects an improvement by doing so. +There is currently no way to specify that TCE should be guaranteed. +This guarantee is interesting for two general goals. +One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this optimization. +The other goal is to avoid paying the cost to create a new stack frame, replacing `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls.
Note that workarounds for the first goal exist by using trampolining which limits the stack depth. However, while this functionality can be provided as a library, inclusion in the language can provide greater adoption of a more functional @@ -53,13 +52,9 @@ Pretending this RFC has already been accepted into Rust, it could be explained t ## Tail Call Elimination [tail-call-elimination]: #tail-call-elimination -Rust supports a way to specify tail call elimination (TCE) for function calls. -If TCE is requested for a call the called function will reuse the stack frame of the calling function, -assuming all requirements are fulfilled. -The optimization of reusing the stack frame is also known as tail call optimization (TCO) which Rust already supports. -The difference between TCE and TCO is that TCE guarantees that the stack frame is reused, while -with TCO the stack frame is only reused if the compiler expects doing so will be faster (or smaller -if optimizing for space). +Rust supports a way to guarantee tail call elimination (TCE) for function calls using the `become` keyword. +If TCE is requested for a call the called function will reuse the stack frame of the calling function, assuming all requirements are fulfilled. +Note that TCE can opportunistically also be performed by Rust using tail call optimization (TCO), this will cause TCE to be used if it is deemed to be "better" (as in faster, or smaller if optimizing for space). 
TCE is interesting for two groups of programmers: Those that want to use recursive algorithms, which can overflow the stack if the stack frame is not reused; and those that want to create highly optimized code, From c55c189c44605b9568daf7cd3da661b8abd111f2 Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 17 Apr 2023 12:53:39 +0200 Subject: [PATCH 32/92] add more unresolved questions --- text/0000-explicit-tail-calls.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 5a44a0a5385..981fc8631fd 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -542,12 +542,12 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - The main uncertainties are regarding the exact restrictions on when backends can offer TCE, this RFC is intentionally strict to try and require as little as possible from the backends. - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. - - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation -from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint -might be confusing and noisy. Decide on if this lint or others should be added. 
+ - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. + - Should a lint be added for functions that are marked to be tail call or use become. See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - Can the restrictions on function signatures be relaxed? + - One option for intra-crate direct calls is to automatically pad the arguments during compilation see [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). Does this have an influence on other calls? How much implementation effort is it? - Can generic functions be supported? - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can closures be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) From 77f93aceccdeee6c7a93fd5221357e499e82e876 Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 17 Apr 2023 13:13:23 +0200 Subject: [PATCH 33/92] update attribute on return section --- text/0000-explicit-tail-calls.md | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 981fc8631fd..a869b7b6edc 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -373,7 +373,7 @@ While quite noisy it is also less flexible than the chosen approach. 
Indeed TCE function, sometimes a call should be guaranteed to be TCE and sometimes not, marking a function would be less flexible. ### Attribute on `return` -One alternative could be to use an attribute instead of the `become` keyword for function calls. To my knowledge, this would be the first time an attribute would be allowed for a call. Example: +One alternative could be to use an attribute instead of the `become` keyword for function calls. Example: ```rust fn a() { @@ -384,8 +384,7 @@ fn a() { } ``` -This alternative mostly comes down to taste (or bikeshedding) and `become` was chosen as it is already reserved and -shorter to write. +This alternative mostly comes down to taste (or bikeshedding) and `become` was chosen as it is [reserved](https://rust-lang.github.io/rfcs/0601-replace-be-with-become.html) for this use, shorter to write, and as drop order changes compared to `return` a new keyword seems warranted. ### Custom compiler or MIR passes One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally. While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). 
From 417aa48338886e3d9d4878a7f433a70530ecbdff Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 18 Apr 2023 09:24:20 +0200 Subject: [PATCH 34/92] fix wrong link --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index a869b7b6edc..451c0a7e7dd 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -542,7 +542,7 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. - - Should a lint be added for functions that are marked to be tail call or use become. See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). + - Should a lint be added for functions that are marked to be tail call or use become. See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? 
- Can the restrictions on function signatures be relaxed? From 9a24c19f5a2ffb5351f94184d079b19a13e9b0f5 Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 18 Apr 2023 15:19:10 +0200 Subject: [PATCH 35/92] add another reference to question regarding lints --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 451c0a7e7dd..b84b860eab6 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -542,7 +542,7 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. - - Should a lint be added for functions that are marked to be tail call or use become. See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824). + - Should a lint be added for functions that are marked to be tail call or use become. 
See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - Can the restrictions on function signatures be relaxed? From f23c09ebbc7ab1353c0888720f77217bc2c3cb33 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 19 Apr 2023 13:52:04 +0200 Subject: [PATCH 36/92] expand on why operators are not supported --- text/0000-explicit-tail-calls.md | 30 +++++++++++++++++++++--------- 1 file changed, 21 insertions(+), 9 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index b84b860eab6..c37007f585c 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -68,7 +68,7 @@ The main restriction is that the argument to `become` can be simplified to a tai the call is the last action that happens in the function. Supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, `become foo(1 + 1)`, `become foo(bar())`, `become foo.method()`, or `become function_table[idx](arg)`. -Calls that are not in the tail position can **not** be used for example `become foo() + 1` is not allowed. +Calls that are not in the tail position can **not** be used, for example, `become foo() + 1` is not allowed. The function would need to be evaluated and then the addition would need to take place. A further restriction is on the function signature of the caller and callee. @@ -259,14 +259,26 @@ possible ways to relax the requirements for TCE. 
[syntax]: #syntax A function call can be specified to be TCE by using the `become` keyword in place of `return`. The `become` keyword is -already reserved, so there is no backwards-compatibility break. The `become` keyword must be followed by a plain -function call or method calls, that is supported are calls like: `become foo()`, `become foo(a)`, `become foo(a, b)`, -and so on, or `become foo.bar()` with plain arguments. Neither the function call nor any arguments can be part of a -larger expression such as `become foo() + 1`, `become foo(1 + 1)`, `become foo(bar())`. Additionally, there is a further -restriction on the tail-callable functions: the function signature must exactly match that of the calling function. - -Invocations of overloaded operators with at least one non-primitive argument were considered as valid targets, but were -rejected on grounds of being too error-prone. In any case, these can still be called as methods. +already reserved, so there is no backwards-compatibility break. The `become` keyword must be followed by a +function or method call, see the section on [tail-call-elimination](#tail-call-elimination) for examples. +Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match +that of the calling function. + +Invocations of overloaded operators were considered as valid targets, but were rejected on grounds of being too error-prone. +In any case, these can still be called as methods. One example of their error-prone nature ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1167112296)): +```rust +pub fn fibonacci(n: u64) -> u64 { + if n < 2 { + return n + } + become fibonacci(n - 2) + fibonacci(n - 1) +} +``` +In this case a naive author might assume that this is going to be a stack space efficient implementation since it uses tail recursion instead of normal recursion. 
However, the outcome is more or less the same since the critical recursive calls are not actually in tail call position. + +Further confusion could result from the same-signature restriction where the Rust compiler complains that fibonacci and ::add do not share a common signature. + + ## Type checking [typechecking]: #typechecking From 8ab2fbda329209a1c972d182d5a4462c9cd8a5f2 Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 20 Apr 2023 11:59:31 +0200 Subject: [PATCH 37/92] simplify reference-level explanation --- text/0000-explicit-tail-calls.md | 88 ++++++++++---------------------- 1 file changed, 26 insertions(+), 62 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index c37007f585c..cfa6bac0fe5 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -239,7 +239,6 @@ fn bar(n: i32) { } ``` - # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -This explanation is mostly based on the [previous RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md#detailed-design) -though is more restricted as the current RFC does not target general tail calls anymore. +Implementation of this feature requires checks that all prerequisites to guarantee TCE are fulfilled. +These checks are: +- The `become` keyword is only used in place of `return`. The intent is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. +- The argument to `become` is a function (or method) call that exactly matches the function signature and calling convention of the callee. The intent is to ensure a compatible stack frame layout. +- The stack frame of the calling function is reused, this also implies that the function is never returned to.
The required checks to ensure this is possible are: no references to local variables are passed to the called function, and no further cleanup is necessary. These checks can be done by using the borrowchecker as described in the [Borrowchecking](#semantics) section below. -The goal of this RFC is to describe a first implementation that is already useful while providing a basis to explore -possible ways to relax the requirements for TCE. +If any of these checks fail, a compiler error is issued. -## Syntax -[syntax]: #syntax +One additional check must be done: if the backend cannot guarantee TCE to be performed, an ICE is issued. It is also suggested to ensure that the invariants provided by the prerequisites are maintained during compilation and to raise an ICE if this is not the case. -A function call can be specified to be TCE by using the `become` keyword in place of `return`. The `become` keyword is -already reserved, so there is no backwards-compatibility break. The `become` keyword must be followed by a -function or method call, see the section on [tail-call-elimination](#tail-call-elimination) for examples. -Additionally, there is a further restriction on the tail-callable functions: the function signature must exactly match -that of the calling function. +Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backwards-compatibility break. -Invocations of overloaded operators were considered as valid targets, but were rejected on grounds of being too error-prone. -In any case, these can still be called as methods. One example of their error-prone nature ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1167112296)): -```rust -pub fn fibonacci(n: u64) -> u64 { - if n < 2 { - return n - } - become fibonacci(n - 2) + fibonacci(n - 1) -} -``` -In this case a naive author might assume that this is going to be a stack space efficient implementation since it uses tail recursion instead of normal recursion.
However, the outcome is more of less the same since the critical recursive calls are not actually in tail call position. - -Further confusion could result from the same-signature restriction where the Rust compilers complains that fibonacci and ::add do not share a common signature. +This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. As omitting stack frames is fundamental to the feature described in this RFC, it is suggested to warn a user of this interaction. It might also be possible to add special handling for some features, for example storing a constant number of stack frames separately during debugging. +Features that depend on drop order can also be impacted by this feature, for example locking mechanisms. +See below for how borrowchecking can be used to implement this feature and the reasoning why operators are not supported. -## Type checking -[typechecking]: #typechecking -A `become` statement is type-checked like a `return` statement, with the added restriction that the function signatures of the caller and callee must match exactly. Additionally, the caller and callee **must** use the same calling -convention. - -## Borrowchecking and Runtime Semantics +## Borrowchecking [semantics]: #semantics A `become` expression acts as if the following events occurred in-order: @@ -302,37 +283,20 @@ fulfilled. See this earlier [example](#the-difference-between-return-and-become) on how become causes drops to be elaborated. -## Implementation -[implementation]: #implementation - -A now six years old implementation for the earlier mentioned -[RFC](https://github.com/DemiMarie/rfcs/blob/become/0000-proper-tail-calls.md) can be found at -[DemiMarie/rust/tree/explicit-tailcalls](https://github.com/DemiMarie/rust/tree/explicit-tailcalls). -A new implementation is planned as part of this RFC. - -The parser parses `become` exactly how it parses the `return` keyword. 
The difference in semantics is handled later. - -During type checking, the following are checked: - -1. The target of the tail call is, in fact, a simple call. -2. The target of the tail call has the proper ABI. - -Should any of these checks fail a compiler error should be issued. - - -New nodes are added in HIR and THIR to correspond to `become`. In MIR, the function call is checked that: -1. The returned value is directly returned. -2. There are no cleanups. -3. The basic block being branched into has length zero. -4. The basic block being branched into terminates with a return. - -If these conditions are fulfilled the function call and the `become` are merged into a `TailCall` MIR node, -this guarantees that nothing can be inserted between the call and `become`. Additionally, this node indicates -the request for TCE for the call which is then propagated to the corresponding backend. In the backend, -there is an additional check if TCE can be performed. - -Should any of these checks fail an ICE should be issued. +## Operators are not supported +Invocations of operators were considered as valid targets, but were rejected on grounds of being too error-prone. +In any case, these can still be called as methods. One example of their error-prone nature ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1167112296)): +```rust +pub fn fibonacci(n: u64) -> u64 { + if n < 2 { + return n + } + become fibonacci(n - 2) + fibonacci(n - 1) +} +``` +In this case a naive author might assume that this is going to be a stack space efficient implementation since it uses tail recursion instead of normal recursion. However, the outcome is more of less the same since the critical recursive calls are not actually in tail call position. +Further confusion could result from the same-signature restriction where the Rust compilers complains that fibonacci and ::add do not share a common signature. 
# Drawbacks [drawbacks]: #drawbacks @@ -551,7 +515,7 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What parts of the design do you expect to resolve through the RFC process before this gets merged? - The main uncertainties are regarding the exact restrictions on when backends can offer TCE, this RFC is intentionally strict to try and require as little as possible from the backends. - - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. + - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. Currently the RFC specifies that a ICE should be issued if a backend cannot guarantee that TCE will be performed. - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. - Should a lint be added for functions that are marked to be tail call or use become. See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). 
From 0df947b1d2c37f304c41897bd340977c1dafbc2b Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 20 Apr 2023 13:46:43 +0200 Subject: [PATCH 38/92] do another writing pass --- text/0000-explicit-tail-calls.md | 84 ++++++++++++++++---------------- 1 file changed, 41 insertions(+), 43 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index cfa6bac0fe5..d19e8c8f566 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -12,19 +12,17 @@ If this guarantee can not be provided by the compiler a compile time error is ge # Motivation [motivation]: #motivation Tail call elimination (TCE) allows stack frames to be reused. -While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations TCE will only be applied if the compiler excpects a improvement by doing so. +While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations TCE will only be applied if the compiler expects a improvement by doing so. There is currently no way to specify that TCE should be guaranteed. This guarantee is interesting for two general goals. -One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this optimization. -The other goal is to avoid paying the cost to create a new stack frame, replacing `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. +One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this guarantee. 
+The other goal is to avoid paying the cost to create a new stack frame, replacing `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. This goal also depend on the guarantee as otherwise a subtle change or a new compiler version can have a unexpected impact on performance. Note that workarounds for the first goal exist by using trampolining which limits the stack depth. However, while this functionality can be provided as a library, inclusion in the language can provide greater adoption of a more functional programming style. -For the second goal no guaranteed method exists. While TCO can have the intended effect, if it is performed depends on -the specific code and the compiler version. This can result in unexpected slow-downs after small changes to the code or -a change of the compiler version, see this [issue](https://github.com/rust-lang/rust/issues/102952) for an example. +For the second goal, TCO can have the intended effect, however, there is no guarantee. This can result in unexpected slow-downs, for example, as can be seen in this [issue](https://github.com/rust-lang/rust/issues/102952). Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, allowing code to be written in a continuation-passing style, using recursive algorithms without the danger of @@ -57,34 +55,33 @@ If TCE is requested for a call the called function will reuse the stack frame of Note that TCE can opportunistically also be performed by Rust using tail call optimization (TCO), this will cause TCE to be used if it is deemed to be "better" (as in faster, or smaller if optimizing for space). 
TCE is interesting for two groups of programmers: Those that want to use recursive algorithms, -which can overflow the stack if the stack frame is not reused; and those that want to create highly optimized code, +which can overflow the stack if the stack frame is not reused; and those that want to create highly optimized code as creating new stack frames can be expensive. -To request TCE the `become` keyword can be used instead of `return`, and only there. -However, it is not quite so simple. -Several requirements need to be fulfilled for TCE (and TCO) to work. +To request TCE the `become` keyword can be used instead of `return` and only there. +However, several requirements need to be fulfilled for TCE (and TCO) to work. -The main restriction is that the argument to `become` can be simplified to a tail call, -the call is the last action that happens in the function. +The main restriction is that the argument to `become` is a tail call, +a call that is the last action performed in the function. Supported are calls such as `become foo()`, `become foo(a)`, `become foo(a, b)`, `become foo(1 + 1)`, `become foo(bar())`, `become foo.method()`, or `become function_table[idx](arg)`. Calls that are not in the tail position can **not** be used, for example, `become foo() + 1` is not allowed. -The function would need to be evaluated and then the addition would need to take place. +In the example, the function would need to be evaluated and **then** the addition would need to take place. A further restriction is on the function signature of the caller and callee. -As the stack frame should be reused it needs to be similar for both functions. The stack frame layout is based on the calling convention, arguments, as well as return types (the function signature in short). -Currently, all of these need to match exactly. +As the stack frame is to be reused it needs to be similar enough for both functions. 
+This requires that the function signature and calling convention of the calling and called function need to match exactly. -There is a further restriction on the arguments. -As the stack frame of the calling function is replaced it is not possible to pass references to local variables. -This is the same reason why returning references to local variables is not possible. +Additionally, there is a further restriction on the arguments. +The stack frame of the calling function is reused, it is essentially cleaned up and the called function takes the space. +As a result it is not possible to pass references to local variables, neither will the called function "return" to the calling function. So all variables not used as an argument are dropped before the call and no cleanup will be done after the call. If any of these restrictions are not met when using `become` a compilation error is thrown. -Note that using this feature can make debugging difficult. -As `become` causes the stack frame to be replaced, debugging context is lost. +Note that using this feature can make debugging more difficult. +As `become` causes the stack frame to be reused, debugging context is lost. Expect to no longer see any parent functions that used `become` in the stack trace, or have access to their variable values while debugging. @@ -92,9 +89,9 @@ or have access to their variable values while debugging. As this feature is strictly opt-in and the `become` keyword is already reserved, this has no impact on existing code. -(TODO Error messages once an initial implementation exists) + -(TODO migration guidance) + ## Teaching @@ -108,7 +105,7 @@ pitfalls. ### The difference between `return` and `become` [difference]: #difference -The essential difference to `return` is that `become` drops function local variables **before** the function call +The difference to `return` is that `become` drops function local variables **before** the function call instead of after. 
So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): ```rust fn x() { @@ -256,13 +253,11 @@ These checks are: If any of these checks fail a compiler error is issued. -One additional check must be done, if the backend cannot guarantee TCE to be performed a ICE is issued. It is also suggested to ensure that the invariants provided by the prerequisites are maintained during compilation and raising a ICE if this is not the case. +One additional check must be done, if the backend cannot guarantee TCE to be performed a ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising a ICE if this is not the case. Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backwards-compatibility break. -This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. As omitting stack frames is fundamental to the feature described in this RFC, it is suggested to warn a user of this interaction. It might also be possible to add special handling for some features, for example storing a constant number of stack frames separately during debugging. - -Features that depend on drop order can also be impacted by this feature, for example locking mechanisms. +This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. See [drawbacks](#drawbacks) for further discussion. See below for how borrowchecking can be used to implement this feature and the reasoning why operators are not supported. @@ -294,7 +289,7 @@ pub fn fibonacci(n: u64) -> u64 { become fibonacci(n - 2) + fibonacci(n - 1) } ``` -In this case a naive author might assume that this is going to be a stack space efficient implementation since it uses tail recursion instead of normal recursion. 
However, the outcome is more of less the same since the critical recursive calls are not actually in tail call position. +In this case a naive author might assume that this is going to be a stack space efficient implementation since it uses tail recursion instead of normal recursion. However, the outcome is more or less the same since the critical recursive calls are not actually in tail call position. Further confusion could result from the same-signature restriction where the Rust compilers complains that fibonacci and ::add do not share a common signature. @@ -317,12 +312,12 @@ There is also an unwanted interaction between TCE and debugging. As TCE by desig [rationale-and-alternatives]: #rationale-and-alternatives ## Why is this design the best in the space of possible designs? -This design is the best tradeoff between implementation effort and functionality, while also offering a good starting +This design is the best tradeoff between implementation effort and functionality while also offering a good starting point toward further exploration of a more general implementation. To expand on this, compared to other options creating a function local scope with the use of `become` greatly reduces implementation effort. Additionally, limiting -tail-callable functions to those with exactly matching function signatures enforces a common stack layout across all -functions. This should in theory, depending on the backend, allow tail calls to be performed without any stack -shuffling, indeed it is even possible to do so for indirect calls or external functions. +tail-callable functions to those with exactly matching function signatures and calling conventions enforces a common +stack layout across all functions. This should in theory, depending on the backend, allow tail calls to be performed +without any stack shuffling, indeed it is even possible to do so for indirect calls or external functions. 
## What other designs have been considered and what is the rationale for not choosing them? There are some designs that either can not achieve the same performance or functionality as the chosen approach. Though most other designs evolve around how to mark what should be a tail-call or marking what functions can be tail called. There is also the possibility of providing support for a custom backend (e.g. LLVM) or MIR pass. @@ -331,11 +326,11 @@ There are some designs that either can not achieve the same performance or funct There could be a trampoline-based approach ([comment](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-326952763)) that can fulfill the semantic guarantee of using constant stack space, though they can not be used to achieve the performance that the chosen design is capable -of. Additionally, functions need to be known during compile time for these approaches to work. +of. ### Principled Local Goto One alternative would be to support some kind of local goto natively, indeed there exists a -[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design (especially regarding indirect calls / external functions). +[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design especially regarding indirect calls and external functions. 
### Attribute on Function Declaration One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). @@ -345,8 +340,9 @@ theory, this just requires that tail-called functions are callee cleanup, which convention used by Rust. To limit the impact of this change all functions that should be TCE-able should be marked with an attribute. -While quite noisy it is also less flexible than the chosen approach. Indeed TCE is a property of the call and not a -function, sometimes a call should be guaranteed to be TCE and sometimes not, marking a function would be less flexible. +While quite noisy it is also less flexible than the chosen approach. Indeed, TCE is a property of the call and not a +function definition, sometimes a call should be guaranteed to be TCE and sometimes not, marking a function would +be less flexible. ### Attribute on `return` One alternative could be to use an attribute instead of the `become` keyword for function calls. Example: @@ -373,7 +369,7 @@ This would be an error-prone and unergonomic approach to solving this problem. ([source](https://blog.rust-lang.org/inside-rust/2022/04/04/lang-roadmap-2024.html)) This feature provides a crucial optimization for some low-level code. It seems that without this feature there is a big -incentive for developers of those specific applications to use other system-level languages that can perform TCE. +incentive for developers of those specific applications to use other system-level languages that can guarantee TCE. Additionally, this feature enables recursive algorithms that require TCE, which would provide better support for functional programming in Rust. @@ -476,7 +472,7 @@ fn add(a: i32, b: i32) i32 { } ``` -(TODO: What is the community sentiment regarding this feature? 
Except for some bug reports I did not find anything.) + ## Carbon As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCE is of interest even if the implementation is difficult @@ -514,7 +510,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> - What parts of the design do you expect to resolve through the RFC process before this gets merged? - - The main uncertainties are regarding the exact restrictions on when backends can offer TCE, this RFC is intentionally strict to try and require as little as possible from the backends. - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. Currently the RFC specifies that a ICE should be issued if a backend cannot guarantee that TCE will be performed. - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. @@ -529,8 +524,10 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Can dynamic function calls be supported? - Can functions outside the current crate be supported, functions from dynamically loaded libraries? 
- Can functions that abort be supported? - - Is there some way to reduce the impact on debugging? - + - Is there some way to reduce the impact on debugging and other features? +- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of + the solution that comes out of this RFC? + - Supporting general tail calls, the current RFC restricts function signatures which can be loosened independently in the future. # Future possibilities [future-possibilities]: #future-possibilities @@ -552,7 +549,8 @@ is not a reason to accept the current or a future RFC; such notes should be in the section on motivation or rationale in this or subsequent RFCs. The section merely provides additional information. --> ## Helpers -It seems possible to keep the restriction on exactly matching function signatures by offering some kind of placeholder arguments to pad out the differences. For example: +It seems possible to keep the restriction on exactly matching function signatures by offering some kind of placeholder +arguments to pad out the differences. For example: ```rust foo(a: u32, b: u32) { // uses `a` and `b` @@ -571,6 +569,6 @@ bar(a: u32) { ``` ## Functional Programming -This might be a silly idea but if TCE is supported there could be further language extensions to make Rust +This might be wishful thinking but if TCE is supported there could be further language extensions to make Rust more attractive for functional programming paradigms. Though it is unclear to me how far this should be taken or what changes exactly would be a benefit. 
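The motivation text in the patch above mentions trampolining as a workaround that bounds stack depth. A minimal sketch of that idea in stable Rust follows, assuming a simple accumulating sum as the workload. The names `Step`, `sum_step`, and `trampoline` are hypothetical and are not taken from the RFC or from any crate.

```rust
// Hypothetical names throughout: `Step`, `sum_step`, and `trampoline` are
// illustrative only and do not come from the RFC or any library.

/// One bounce of the trampoline: either a final result or the next call to make.
enum Step {
    Done(u64),
    Next { n: u64, acc: u64 },
}

/// A single step of a "recursive" sum 1 + 2 + ... + n.
/// Instead of calling itself, it returns a description of the next call.
fn sum_step(n: u64, acc: u64) -> Step {
    if n == 0 {
        Step::Done(acc)
    } else {
        Step::Next { n: n - 1, acc: acc + n }
    }
}

/// Drives the steps in a loop, so the stack depth stays constant
/// no matter how many steps are taken.
fn trampoline(mut n: u64, mut acc: u64) -> u64 {
    loop {
        match sum_step(n, acc) {
            Step::Done(result) => return result,
            Step::Next { n: next_n, acc: next_acc } => {
                n = next_n;
                acc = next_acc;
            }
        }
    }
}

fn main() {
    // One million "recursive" steps without growing the stack.
    println!("{}", trampoline(1_000_000, 0));
}
```

Every step returns to the driver loop before the next one starts. This keeps the stack flat, but it is also the overhead that, according to the RFC text, keeps trampolines from reaching the performance of a guaranteed tail call.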
From 0abc63af14cab04932ab2452fe65dc8fe2e43d6a Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 20 Apr 2023 20:08:00 +0200 Subject: [PATCH 39/92] passing borrows of local variables is not allowed Co-authored-by: Jacob Lifshay --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index d19e8c8f566..0deae1ab6c2 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -249,7 +249,7 @@ Implementation of this feature requires checks that all prerequisites to guarant These checks are: - The `become` keyword is only used in place of `return`. The intend is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. - The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intend is to assure a compatible stack frame layout. -- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no local variables are passed to the called function, and no further cleanup is necessary. These checks can be done by using the borrowchecker as described in the [Borrowchecking](#semantics) section below. +- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok, since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrowchecker as described in the [Borrowchecking](#semantics) section below. If any of these checks fail a compiler error is issued. 
From b263a9d77e00ad2701f43e01d7765470294d66e6 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 10:27:47 +0200 Subject: [PATCH 40/92] add example for arguments that are calls --- text/0000-explicit-tail-calls.md | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 0deae1ab6c2..3872b032e7a 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -198,6 +198,30 @@ fn next_instruction(mut self) { } ``` +### Function calls as arguments are not tail call eliminated. +([original example](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1516477758)) + +The guarantee of TCE is only provided to the function call that is an argument to `become`, +it is not given to calls that are arguments, see the following example: + +```rust +fn add(a: u64, b: u64) -> u64 { + a + b +} + +pub fn calc(a: u64, b: u64) -> u64 { + if a < b { + return a + } + + let n = a - b; + become add(calc(n, 2), calc(n, 1)); +} +``` + +In this example `become` will guarantee TCE only for the call to `add()` but not for the `calc()` calls. +Running this code will likely end up in a stack overflow as the recursive calls are to `calc()` which are not TCE'd. + ### Omission of the `become` keyword causes the call to be `return` instead. 
([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-278988088)) From 70c074825c6c633f0f8e1e04481aea467f5609cc Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 10:29:59 +0200 Subject: [PATCH 41/92] fix typo in example --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 3872b032e7a..fc9f9a46531 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -232,7 +232,7 @@ fn foo(x: i32) -> i32 { if x % 2 { let x = x / 2; // one branch uses `become` - become foo(new_x); + become foo(x); } else { let x = x + 3; // the other does not From d17ba51ece0833436b62caeab3351bb1d2f2be68 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 10:32:38 +0200 Subject: [PATCH 42/92] reword sentence --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index fc9f9a46531..bf5c4f247c0 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -277,7 +277,7 @@ These checks are: If any of these checks fail a compiler error is issued. -One additional check must be done, if the backend cannot guarantee TCE to be performed a ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising a ICE if this is not the case. +One additional check must be done, if the backend cannot guarantee that TCE will be performed a ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising a ICE if this is not the case. Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backwards-compatibility break. 
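Patch 40 above notes that only the call given to `become` is tail call eliminated, not calls in argument position. The following sketch in stable Rust, which has no `become`, makes the resulting stack growth visible by counting how deeply `calc` nests. `DepthGuard`, `DEPTH`, `MAX_DEPTH`, and `max_depth_of` are hypothetical measurement helpers added for this illustration. The guard exists only for counting and would itself prevent a real tail call.

```rust
use std::cell::Cell;

thread_local! {
    // Current and maximum observed nesting depth of `calc` (illustration only).
    static DEPTH: Cell<u32> = Cell::new(0);
    static MAX_DEPTH: Cell<u32> = Cell::new(0);
}

/// Hypothetical helper: increments the depth on creation, decrements on drop.
struct DepthGuard;

impl DepthGuard {
    fn enter() -> Self {
        DEPTH.with(|d| {
            d.set(d.get() + 1);
            MAX_DEPTH.with(|m| m.set(m.get().max(d.get())));
        });
        DepthGuard
    }
}

impl Drop for DepthGuard {
    fn drop(&mut self) {
        DEPTH.with(|d| d.set(d.get() - 1));
    }
}

fn add(a: u64, b: u64) -> u64 {
    a + b
}

fn calc(a: u64, b: u64) -> u64 {
    let _guard = DepthGuard::enter();
    if a < b {
        return a;
    }
    let n = a - b;
    // With the RFC's feature this line would read `become add(...)`.
    // Both `calc` calls are evaluated here, before `add` is entered,
    // so they are ordinary calls that each need their own stack frame.
    add(calc(n, 2), calc(n, 1))
}

/// Resets the counters, runs `calc`, and reports the deepest nesting seen.
fn max_depth_of(a: u64, b: u64) -> u32 {
    DEPTH.with(|d| d.set(0));
    MAX_DEPTH.with(|m| m.set(0));
    let _ = calc(a, b);
    MAX_DEPTH.with(|m| m.get())
}

fn main() {
    println!("max depth: {}", max_depth_of(20, 1));
}
```

The measured depth grows with the first argument, because the recursion happens in the arguments. Eliminating the frame for the outer `add` call would not change that.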
From 0b47449cb4b5ae6b4ecbb7ef19c4e6da650e8b05 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 11:28:24 +0200 Subject: [PATCH 43/92] merge borrow checking and difference sections --- text/0000-explicit-tail-calls.md | 79 +++++++++++++++++++------------- 1 file changed, 48 insertions(+), 31 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index bf5c4f247c0..56371bb7f02 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -105,38 +105,72 @@ pitfalls. ### The difference between `return` and `become` [difference]: #difference -The difference to `return` is that `become` drops function local variables **before** the function call -instead of after. So the following function ([original example](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1136728427)): +The difference to `return` is that `become` drops function local variables **before** the `become` function call +instead of after. To be more specific a `become` expression acts as if the following events occurred in-order: + +1. All variables that are being passed by-value are moved to temporary storage. +2. All local variables in the caller are destroyed according to usual Rust semantics. Destructors are called where + necessary. Note that values moved from in step 1 are _not_ dropped. +3. The caller's stack frame is removed from the stack. +4. Control is transferred to the callee's entry point. + +This implies that it is invalid for any references into the caller's stack frame to outlive the call. The borrow checker ensures that none of the above steps will result in the use of a value that has gone out of scope. + +See the [example](#the-difference-between-return-and-become) below on how `become` causes drops to be elaborated. 
+ + ```rust -fn x() { +fn x(_arg_zero: Box<()>, _arg_one: ()) { let a = Box::new(()); let b = Box::new(()); - become y(a); + let c = Box::new(()); + + become y(a, foo(b)); } ``` The drops will be elaborated by the compiler like this: ```rust -fn x() { +fn x(_arg_zero: Box<()>, _arg_one: ()) { let a = Box::new(()); let b = Box::new(()); - drop(b); // `a` is not dropped because it is moved to the callee - become y(a); + let c = Box::new(()); + + // Move become arguments to temporary variables. + let function_ptr = y; // The function pointer could be the result of an expression like: fn_list[fn_idx]; + let tmp_arg0 = a; + let tmp_arg1 = foo(b); + + // End of the function, all variables not used in the `become` call are dropped, as would be done after a `return`. + // Return value of foo() is *not* dropped as it is used in the become call to y(). + drop(c); + // `b` is *not* dropped because it is moved due to the call to foo(). + // `a` is *not* dropped as it is used in the become call to y(). + drop(_arg_one); + drop(_arg_zero); + + // Finally, `become` the called function. + become function_ptr(tmp_arg0, tmp_arg1); } ``` If we used `return` instead, the drops would happen after the call: ```rust -fn x() { +fn x(_arg_zero: Box<()>, _arg_one: ()) { let a = Box::new(()); let b = Box::new(()); - let tmp = y(a); - drop(b); // `a` is not dropped because it is moved to the callee - return tmp; + let c = Box::new(()); + return y(a, foo(b)); + // Normal drop order: + // Return value of foo() is dropped. + // drop(c); + // `b` is *not* dropped because it is moved due to the call to foo(). + // `a` is *not* dropped because it is moved to the callee y(). + // drop(_arg_one); + // drop(_arg_zero); } ``` - This early dropping allows the compiler to avoid many complexities associated with deciding if the stack frame can be reused. 
Instead, the heavy lifting is done by the borrow checker, which will produce a lifetime error if references to local variables are passed to the called function. This is distinct from `return`, which _does_ allow references to @@ -273,7 +307,7 @@ Implementation of this feature requires checks that all prerequisites to guarant These checks are: - The `become` keyword is only used in place of `return`. The intend is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. - The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intend is to assure a compatible stack frame layout. -- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok, since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrowchecker as described in the [Borrowchecking](#semantics) section below. +- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok, since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. If any of these checks fail a compiler error is issued. 
@@ -283,24 +317,7 @@ Note that as `become` is a keyword reserved for exactly the use-case described i This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. See [drawbacks](#drawbacks) for further discussion. -See below for how borrowchecking can be used to implement this feature and the reasoning why operators are not supported. - -## Borrowchecking -[semantics]: #semantics -A `become` expression acts as if the following events occurred in-order: - -1. All variables that are being passed by-value are moved to temporary storage. -2. All local variables in the caller are destroyed according to usual Rust semantics. Destructors are called where - necessary. Note that values moved from in step 1 are _not_ dropped. -3. The caller's stack frame is removed from the stack. -4. Control is transferred to the callee's entry point. - -This implies that it is invalid for any references into the caller's stack frame to outlive the call. The borrow checker ensures that none of the above steps will result in the use of a value that has gone out of scope. - -As `become` is always in a tail position (due to being used in place of `return`), this requirement for TCE is already -fulfilled. - -See this earlier [example](#the-difference-between-return-and-become) on how become causes drops to be elaborated. +See below for the reasoning why operators are not supported. ## Operators are not supported Invocations of operators were considered as valid targets, but were rejected on grounds of being too error-prone. 
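The drop ordering that this patch describes for `return` versus `become` can be observed on stable Rust. The following is a minimal sketch, not part of the RFC: `Noisy`, `callee`, `with_return`, `like_become`, and `run` are hypothetical names introduced only for illustration, and `like_become` merely emulates the elaboration by hand (argument moved to a temporary, remaining locals dropped, then the call), since `become` itself is not assumed to be available.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Shared event log; every drop and call appends one entry.
type Log = Rc<RefCell<Vec<String>>>;

/// A value that records its own destruction in the log.
struct Noisy {
    name: &'static str,
    log: Log,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Hypothetical callee: records that it ran, then consumes its argument.
fn callee(arg: Noisy) {
    let log = arg.log.clone();
    log.borrow_mut().push("callee runs".to_string());
    // `arg` is dropped here, at the end of the callee.
}

/// Plain `return`: the unused local `_b` is dropped only after `callee` has returned.
fn with_return(log: &Log) {
    let a = Noisy { name: "a", log: log.clone() };
    let _b = Noisy { name: "b", log: log.clone() };
    return callee(a);
}

/// Hand-written emulation of the `become` elaboration: move the argument to a
/// temporary, drop every other local, and only then make the call.
fn like_become(log: &Log) {
    let a = Noisy { name: "a", log: log.clone() };
    let b = Noisy { name: "b", log: log.clone() };
    let tmp_arg0 = a;
    drop(b);
    callee(tmp_arg0)
}

/// Runs one variant against a fresh log and returns the recorded events.
fn run(f: fn(&Log)) -> Vec<String> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    f(&log);
    let events = log.borrow().clone();
    events
}
```

Comparing `run(with_return)` with `run(like_become)` shows the only difference: the position of the `drop b` entry relative to `callee runs`. This is also why passing a reference to `b` into the callee could not be allowed in the second variant.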
From 1b4b6c2ac39eb85c38be3be79249dde7b58e3a6f Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 11:37:18 +0200 Subject: [PATCH 44/92] do a grammar pass --- text/0000-explicit-tail-calls.md | 58 ++++++++++++++++---------------- 1 file changed, 29 insertions(+), 29 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 56371bb7f02..1c26ce8854d 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -5,18 +5,18 @@ # Summary [summary]: #summary -While tail call elimination (TCE) is already possible via tail call optimization (TCO) in Rust, there is no way to guaranteed that a stack frame should be reused. +While tail call elimination (TCE) is already possible via tail call optimization (TCO) in Rust, there is no way to guarantee that a stack frame should be reused. This RFC describes a language feature providing tail call elimination via the `become` keyword providing this guarantee. If this guarantee can not be provided by the compiler a compile time error is generated instead. # Motivation [motivation]: #motivation Tail call elimination (TCE) allows stack frames to be reused. -While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations TCE will only be applied if the compiler expects a improvement by doing so. +While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations TCE will only be applied if the compiler expects an improvement by doing so. There is currently no way to specify that TCE should be guaranteed. This guarantee is interesting for two general goals. One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this guarantee. 
-The other goal is to avoid paying the cost to create a new stack frame, replacing `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. This goal also depend on the guarantee as otherwise a subtle change or a new compiler version can have a unexpected impact on performance. +The other goal is to avoid paying the cost to create a new stack frame, replacing `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. This goal also depends on the guarantee as otherwise a subtle change or a new compiler version can have an unexpected impact on performance. Note that workarounds for the first goal exist by using trampolining which limits the stack depth. However, while this functionality can be provided as a library, inclusion in the language can provide greater adoption of a more functional @@ -26,8 +26,8 @@ For the second goal, TCO can have the intended effect, however, there is no guar Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, allowing code to be written in a continuation-passing style, using recursive algorithms without the danger of -overflowing the stack, or guaranteeing significantly faster interpreters / emulators. One common example of the -usefulness of tail calls in C is improving performance of Protobuf parsing as described in this +overflowing the stack or guaranteeing significantly faster interpreters/emulators. One common example of the +usefulness of tail calls in C is improving the performance of Protobuf parsing as described in this [blog post](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), this approach would then also be possible in Rust. 
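The trampolining workaround mentioned in the motivation text above can be sketched in stable Rust as follows. This is an illustrative assumption about what such a workaround typically looks like, not code from the RFC or from any particular crate; `Step`, `sum_to`, and `trampoline` are hypothetical names.

```rust
/// One step of a trampolined computation: either a final value or the next step to run.
enum Step {
    Done(u64),
    More(Box<dyn FnOnce() -> Step>),
}

/// Sum of 1..=n in accumulator style. Instead of calling itself directly,
/// it returns a closure describing the next call.
fn sum_to(n: u64, acc: u64) -> Step {
    if n == 0 {
        Step::Done(acc)
    } else {
        Step::More(Box::new(move || sum_to(n - 1, acc + n)))
    }
}

/// The trampoline: a loop that keeps running steps, so the stack depth stays
/// constant no matter how many steps are taken.
fn trampoline(mut step: Step) -> u64 {
    loop {
        match step {
            Step::Done(value) => return value,
            Step::More(next) => step = next(),
        }
    }
}
```

A call such as `trampoline(sum_to(1_000_000, 0))` does not grow the stack, but each step pays for a heap allocation and an indirect call. That cost is the reason trampolining addresses only the first goal (bounded stack usage) and not the second (performance).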
@@ -76,7 +76,7 @@ This requires that the function signature and calling convention of the calling Additionally, there is a further restriction on the arguments. The stack frame of the calling function is reused, it is essentially cleaned up and the called function takes the space. -As a result it is not possible to pass references to local variables, neither will the called function "return" to the calling function. So all variables not used as an argument are dropped before the call and no cleanup will be done after the call. +As a result, it is not possible to pass references to local variables, nor will the called function "return" to the calling function. So all variables not used as an argument are dropped before the call and no cleanup will be done after the call. If any of these restrictions are not met when using `become` a compilation error is thrown. @@ -110,7 +110,7 @@ instead of after. To be more specific a `become` expression acts as if the follo 1. All variables that are being passed by-value are moved to temporary storage. 2. All local variables in the caller are destroyed according to usual Rust semantics. Destructors are called where - necessary. Note that values moved from in step 1 are _not_ dropped. + necessary. Note that values moved from step 1 are _not_ dropped. 3. The caller's stack frame is removed from the stack. 4. Control is transferred to the callee's entry point. @@ -194,7 +194,7 @@ fn sum_list(data: Vec, mut offset: usize, mut accum: u64) -> u64 { ### Use Case 2: Interpreter -In an interpreter the usual loop is to get an instruction, match on that instruction to find the corresponding function, **call** that function, and finally return to the loop to get the next instruction. (This is a simplified example.) +For an interpreter, the usual loop is to get an instruction, match on that instruction to find the corresponding function, **call** that function, and finally return to the loop to get the next instruction. 
(This is a simplified example.) ```rust fn exec_instruction(mut self) { @@ -305,22 +305,22 @@ fn bar(n: i32) { The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work. --> Implementation of this feature requires checks that all prerequisites to guarantee TCE are fulfilled. These checks are: -- The `become` keyword is only used in place of `return`. The intend is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. -- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intend is to assure a compatible stack frame layout. -- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok, since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. +- The `become` keyword is only used in place of `return`. The intent is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. +- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to assure a compatible stack frame layout. +- The stack frame of the calling function is reused, this also implies that the function is never returned to. 
The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. If any of these checks fail a compiler error is issued. -One additional check must be done, if the backend cannot guarantee that TCE will be performed a ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising a ICE if this is not the case. +One additional check must be done, if the backend cannot guarantee that TCE will be performed an ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising an ICE if this is not the case. -Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backwards-compatibility break. +Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backward-compatibility break. This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. See [drawbacks](#drawbacks) for further discussion. See below for the reasoning why operators are not supported. ## Operators are not supported -Invocations of operators were considered as valid targets, but were rejected on grounds of being too error-prone. +Invocations of operators were considered as valid targets but were rejected on grounds of being too error-prone. In any case, these can still be called as methods. 
One example of their error-prone nature ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1167112296)): ```rust pub fn fibonacci(n: u64) -> u64 { @@ -330,9 +330,9 @@ pub fn fibonacci(n: u64) -> u64 { become fibonacci(n - 2) + fibonacci(n - 1) } ``` -In this case a naive author might assume that this is going to be a stack space efficient implementation since it uses tail recursion instead of normal recursion. However, the outcome is more or less the same since the critical recursive calls are not actually in tail call position. +In this case, a naive author might assume that this is going to be a stack space-efficient implementation since it uses tail recursion instead of normal recursion. However, the outcome is more or less the same since the critical recursive calls are not actually in tail call position. -Further confusion could result from the same-signature restriction where the Rust compilers complains that fibonacci and ::add do not share a common signature. +Further confusion could result from the same-signature restriction where the Rust compiler raises an error since fibonacci and ::add do not share a common signature. # Drawbacks [drawbacks]: #drawbacks @@ -340,13 +340,13 @@ Further confusion could result from the same-signature restriction where the Rus As this feature should be mostly independent of other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. The primary effort, however, lies in supporting this feature in the backends: -- LLVM supports a `musttail` marker to indicate that TCE should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature, seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). 
+- LLVM supports a `musttail` marker to indicate that TCE should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). - GCC does seem to support an equivalent `musttail` marker, though it is only accessible via the [libgccjit API](https://gcc.gnu.org/onlinedocs/gcc-7.3.0/jit/topics/expressions.html#gcc_jit_rvalue_set_bool_require_tail_call) ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1160013809)). -- WebAssembly accepted tail-calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. +- WebAssembly accepted tail calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version is likely to be useful for a more comprehensive version. -There is also an unwanted interaction between TCE and debugging. As TCE by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCE provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCE for debugging builds as then the stack could overflow. (Still maybe a compiler flag could be provided to temporarily disable TCE for debugging builds. As suggested [here](https://github.com/rust-lang/rfcs/pull/3407/files#r1159817279), another option would be special support for `become` by a debugger. 
With this support the debugger would keep track of the N most recent calls providing at least some context to the bug.) +There is also an unwanted interaction between TCE and debugging. As TCE by design elides stack frames this information is lost during debugging, that is the parent functions and their local variable values are incomplete. As TCE provides a semantic guarantee of constant stack usage it is also not generally possible to disable TCE for debugging builds as then the stack could overflow. (Still, maybe a compiler flag could be provided to temporarily disable TCE for debugging builds. As suggested [here](https://github.com/rust-lang/rfcs/pull/3407/files#r1159817279), another option would be special support for `become` by a debugger. With this support the debugger would keep track of the N most recent calls providing at least some context to the bug.) # Rationale and alternatives @@ -371,10 +371,10 @@ of. ### Principled Local Goto One alternative would be to support some kind of local goto natively, indeed there exists a -[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design especially regarding indirect calls and external functions. +[pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design, especially regarding indirect calls and external functions. 
### Attribute on Function Declaration -One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). +One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow-up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). The goal behind this design is to allow TCE of functions that do not have exactly matching function signatures, in theory, this just requires that tail-called functions are callee cleanup, which is a mismatch to the default calling @@ -382,7 +382,7 @@ convention used by Rust. To limit the impact of this change all functions that s an attribute. While quite noisy it is also less flexible than the chosen approach. Indeed, TCE is a property of the call and not a -function definition, sometimes a call should be guaranteed to be TCE and sometimes not, marking a function would +function definition, sometimes a call should be guaranteed to be TCE, and sometimes not, marking a function would be less flexible. ### Attribute on `return` @@ -399,7 +399,7 @@ fn a() { This alternative mostly comes down to taste (or bikeshedding) and `become` was chosen as it is [reserved](https://rust-lang.github.io/rfcs/0601-replace-be-with-become.html) for this use, shorter to write, and as drop order changes compared to `return` a new keyword seems warranted. -### Custom compiler or MIR passes +### Custom Compiler or MIR Passes One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally. 
While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). This would be an error-prone and unergonomic approach to solving this problem. @@ -439,12 +439,12 @@ Note that while precedent set by other languages is some motivation, it does not Please also take into consideration that rust sometimes intentionally diverges from common language features. --> Functional languages (such as OCaml, SML, Haskell, Scheme, and F#) usually depend on proper tail calls as a language -feature (TCE for general calls). For system-level languages TCE is usually wanted but implementation +feature (TCE for general calls). For system-level languages, TCE is usually wanted but implementation effort is a common reason this is not yet done. Even languages with managed code such as .Net or ECMAScript (as per the standard) also support TCE, again performance and resource usage were the main motivators for their implementation. -See below for a more detailed description on select compilers and languages. +See below for a more detailed description of select compilers and languages. ## Clang @@ -495,11 +495,11 @@ There is also a [proposal](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n292 ## GCC -GCC does not support a feature equivalent to Clang's `musttail`, there also does not seem to be push to implement it ([pipermail](https://gcc.gnu.org/pipermail/gcc/2021-April/235882.html)) (as of 2021). However, there also exists a experimental [plugin](https://github.com/pietro/gcc-musttail-plugin) for GCC last updated in 2021. +GCC does not support a feature equivalent to Clang's `musttail`, there also does not seem to be a push to implement it ([pipermail](https://gcc.gnu.org/pipermail/gcc/2021-April/235882.html)) (as of 2021). 
However, there also exists an experimental [plugin](https://github.com/pietro/gcc-musttail-plugin) for GCC last updated in 2021. ## Zig -Zig provides separate syntax to allow more flexibility than normal function calls. There are options for async calls, inlining, compile time evaluation of the called function, or specifying TCE on the call. +Zig provides separate syntax to allow more flexibility than normal function calls. There are options for async calls, inlining, compile-time evaluation of the called function, or specifying TCE on the call. ([source](https://ziglang.org/documentation/master/#call)) ```zig const expect = @import("std").testing.expect; @@ -551,10 +551,10 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? --> - What parts of the design do you expect to resolve through the RFC process before this gets merged? - - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. Currently the RFC specifies that a ICE should be issued if a backend cannot guarantee that TCE will be performed. + - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. Currently, the RFC specifies that an ICE should be issued if a backend cannot guarantee that TCE will be performed. - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. 
- Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. - - Should a lint be added for functions that are marked to be tail call or use become. See discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). + - Should a lint be added for functions that are marked to be a tail call or use `become`? See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - Can the restrictions on function signatures be relaxed? From 578b33f31a51914a05ae49933d66c456d9fb6de6 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 12:37:58 +0200 Subject: [PATCH 45/92] improve accuracy of steps done by become Co-authored-by: Gary Guo --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 1c26ce8854d..6092f69a5d7 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -108,7 +108,7 @@ pitfalls.
The difference to `return` is that `become` drops function local variables **before** the `become` function call instead of after. To be more specific a `become` expression acts as if the following events occurred in-order: -1. All variables that are being passed by-value are moved to temporary storage. +1. Function call arguments are evaluated into temporary storage. If a local variable is used as a value in the arguments, it is moved. 2. All local variables in the caller are destroyed according to usual Rust semantics. Destructors are called where necessary. Note that values moved from step 1 are _not_ dropped. 3. The caller's stack frame is removed from the stack. From dae7085d21bb252e154d1a8514b7270be06c0805 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 22 Apr 2023 12:40:58 +0200 Subject: [PATCH 46/92] fix mistake in drop order example --- text/0000-explicit-tail-calls.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 6092f69a5d7..cca18c29b14 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -142,7 +142,7 @@ fn x(_arg_zero: Box<()>, _arg_one: ()) { let tmp_arg1 = foo(b); // End of the function, all variables not used in the `become` call are dropped, as would be done after a `return`. - // Return value of foo() is *not* dropped as it is used in the become call to y(). + // Return value of foo() is *not* dropped as it is moved in the become call to y(). drop(c); // `b` is *not* dropped because it is moved due to the call to foo(). // `a` is *not* dropped as it is used in the become call to y(). @@ -162,7 +162,7 @@ fn x(_arg_zero: Box<()>, _arg_one: ()) { let c = Box::new(()); return y(a, foo(b)); // Normal drop order: - // Return value of foo() is dropped. + // Return value of foo() is *not* dropped as it is moved in the call to y(). // drop(c); // `b` is *not* dropped because it is moved due to the call to foo(). 
// `a` is *not* dropped because it is moved to the callee y(). From acef7093de4701a9f63aca7d1b3b06e40b192ced Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 24 Apr 2023 09:52:09 +0200 Subject: [PATCH 47/92] be more clear on where become can be used --- text/0000-explicit-tail-calls.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index cca18c29b14..f8bb14d3225 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -58,8 +58,10 @@ TCE is interesting for two groups of programmers: Those that want to use recursi which can overflow the stack if the stack frame is not reused; and those that want to create highly optimized code as creating new stack frames can be expensive. -To request TCE the `become` keyword can be used instead of `return` and only there. -However, several requirements need to be fulfilled for TCE (and TCO) to work. +To request TCE the `become` keyword can be used instead of `return`. +Note that, as both keywords act the same in terms of *control flow*, +`become` can be used everywhere that `return` is used. +However, there are several requirements on the called function which need to be fulfilled for TCE (and TCO) to work. The main restriction is that the argument to `become` is a tail call, a call that is the last action performed in the function. @@ -275,7 +277,7 @@ fn foo(x: i32) -> i32 { } ``` -### Alternating `become` and `return` calls +### Alternating `become` and `return` calls still grow the stack ([original example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-279062656)) Here one function uses `become` the other `return`, this is another potential source of confusion. This mutual recursion
This mutual recursion From e8df19902b745110c4e7f4823499342bdb6f0ea6 Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 24 Apr 2023 10:01:24 +0200 Subject: [PATCH 48/92] update description comparing return and become --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index f8bb14d3225..7c0d20df155 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -59,7 +59,7 @@ which can overflow the stack if the stack frame is not reused; and those that wa as creating new stack frames can be expensive. To request TCE the `become` keyword can be used instead of `return`. -Note that, as both keywords act the same in terms of *control flow*, +Note that, as both keywords act as the end of the function, `become` can be used everywhere that `return` is used. However, there are several requirements on the called function which need to be fulfilled for TCE (and TCO) to work. From 4dc10e8a0067fef0e4c43ad27836934823d886a4 Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 24 Apr 2023 16:28:01 +0200 Subject: [PATCH 49/92] add return become alternative --- text/0000-explicit-tail-calls.md | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 7c0d20df155..e47d4e24495 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -387,19 +387,31 @@ While quite noisy it is also less flexible than the chosen approach. Indeed, TCE function definition, sometimes a call should be guaranteed to be TCE, and sometimes not, marking a function would be less flexible. -### Attribute on `return` -One alternative could be to use an attribute instead of the `become` keyword for function calls. Example: +### Adding a mark to `return`. 
+ +The return keyword could be marked using an attribute or an extra keyword as in the example below. ```rust fn a() { + // The chosen variant. become b(); - // or + + // Using an attribute. #[become] return b(); + + // Adding an extra keyword. + return become b(); } ``` -This alternative mostly comes down to taste (or bikeshedding) and `become` was chosen as it is [reserved](https://rust-lang.github.io/rfcs/0601-replace-be-with-become.html) for this use, shorter to write, and as drop order changes compared to `return` a new keyword seems warranted. +These alternatives mostly come down to personal taste (or bikeshedding) and the plain keyword `become` was chosen because of the following reasons: + +- It is [reserved](https://rust-lang.github.io/rfcs/0601-replace-be-with-become.html) for exactly this use case. +- It is shorter to write. +- The behavior changes in subtle ways compared to a plain `return`. To clearly indicate this change in behavior a stronger distinction from `return` than adding a mark seems warranted. + - TCE as proposed in this RFC requires dropping local variables before the function call instead of after with `return`. + - From a type system perspective the type of the `return` expression (`!`, the never type, see [here](https://doc.rust-lang.org/std/primitive.never.html) for an example) stays the same even when adding one of the markings. This means that type-checking can not help if the marking is forgotten or added mistakenly. (Note that the argument, the function call, can still be type checked, just not the `return` expression.) ### Custom Compiler or MIR Passes One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally.
While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). From 3c201192f82ceadadf7345c49e9679b2771be1db Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 3 May 2023 13:55:33 +0200 Subject: [PATCH 50/92] add alternative: explicit dropping of variables --- text/0000-explicit-tail-calls.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index e47d4e24495..3aa35f9d6c7 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -412,6 +412,25 @@ These alternatives mostly come down to personal taste (or bikeshedding) and the - The behavior changes in subtle ways compared to a plain `return`. To clearly indicate this change in behavior a stronger distinction from `return` than adding a mark seems warranted. - TCE as proposed in this RFC requires dropping local variables before the function call instead of after with `return`. - From a type system perspective the type of the `return` expression (`!`, the never type, see [here](https://doc.rust-lang.org/std/primitive.never.html) for an example) stays the same even when adding one of the markings. This means that type-checking can not help if the marking is forgotten or added mistakenly. (Note that the argument, the function call, can still be type checked, just not the `return` expression.) + +### Require Explicit Dropping of Variables + +(Based on this [comment](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1532841475)) + +An alternative approach could be to refuse to compile functions that would need to run destructors before `become`. 
+This would force code to not rely on implicit drops and require calls to `drop(variable)`, as in the following example: + +```rust +fn f(x: String) { + drop(x); // necessary + become g(); +} +``` + +This approach would result in more verbose code but would also be easier to read for people not familiar with tail calls. +Also, it is forwards compatible with implicit dropping before `become`. + +The reason this approach is not chosen is that the tradeoff between increased verbosity and the reduction of initial learning time seems to not be worth it. ### Custom Compiler or MIR Passes One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally. While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). From f46aa13d1516a0f1f3236c8f998fdb511e6c98b9 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 5 May 2023 14:35:18 +0200 Subject: [PATCH 51/92] fix date formatting Co-authored-by: oxalica --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 3aa35f9d6c7..1ca97c9bb41 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -342,7 +342,7 @@ Further confusion could result from the same-signature restriction where the Rus As this feature should be mostly independent of other features the main drawback lies in the implementation and maintenance effort. This feature adds a new keyword which will need to be implemented not only in Rust but also in other tooling. 
The primary effort, however, lies in supporting this feature in the backends: -- LLVM supports a `musttail` marker to indicate that TCE should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 30.03.23). +- LLVM supports a `musttail` marker to indicate that TCE should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 2023-03-30). - GCC does seem to support an equivalent `musttail` marker, though it is only accessible via the [libgccjit API](https://gcc.gnu.org/onlinedocs/gcc-7.3.0/jit/topics/expressions.html#gcc_jit_rvalue_set_bool_require_tail_call) ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1160013809)). - WebAssembly accepted tail calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. From b6b309492602bf0a5b1733022234110ca2678632 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 6 May 2023 11:14:50 +0200 Subject: [PATCH 52/92] update with discussion on WebAssembly --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 1ca97c9bb41..541b0ad2d31 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -344,7 +344,7 @@ maintenance effort. This feature adds a new keyword which will need to be implem tooling. 
The primary effort, however, lies in supporting this feature in the backends: - LLVM supports a `musttail` marker to indicate that TCE should be performed [docs](https://llvm.org/docs/LangRef.html#id327). Clang which already depends on this feature seems to only generate correct code for the x86 backend [source](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1490009983) (as of 2023-03-30). - GCC does seem to support an equivalent `musttail` marker, though it is only accessible via the [libgccjit API](https://gcc.gnu.org/onlinedocs/gcc-7.3.0/jit/topics/expressions.html#gcc_jit_rvalue_set_bool_require_tail_call) ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1160013809)). -- WebAssembly accepted tail calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. +- For WebAssembly see the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1186003262). While general tail calls seem far off, supporting only matching function signatures seems more feasible. Also note that WebAssembly accepted tail calls into the [standard](https://github.com/WebAssembly/proposals/pull/157/) and Cranelift is now [working](https://github.com/bytecodealliance/rfcs/pull/29) towards supporting it. Additionally, this proposal is limited to exactly matching function signatures which will *not* allow general tail-calls, however, the work towards this initial version is likely to be useful for a more comprehensive version. 
From 94916aa0540e545ef89cd95c86821f849ce1fad4 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 09:59:33 +0200 Subject: [PATCH 53/92] update summary and motivation --- text/0000-explicit-tail-calls.md | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 541b0ad2d31..67af8a678d7 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -5,16 +5,18 @@ # Summary [summary]: #summary -While tail call elimination (TCE) is already possible via tail call optimization (TCO) in Rust, there is no way to guarantee that a stack frame should be reused. +While tail call elimination (TCE) is already possible via tail call optimization (TCO) in Rust, there is no way to guarantee that a stack frame must be reused. This RFC describes a language feature providing tail call elimination via the `become` keyword providing this guarantee. If this guarantee can not be provided by the compiler a compile time error is generated instead. # Motivation [motivation]: #motivation Tail call elimination (TCE) allows stack frames to be reused. -While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations TCE will only be applied if the compiler expects an improvement by doing so. -There is currently no way to specify that TCE should be guaranteed. -This guarantee is interesting for two general goals. +While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations, TCO will only be applied if the compiler expects an improvement by doing so. +However, the compiler can't have ideal analysis and thus will not always be correct in judging if an optimization should be applied. +This RFC shows an approach by which TCE can be guaranteed. + +The guarantee for TCE is interesting for two general goals. 
One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this guarantee. The other goal is to avoid paying the cost to create a new stack frame, replacing `call` instructions by `jmp` instructions, this optimization has performance implications and can provide massive speedups for algorithms that have a high density of function calls. This goal also depends on the guarantee as otherwise a subtle change or a new compiler version can have an unexpected impact on performance. From ae758f560cda30d83d4d6b6294bc1ac4bc333455 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 10:49:35 +0200 Subject: [PATCH 54/92] update tail call elimination section --- text/0000-explicit-tail-calls.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 67af8a678d7..9f80bc2b4ad 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -53,17 +53,17 @@ Pretending this RFC has already been accepted into Rust, it could be explained t [tail-call-elimination]: #tail-call-elimination Rust supports a way to guarantee tail call elimination (TCE) for function calls using the `become` keyword. -If TCE is requested for a call the called function will reuse the stack frame of the calling function, assuming all requirements are fulfilled. +If TCE is requested for a call, and all requirements are fulfilled, the called function will reuse the stack frame of the calling function. +The requirements, described in detail below, are checked by the compiler and a compiler error will be raised if they are not met. Note that TCE can opportunistically also be performed by Rust using tail call optimization (TCO), this will cause TCE to be used if it is deemed to be "better" (as in faster, or smaller if optimizing for space). 
TCE is interesting for two groups of programmers: Those that want to use recursive algorithms, which can overflow the stack if the stack frame is not reused; and those that want to create highly optimized code as creating new stack frames can be expensive. -To request TCE the `become` keyword can be used instead of `return`. -Note that, as both keywords act as the end of the function, -`become` can be used everywhere that `return` is used. -However, there are several requirements on the called function which need to be fulfilled for TCE (and TCO) to work. +The `become` keyword can be thought of as similar to `return`, as both keywords act as the end of the current function. +The main difference is that the argument to `become` needs to be a function call. +However, there are several requirements on the called function which need to be fulfilled for TCE to be guaranteed; these are checked by the compiler. The main restriction is that the argument to `become` is a tail call, a call that is the last action performed in the function. @@ -79,8 +79,11 @@ As the stack frame is to be reused it needs to be similar enough for both functi This requires that the function signature and calling convention of the calling and called function need to match exactly. Additionally, there is a further restriction on the arguments. -The stack frame of the calling function is reused, it is essentially cleaned up and the called function takes the space. -As a result, it is not possible to pass references to local variables, nor will the called function "return" to the calling function. So all variables not used as an argument are dropped before the call and no cleanup will be done after the call. +As the stack frame of the calling function is reused, it needs to be cleaned up, so that the called function can take the space. 
+This is nearly identical to the clean up that happens when returning from a function, +all local variables, that are not returned or in the case of `become` used in the function call, are dropped. +For `become`, however, dropping necessarily happens before entering the called function. +As a result, it is not possible to pass references to local variables, nor will the called function "return" to the calling function. If any of these restrictions are not met when using `become` a compilation error is thrown. From 2824fc9042ac77afcf8193f4f279e19d4bfec758 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 10:51:17 +0200 Subject: [PATCH 55/92] use slice instead of list example Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 9f80bc2b4ad..97274a5fd13 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -188,13 +188,10 @@ between `return` and `become`. A simple example is the following algorithm for summing the elements of a `Vec`. While this would usually be done with iteration in Rust, this example illustrates a simple use of `become`. Without TCE, this example could overflow the stack. ```rust -fn sum_list(data: Vec, mut offset: usize, mut accum: u64) -> u64 { - if offset < data.len() { - accum += data[offset]; - offset += 1; - become sum_list(data, offset, accum); // <- become here - } else { - accum // <- equivalent to `return accum;` +fn sum_slice(data: &[u64], accumulator: u64) -> u64 { + match data { + [first, rest @ ..] 
=> become sum_slice(rest, accumulator + first), + [] => accumulator, } } ``` From da261a4bcc912f065296b9622443f95d4ef406a5 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 10:53:26 +0200 Subject: [PATCH 56/92] update wording for use case 1 --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 97274a5fd13..af85b152e30 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -185,7 +185,7 @@ local variables to be passed. Indeed, this difference in the handling of local between `return` and `become`. ### Use Case 1: Recursive Algorithm -A simple example is the following algorithm for summing the elements of a `Vec`. While this would usually be done with iteration in Rust, this example illustrates a simple use of `become`. Without TCE, this example could overflow the stack. +A simple example is the following algorithm for summing the elements of a slice. While this would usually be done with iteration in Rust, this example illustrates a simple use of `become`. Without guaranteed TCE, this example could overflow the stack if TCO is not applied. ```rust fn sum_slice(data: &[u64], accumulator: u64) -> u64 { From 5c2485e4a48e16386e4e1709055405c1c8d2f68e Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 10:59:25 +0200 Subject: [PATCH 57/92] update alternating example description Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index af85b152e30..a5fe5e78df5 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -284,7 +284,7 @@ fn foo(x: i32) -> i32 { Here one function uses `become` the other `return`, this is another potential source of confusion. This mutual recursion would eventually overflow the stack. 
As mutual recursion can also happen across more functions, `become` needs to be -used consistently in all functions if TCO should be guaranteed. (Maybe it is also possible to create a lint for these +used consistently in all functions if TCE is desired. (Maybe it is also possible to create a lint for these use cases as well.) ```rust From 0e4a645a87243116fa9d84bf0f77047cc5b79df7 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 11:04:01 +0200 Subject: [PATCH 58/92] add impl effort for explicit drop alternative --- text/0000-explicit-tail-calls.md | 1 + 1 file changed, 1 insertion(+) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index a5fe5e78df5..7d0b8c82be3 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -433,6 +433,7 @@ This approach would result in more verbose code but would also be easier to read Also, it is forwards compatible with implicit dropping before `become`. The reason this approach is not chosen is that the tradeoff between increased verbosity and the reduction of initial learning time seems to not be worth it. +Additionally, implementing the diagnostic for forgotten drops can be expected to be more effort than for correct drop elaboration. ### Custom Compiler or MIR Passes One more distant alternative would be to support a custom compiler or MIR pass so that this optimization can be done externally. While supported for LLVM [Zulip](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/.E2.9C.94.20Running.20Custom.20LLVM.20Pass/near/320275483), for MIR this is not supported [discussion](https://internals.rust-lang.org/t/mir-compiler-plugins-for-custom-mir-passes/3166/10). 
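The difference in drop timing that motivates the explicit-drop alternative in the patches above can be observed on stable Rust, without `become`. The sketch below is only an illustration under stated assumptions: `Noisy`, `LOG`, `log`, `take_log`, `with_return`, and `with_explicit_drop` are hypothetical names, not part of the RFC, and the second function merely imitates the order that `become g()` would presumably produce by dropping the local by hand before an ordinary call.

```rust
use std::cell::RefCell;

thread_local! {
    // Hypothetical event log, used only to make the drop order visible.
    static LOG: RefCell<Vec<String>> = RefCell::new(Vec::new());
}

fn log(event: &str) {
    LOG.with(|l| l.borrow_mut().push(event.to_string()));
}

// Returns the recorded events and clears the log.
fn take_log() -> Vec<String> {
    LOG.with(|l| std::mem::take(&mut *l.borrow_mut()))
}

// Hypothetical helper type that records when it is dropped.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        log(&format!("drop {}", self.0));
    }
}

fn g() {
    log("enter g");
}

// Plain `return`: the local is dropped only after the call to `g` has finished.
fn with_return() {
    let _x = Noisy("x");
    return g();
}

// Imitates the order described for `become g()`: the local is dropped first.
// The call itself is an ordinary call here, since `become` is not stable.
fn with_explicit_drop() {
    let x = Noisy("x");
    drop(x);
    g()
}

fn main() {
    with_return();
    println!("return:        {:?}", take_log());
    with_explicit_drop();
    println!("explicit drop: {:?}", take_log());
}
```

With `return`, the log records the call before the drop. The explicit-drop variant records the drop first, which is the order the RFC describes for `become`.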
From eeaa80c6db06ab1e9a7ecb755844e8b3da7baef8 Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 12 May 2023 11:16:35 +0200 Subject: [PATCH 59/92] resolving unresolved questions --- text/0000-explicit-tail-calls.md | 3 --- 1 file changed, 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 7d0b8c82be3..41a79e4adff 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -595,11 +595,8 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - Can the restrictions on function signatures be relaxed? - One option for intra-crate direct calls is to automatically pad the arguments during compilation see [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). Does this have an influence on other calls? How much implementation effort is it? - - Can generic functions be supported? - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can closures be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - - Can dynamic function calls be supported? - - Can functions outside the current crate be supported, functions from dynamically loaded libraries? - Can functions that abort be supported? - Is there some way to reduce the impact on debugging and other features? 
- What related issues do you consider out of scope for this RFC that could be addressed in the future independently of From 18ec62d04e1ed591323aa334c020f3dfc4fa0998 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sat, 13 May 2023 10:04:53 +0200 Subject: [PATCH 60/92] add resolved questions section --- text/0000-explicit-tail-calls.md | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 41a79e4adff..f31ac25d822 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -603,6 +603,15 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 the solution that comes out of this RFC? - Supporting general tail calls, the current RFC restricts function signatures which can be loosened independently in the future. +## Resolved Questions + +- Can generic functions be supported? + - As Rust uses monomorphization, generic functions are not a problem. +- Can dynamic function calls be supported? + - Dynamic function calls are supported ([confirmation](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1191600480)). +- Can functions outside the current crate be supported, functions from dynamically loaded libraries? + - Same as dynamic function calls these function calls are supported ([confirmed for LLVM](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1191602364)). + # Future possibilities [future-possibilities]: #future-possibilities Implementation of this feature requires checks that all prerequisites to guarantee TCE are fulfilled. These checks are: + - The `become` keyword is only used in place of `return`. The intent is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. -- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. 
The intent is to assure a compatible stack frame layout. +- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to ensure a matching ABI. Note that mutability and lifetimes may differ as long as they pass borrow checking. - The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. If any of these checks fail a compiler error is issued. One additional check must be done, if the backend cannot guarantee that TCE will be performed an ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising an ICE if this is not the case. -Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backward-compatibility break. +The type of the expression `become ` is `!` (the never type, see [here](https://doc.rust-lang.org/std/primitive.never.html)). This is consistent with other control flow constructs such as `return`, which also have the type of `!`. + +Note that as `become` is a keyword reserved for exactly the use-case described in this RFC there is no backward-compatibility break. This RFC only specifies the use of `become` inside of functions and instead leaves usage outside of functions unspecified for use by other features. This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. 
See [drawbacks](#drawbacks) for further discussion. -See below for the reasoning why operators are not supported. +See below for specifics on interactions with other features. + +## Coercions of the Tail Called Function's Return Type + +All coercions that do any work (like deref coercion, unsize coercion, etc) are prohibited. +Lifetime-shortening coercions (`&'static T` -> `&'a T`) are allowed but will be checked by the borrow checker. + +Note that, while in theory, never-to-any coercions (`! -> T`) could be allowed, they are difficult to implement and require backend support. As a result they are not allowed as per this RFC. This has no effect on using macros like `panic!()` as they are not functions, affected are only functions like the following: + +```rust +fn never() -> ! { + loop {} +} + +fn tail_call_never_type() -> usize { + become never(); +} +``` + +## Closures +[closures]: #closures + +Tail calling closures _and_ tail calling _from_ closures is **not** allowed. +This is due to the high implementation effort (see below); this restriction can be lifted by a future RFC. + +Closures use the `rust-call` unstable calling convention, which would need to be adapted to guarantee TCE. +Additionally, any closure that has captures would need special handling, since the captures would currently be dropped before the tail call. + +## Variadic functions using `c_variadic` + +Tail calling [variadic functions](https://doc.rust-lang.org/beta/unstable-book/language-features/c-variadic.html) _and_ tail calling _from_ variadic functions is **not** allowed. +As support for variadic functions is stabilized on a per target level, support for tail-calls regarding variadic functions would need to follow a similar approach. To avoid this complexity and to minimize implementation effort for backends, this interaction is currently not allowed but support can be added with a future RFC. 
+ +## Generators + +Tail calling [generators](https://doc.rust-lang.org/beta/unstable-book/language-features/generators.html) is **not** allowed as it is a fundamental mismatch of functionality. Generators expect to be called multiple times `yield`ing values, however, when using a tail call control would never be returned to the calling function. + +Tail calling from generators is also **not** allowed, as the generator state is stored internally, tail calling from the generator function would require additional support to function correctly. To limit implementation effort this is not supported but can be supported by a future RFC. + +## Async + +Tail calling async functions is **not** allowed as it requires special support for the async state machine. +To minimize implementation effort this interaction is currently not allowed but can be supported by a future RFC. ## Operators are not supported + Invocations of operators were considered as valid targets but were rejected on grounds of being too error-prone. In any case, these can still be called as methods. One example of their error-prone nature ([source](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1167112296)): ```rust @@ -414,7 +479,7 @@ These alternatives mostly come down to personal taste (or bikeshedding) and the - The behavior changes in subtle ways compared to a plain `return`. To clearly indicate this change in behavior a stronger distinction from `return` than adding a mark seems warranted. - TCE as proposed in this RFC requires dropping local variables before the function call instead of after with `return`. - From a type system perspective the type of the `return` expression (`!`, the never type, see [here](https://doc.rust-lang.org/std/primitive.never.html) for an example) stays the same even when adding one of the markings. This means that type-checking can not help if the marking is forgotten or added mistakenly. 
(Note that the argument, the function call, can still be type checked, just not the `return` expression.) - + ### Require Explicit Dropping of Variables (Based on this [comment](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1532841475)) @@ -596,7 +661,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Can the restrictions on function signatures be relaxed? - One option for intra-crate direct calls is to automatically pad the arguments during compilation see [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). Does this have an influence on other calls? How much implementation effort is it? - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - - Can closures be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can functions that abort be supported? - Is there some way to reduce the impact on debugging and other features? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of @@ -611,6 +675,8 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Dynamic function calls are supported ([confirmation](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1191600480)). - Can functions outside the current crate be supported, functions from dynamically loaded libraries? - Same as dynamic function calls these function calls are supported ([confirmed for LLVM](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1191602364)). +- Can closures be supported? + - Closures are **not** supported see [here](#closures). 
# Future possibilities [future-possibilities]: #future-possibilities From 735486278282cd6a837e85717a52f2af5ea092c2 Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 15 May 2023 14:59:51 +0200 Subject: [PATCH 62/92] remove unneeded comment Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 75b2ec355e4..469ac673db2 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -228,7 +228,7 @@ fn execute_instruction_bar(mut self) { } fn next_instruction(mut self) { - let next_instruction = self.read_instr(); // this call can be inlined + let next_instruction = self.read_instr(); match next_instruction { Instruction::Foo => become self.execute_instruction_foo(), Instruction::Bar => become self.execute_instruction_bar(), From c6473326155e42026b8331d5dbd81d831636f33f Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 15 May 2023 17:00:08 +0200 Subject: [PATCH 63/92] add type comment to example Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 469ac673db2..4e7e378fc21 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -358,7 +358,7 @@ fn never() -> ! 
{ } fn tail_call_never_type() -> usize { - become never(); + become never(); //~ error: mismatched types } ``` From e7bc960d699a926a4c582f3809bcde51461dd91c Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 15 May 2023 17:24:07 +0200 Subject: [PATCH 64/92] update description of return type coercion --- text/0000-explicit-tail-calls.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 4e7e378fc21..69262e91a9a 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -350,7 +350,9 @@ See below for specifics on interations with other features. All coercions that do any work (like deref coercion, unsize coercion, etc) are prohibited. Lifetime-shortening coercions (`&'static T` -> `&'a T`) are allowed but will be checked by the borrow checker. -Note that, while in theory, never-to-any coercions (`! -> T`) could be allowed, they are difficult to implement and require backend support. As a result they are not allowed as per this RFC. This has no effect on using macros like `panic!()` as they are not functions, affected are only functions like the following: +Note that, while in theory, never-to-any coercions (`! -> T`) could be allowed, they are difficult to implement and require backend support. As a result they are not allowed as per this RFC. + +To be clear, this only concerns functions that have the never return type like the following example: ```rust fn never() -> ! 
{ From c5301f2acac89c1d33904b70b374f3b3986fe193 Mon Sep 17 00:00:00 2001 From: phi-go Date: Mon, 15 May 2023 17:31:53 +0200 Subject: [PATCH 65/92] add pointer coercions --- text/0000-explicit-tail-calls.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 69262e91a9a..54db64fdedb 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -350,9 +350,9 @@ See below for specifics on interations with other features. All coercions that do any work (like deref coercion, unsize coercion, etc) are prohibited. Lifetime-shortening coercions (`&'static T` -> `&'a T`) are allowed but will be checked by the borrow checker. -Note that, while in theory, never-to-any coercions (`! -> T`) could be allowed, they are difficult to implement and require backend support. As a result they are not allowed as per this RFC. +Reference/pointer coercions of the return type are **not** supported to minimize implementation effort. Though, coercions which don't change the pointee (`&mut T -> &T`, `*mut T -> *const T`, `&T -> *const T`, `&mut T -> *mut T`) could be added in the future. -To be clear, this only concerns functions that have the never return type like the following example: +Never-to-any coercions (`! -> T`) of the return type are **not** supported to minimize implementation effort. They are difficult to implement and require backend support. To be clear, this only concerns functions that have the never return type like the following example: ```rust fn never() -> ! 
{ From 12c495b3a86326d0b4e0ece4bf5e72a04975757a Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 16 May 2023 10:49:25 +0200 Subject: [PATCH 66/92] mismatches in mutability --- text/0000-explicit-tail-calls.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 54db64fdedb..d2af8ae5e39 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -330,7 +330,7 @@ Implementation of this feature requires checks that all prerequisites to guarant These checks are: - The `become` keyword is only used in place of `return`. The intent is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. -- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to ensure a matching ABI. Note that mutability and lifetimes may differ as long as they pass borrow checking. +- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to ensure a matching ABI. Note that lifetimes may differ as long as they pass borrow checking, see [below](#return-type-coercion) for specifics on the return type. - The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. 
If any of these checks fail a compiler error is issued. @@ -346,6 +346,7 @@ This feature will have interactions with other features that depend on stack fra See below for specifics on interations with other features. ## Coercions of the Tail Called Function's Return Type +[return-type-coercion]: #return-type-coercion All coercions that do any work (like deref coercion, unsize coercion, etc) are prohibited. Lifetime-shortening coercions (`&'static T` -> `&'a T`) are allowed but will be checked by the borrow checker. @@ -665,6 +666,7 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can functions that abort be supported? - Is there some way to reduce the impact on debugging and other features? + - Can mismatches in mutability be supported for the arguments and return type of the function signatures? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? - Supporting general tail calls, the current RFC restricts function signatures which can be loosened independently in the future. 
From 4b184876a5334f2e6f4b445e7cc590ee4cbf2137 Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 16 May 2023 11:29:05 +0200 Subject: [PATCH 67/92] update generators --- text/0000-explicit-tail-calls.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index d2af8ae5e39..399a767bdb4 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -381,9 +381,7 @@ As support for variadic function is stabilized on a per target level, support fo ## Generators -Tail calling [generators](https://doc.rust-lang.org/beta/unstable-book/language-features/generators.html) is **not** allowed as it is a fundamental mismatch of functionality. Generators expect to be called multiple times `yield`ing values, however, when using a tail call control would never be returned to the calling function. - -Tail calling from generators is also **not** allowed, as the generator state is stored internally, tail calling from the generator function would require additional support to function correctly. To limit implementation effort this is not supported but can be supported by a future RFC. +Tail calling from [generators](https://doc.rust-lang.org/beta/unstable-book/language-features/generators.html) is **not** allowed. As the generator state is stored internally, tail calling from the generator function would require additional support to function correctly. To limit implementation effort this is not supported but can be supported by a future RFC. 
## Async From b3dc340150cd6244baa91eff0220e6b66dcfee3c Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 16 May 2023 18:00:42 +0200 Subject: [PATCH 68/92] update async --- text/0000-explicit-tail-calls.md | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 399a767bdb4..3fa70229429 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -385,8 +385,11 @@ Tail calling from [generators](https://doc.rust-lang.org/beta/unstable-book/lang ## Async -Tail calling async functions is **not** allowed as it requires special support for the async state machine. -To minimize implementation effort this interaction is currently not allowed but can be supported by a future RFC. +Tail calling _from_ async functions is **not** allowed, neither calling async nor calling sync functions is supported. This is due to the high implementation effort as it requires special handling for the async state machine. This restriction can be relaxed by a future RFC. + +Using `become` on a `.await` call, such as `become f().await`, is also **not** allowed. This is because when using `.await`, the `Future` returned by `f()` is not "called" but run by the executor, thus, tail calls do not apply here. + +Note that tail calling async functions from sync code is possible but the return type for async functions is `std::future::Future`, which is unlikely to be interesting. 
## Operators are not supported From 3010c2e68c568f6c8741814c20034a5a38086c0e Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 17 May 2023 10:20:24 +0200 Subject: [PATCH 69/92] add future possibilities to relax requirements --- text/0000-explicit-tail-calls.md | 24 +++++++++++++++++++++--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 3fa70229429..3cd1e757593 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -345,6 +345,10 @@ This feature will have interactions with other features that depend on stack fra See below for specifics on interations with other features. +## Mismatches in Mutability + +Mismatches in mutability (like `&T` <-> `&mut T`) for arguments and return type of the function signatures are **not** supported. This support requires a guarantee that mutability has no effect on ABI. + ## Coercions of the Tail Called Function's Return Type [return-type-coercion]: #return-type-coercion @@ -447,6 +451,8 @@ One alternative would be to support some kind of local goto natively, indeed the [pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design, especially regarding indirect calls and external functions. ### Attribute on Function Declaration +[attribute-on-function-declaration]: #attribute-on-function-declaration + One alternative is to mark a group of functions that should be mutually tail-callable [example](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1161525527) with some follow-up [discussion](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1185828948). 
The goal behind this design is to allow TCE of functions that do not have exactly matching function signatures, in @@ -662,12 +668,9 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Should a lint be added for functions that are marked to be a tail call or use become. See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - - Can the restrictions on function signatures be relaxed? - - One option for intra-crate direct calls is to automatically pad the arguments during compilation see [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309). Does this have an influence on other calls? How much implementation effort is it? - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can functions that abort be supported? - Is there some way to reduce the impact on debugging and other features? - - Can mismatches in mutability be supported for the arguments and return type of the function signatures? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? - Supporting general tail calls, the current RFC restricts function signatures which can be loosened independently in the future. 
@@ -702,9 +705,13 @@ Note that having something written down in the future-possibilities section is not a reason to accept the current or a future RFC; such notes should be in the section on motivation or rationale in this or subsequent RFCs. The section merely provides additional information. --> + ## Helpers +[helpers]: #helpers + It seems possible to keep the restriction on exactly matching function signatures by offering some kind of placeholder arguments to pad out the differences. For example: + ```rust foo(a: u32, b: u32) { // uses `a` and `b` @@ -714,7 +721,9 @@ bar(a: u32, _b: u32) { // only uses `a` } ``` + Maybe it is useful to provide a macro or attribute that inserts missing arguments. + ```rust #[pad_args(foo)] bar(a: u32) { @@ -722,7 +731,16 @@ bar(a: u32) { } ``` +## Relaxing the Requirement of Strictly Matching Function Signatures for Static Calls + +It should be possible to automatically pad the arguments of static tail calls, similar to the [helpers section](#helpers) above. See this [comment](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309) for details. Note that this approach does not relax requirements for dynamic calls. + +## Relaxing the Requirement of Strictly Matching Function Signatures using + +In the future a calling convention could be added to allow `become` to be used with functions that have a mismatched function signatures. This approach is close to the alternative of [adding a marker to the function declaration](#attribute-on-function-declaration). Same as the alternative, a requirement needs to be added that backends provide a calling convention that support tail calling. + ## Functional Programming + This might be wishful thinking but if TCE is supported there could be further language extensions to make Rust more attractive for functional programming paradigms. Though it is unclear to me how far this should be taken or what changes exactly would be a benefit. 
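Note on the patch above (the "Helpers" hunk): the padding idea can be tried out on stable Rust today. The following is only a minimal sketch under stated assumptions: `become` is not available on stable Rust, so ordinary calls stand in for the tail calls, and the names `Step`, `count_down`, and `finish` are hypothetical and do not come from the RFC. It shows that adding an unused argument is enough to give two functions the same signature, and therefore the same function-pointer type.

```rust
// Hypothetical sketch: ordinary calls stand in for `become`, which is not
// available on stable Rust. All names here are made up for illustration.

/// Shared signature that both functions are padded to.
type Step = fn(u32, u32) -> u32;

/// Uses both arguments: counts `n` down while adding it to `acc`.
fn count_down(n: u32, acc: u32) -> u32 {
    if n == 0 {
        // Under the RFC this could be written as `become finish(n, acc)`.
        return finish(n, acc);
    }
    // Under the RFC this could be written as `become count_down(n - 1, acc + n)`.
    count_down(n - 1, acc + n)
}

/// Only needs `acc`; `_n` is padding so the signature matches `count_down`.
fn finish(_n: u32, acc: u32) -> u32 {
    acc
}

fn main() {
    // Both functions coerce to the same function-pointer type.
    let steps: [Step; 2] = [count_down, finish];
    println!("{}", steps[0](4, 0));
    println!("{}", steps[1](7, 3));
}
```

Because both functions fit `Step`, a dispatcher could also select the next function dynamically, which is the kind of call the matching-signature restriction is meant to keep possible.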
From 60a290bf64275c5d7e354571c3e4bed9d0207f61 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 17 May 2023 10:22:54 +0200 Subject: [PATCH 70/92] remove async as open question --- text/0000-explicit-tail-calls.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 3cd1e757593..2338dedea56 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -388,6 +388,7 @@ As support for variadic function is stabilized on a per target level, support fo Tail calling from [generators](https://doc.rust-lang.org/beta/unstable-book/language-features/generators.html) is **not** allowed. As the generator state is stored internally, tail calling from the generator function would require additional support to function correctly. To limit implementation effort this is not supported but can be supported by a future RFC. ## Async +[async]: #async Tail calling _from_ async functions is **not** allowed, neither calling async nor calling sync functions is supported. This is due to the high implementation effort as it requires special handling for the async state machine. This restriction can be relaxed by a future RFC. @@ -668,7 +669,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Should a lint be added for functions that are marked to be a tail call or use become. See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? 
- - Can async functions be supported? (see [here](https://github.com/rust-lang/rfcs/pull/1888#issuecomment-1186604115) for an initial assessment) - Can functions that abort be supported? - Is there some way to reduce the impact on debugging and other features? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of @@ -685,6 +685,8 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Same as dynamic function calls these function calls are supported ([confirmed for LLVM](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1191602364)). - Can closures be supported? - Closures are **not** supported see [here](#closures). +- Can async functions be supported? + - Async functions are **not** supported see [here](#async). # Future possibilities [future-possibilities]: #future-possibilities From 72f8e4aee831411ed467151635a9552a9d9bf845 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 17 May 2023 10:23:56 +0200 Subject: [PATCH 71/92] remove open question on functions that abort --- text/0000-explicit-tail-calls.md | 1 - 1 file changed, 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 2338dedea56..3058e88c7d5 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -669,7 +669,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Should a lint be added for functions that are marked to be a tail call or use become. See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? 
- Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - - Can functions that abort be supported? - Is there some way to reduce the impact on debugging and other features? - What related issues do you consider out of scope for this RFC that could be addressed in the future independently of the solution that comes out of this RFC? From 980ebeb881088337060363ff621ec2d886965681 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 17 May 2023 10:30:57 +0200 Subject: [PATCH 72/92] fix typos --- text/0000-explicit-tail-calls.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 3058e88c7d5..16bee5eabe1 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -465,7 +465,7 @@ While quite noisy it is also less flexible than the chosen approach. Indeed, TCE function definition, sometimes a call should be guaranteed to be TCE, and sometimes not, marking a function would be less flexible. -### Adding a mark to `return`. +### Adding a mark to `return` The return keyword could be marked using an attribute or an extra keyword as in the example below. @@ -736,7 +736,7 @@ bar(a: u32) { It should be possible to automatically pad the arguments of static tail calls, similar to the [helpers section](#helpers) above. See this [comment](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500620309) for details. Note that this approach does not relax requirements for dynamic calls. -## Relaxing the Requirement of Strictly Matching Function Signatures using +## Relaxing the Requirement of Strictly Matching Function Signatures with a new Calling Convention In the future a calling convention could be added to allow `become` to be used with functions that have a mismatched function signatures. 
This approach is close to the alternative of [adding a marker to the function declaration](#attribute-on-function-declaration). Same as the alternative, a requirement needs to be added that backends provide a calling convention that support tail calling. From 51ecc12222b02663da5a581875be8e8b87025959 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 17 May 2023 11:29:13 +0200 Subject: [PATCH 73/92] update zig example --- text/0000-explicit-tail-calls.md | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 16bee5eabe1..c6692a6a61e 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -613,20 +613,22 @@ GCC does not support a feature equivalent to Clang's `musttail`, there also does ## Zig Zig provides separate syntax to allow more flexibility than normal function calls. There are options for async calls, inlining, compile-time evaluation of the called function, or specifying TCE on the call. 
([source](https://ziglang.org/documentation/master/#call)) -```zig -const expect = @import("std").testing.expect; -test "noinline function call" { - try expect(@call(.auto, add, .{3, 9}) == 12); -} +The following is an example taken from here (https://zig.godbolt.org/z/v13vrjxG4, a toy lexer using tail calls in Zig): -fn add(a: i32, b: i32) i32 { - return a + b; +```zig +export fn lex(data: *Data) callconv(.C) u32 +{ + if(data.cursor >= data.input.len) + return data.tokens; + switch(data.input[data.cursor]) { + 'a' => return @call(.always_tail, lex_a, .{data}), + 'b' => return @call(.always_tail, lex_b, .{data}), + else => return @call(.always_tail, lex_err, .{data}), + } } ``` - - ## Carbon As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCE is of interest even if the implementation is difficult From dfcf7e84471c536fa75cbd4f2d8c4816e81dd503 Mon Sep 17 00:00:00 2001 From: phi-go Date: Sun, 21 May 2023 12:44:03 +0200 Subject: [PATCH 74/92] extend rationale and alternatives section --- text/0000-explicit-tail-calls.md | 41 +++++++++++++++++++++++++------- 1 file changed, 33 insertions(+), 8 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index c6692a6a61e..cb76c216e95 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -428,17 +428,40 @@ There is also an unwanted interaction between TCE and debugging. As TCE by desig # Rationale and alternatives + [rationale-and-alternatives]: #rationale-and-alternatives +In this section, the reason for choosing the design is discussed as well as possible alternatives that have been considered. + ## Why is this design the best in the space of possible designs? -This design is the best tradeoff between implementation effort and functionality while also offering a good starting -point toward further exploration of a more general implementation. 
To expand on this, compared to other options -creating a function local scope with the use of `become` greatly reduces implementation effort. Additionally, limiting -tail-callable functions to those with exactly matching function signatures and calling conventions enforces a common -stack layout across all functions. This should in theory, depending on the backend, allow tail calls to be performed -without any stack shuffling, indeed it is even possible to do so for indirect calls or external functions. + +Of all possible alternatives, this design best fits the tradeoff between implementation effort and functionality while also offering a starting point toward further exploration of a more general implementation. Regarding implementation effort, this design requires the least of backends while not already implementable via a library, see [here](#backend-requirements). Regarding functionality, the proposed design requires function signatures to match, however, this restriction still allows tail calls between functions that _use_ different arguments. This can be done by requiring the programmer to add (unused) arguments to both function definitions so that they match. Additionally, the chosen design allows tail calls for dynamic calls and other variations. + +### Creating a Function Local Scope + +Using the `become` keyword creates a function local scope that drops all variables not used in the tail call, as would be done for `return`. There is no alternative to this approach as the stack frame needs to be prepared for the tail call. + +### Backend Requirements + +The main hurdle to implementing this feature is the required work to be done by the backends (e.g. LLVM). See the following list for an overview of approaches that have been considered for this RFC, going by increasing demand on the backends: + +1. 
**Internal Transformation** - Use a MIR transformation to implement tail calls without any backend requirements, possible implementations are: Defunctionalization and [Trampolines](#trampoline-based-approach). However, all proposed options can only support static function calls. This is one reason this option is not chosen, as dynamic function calls seem too important to ignore, see [here](#what-should-be-tail-callable). Another reason is that this approach can already be done by libraries. +2. **Matching Function Signatures** (this RFC) - Require that the caller and callee have matching function signatures. With this restriction, it is possible to do tail calls regardless of the calling convention used, see the next point for why this is important. Though the calling convention needs to match between caller and callee. All that needs to be done by the backends is to overwrite the arguments in place with the values for the tail call. By requiring a matching calling convention and function signature between caller and callee the ABI is guaranteed to match as well. This is also quite similar to how `musttail` is used in practice for Clang: "Implementor experience with Clang shows that the ABI of the caller and callee must be identical for the feature to work [...]" (see [here](#clang)). +3. **Mark Function Definition** - One hurdle to guaranteeing tail calls is that the default calling conventions used by backends usually do not support tail calls and instead another calling convention needs to be used. As it is unreasonable to expect changing the default calling convention, one [option](#attribute-on-function-declaration) is to mark functions that should use a calling convention amenable to tail calls. This requires that backends can support tail calls when allowed to change the calling convention. This requirement, however, already seems quite difficult to establish. 
For example, this [thread](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1186003262) discusses why this approach is not reasonable. +4. **Backend Specific** - Depend on the backend to decide if a tail call can be performed. While this approach allows gradual advancements it also seems the most unstable and difficult to use. + +As described by the list of approaches above, this RFC specifies the approach that is most attainable and still useful in practice while not already implementable via a library. + +### What should be Tail Callable + +Tail calls can be implemented without backend support if only static calls are supported, see the following list for reasons why other calls should be supported: + +- **Dynamic Calls** One example that depends on dynamic tail calls is a C implementation of a Protobuf parser, see [here](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1500291721). +- **Calls across Crates** This will allow tail calls to library functions, enabling libraries to support code that requires constant stack usage or make calls more performant. +- **Calls to Dynamically Loaded Functions** As an example this would be useful to improve performance for an emulator that uses a JIT. ## What other designs have been considered and what is the rationale for not choosing them? + There are some designs that either can not achieve the same performance or functionality as the chosen approach. Though most other designs evolve around how to mark what should be a tail-call or marking what functions can be tail called. There is also the possibility of providing support for a custom backend (e.g. LLVM) or MIR pass. ### Trampoline based Approach @@ -447,6 +470,8 @@ There could be a trampoline-based approach of using constant stack space, though they can not be used to achieve the performance that the chosen design is capable of. 
+Similarly, as mentioned [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1190464739), an approach used by Chicken Scheme is to do normal calls and handle stack overflows by cleaning up the stack. + ### Principled Local Goto One alternative would be to support some kind of local goto natively, indeed there exists a [pre-RFC](https://internals.rust-lang.org/t/pre-rfc-safe-goto-with-value/14470/9?u=scottmcm) ([comment](https://github.com/rust-lang/rfcs/issues/2691#issuecomment-1458604986)). This design should be able to achieve the same performance and stack usage, though it seems to be quite difficult to implement and does not seem to be as flexible as the chosen design, especially regarding indirect calls and external functions. @@ -630,7 +655,7 @@ export fn lex(data: *Data) callconv(.C) u32 ``` ## Carbon -As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCE is of interest even if the implementation is difficult +As per this [issue](https://github.com/carbon-language/carbon-lang/issues/1761) it seems providing TCE is of interest even if the implementation is difficult. ## .Net @@ -740,7 +765,7 @@ It should be possible to automatically pad the arguments of static tail calls, s ## Relaxing the Requirement of Strictly Matching Function Signatures with a new Calling Convention -In the future a calling convention could be added to allow `become` to be used with functions that have a mismatched function signatures. This approach is close to the alternative of [adding a marker to the function declaration](#attribute-on-function-declaration). Same as the alternative, a requirement needs to be added that backends provide a calling convention that support tail calling. +In the future, a calling convention could be added to allow `become` to be used with functions that have mismatched function signatures. 
This approach is close to the alternative of [adding a marker to the function declaration](#attribute-on-function-declaration). Same as the alternative, a requirement needs to be added that backends provide a calling convention that support tail calling. ## Functional Programming From f7652a93f39b7dd9c09c1709849953923575cde7 Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 23 May 2023 12:57:21 +0200 Subject: [PATCH 75/92] update async description Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index cb76c216e95..77dce5ba450 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -392,9 +392,9 @@ Tail calling from [generators](https://doc.rust-lang.org/beta/unstable-book/lang Tail calling _from_ async functions is **not** allowed, neither calling async nor calling sync functions is supported. This is due to the high implementation effort as it requires special handling for the async state machine. This restriction can be relaxed by a future RFC. -Using `become` on a `.await` call, such as `become f().await`, is also **not** allowed. This is because when using `.await`, the `Future` returned by `f()` is not "called" but run by the executor, thus, tail calls do not apply here. +Using `become` on a `.await` expression, such as `become f().await`, is also **not** allowed. This is because `become` requires a function call and `.await` is not a function call, but is a special construct. -Note that tail calling async functions from sync code is possible but the return type for async functions is `std::future::Future`, which is unlikely to be interesting. +Note that tail calling async functions from sync code is possible but the return type for async functions is `impl Future`, which is unlikely to be interesting. 
## Operators are not supported From 0533221987fae17020ab3036aae8b3520b08324e Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 23 May 2023 12:58:12 +0200 Subject: [PATCH 76/92] update async description Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 77dce5ba450..584f0705690 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -390,7 +390,7 @@ Tail calling from [generators](https://doc.rust-lang.org/beta/unstable-book/lang ## Async [async]: #async -Tail calling _from_ async functions is **not** allowed, neither calling async nor calling sync functions is supported. This is due to the high implementation effort as it requires special handling for the async state machine. This restriction can be relaxed by a future RFC. +Tail calling _from_ async functions or async blocks is **not** allowed. This is due to the high implementation effort as it requires special handling for the async state machine. This restriction can be relaxed by a future RFC. Using `become` on a `.await` expression, such as `become f().await`, is also **not** allowed. This is because `become` requires a function call and `.await` is not a function call, but is a special construct. 
From 8706d7bfc79364261e49e14c12af09159a91f3e1 Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 23 May 2023 12:58:51 +0200 Subject: [PATCH 77/92] fix typo Co-authored-by: Waffle Maybe --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 584f0705690..7d2ca9de4db 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -410,7 +410,7 @@ pub fn fibonacci(n: u64) -> u64 { ``` In this case, a naive author might assume that this is going to be a stack space-efficient implementation since it uses tail recursion instead of normal recursion. However, the outcome is more or less the same since the critical recursive calls are not actually in tail call position. -Further confusion could result from the same-signature restriction where the Rust compiler raises an error since fibonacci and ::add do not share a common signature. +Further confusion could result from the same-signature restriction where the Rust compiler raises an error since fibonacci and `::add` do not share a common signature. # Drawbacks [drawbacks]: #drawbacks From 0b61f079d8eb0a8caf216d1937a770d5fafb188e Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 23 May 2023 13:13:32 +0200 Subject: [PATCH 78/92] update reference-level explanation --- text/0000-explicit-tail-calls.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 7d2ca9de4db..0016998dfd7 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -331,7 +331,8 @@ These checks are: - The `become` keyword is only used in place of `return`. The intent is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. 
- The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to ensure a matching ABI. Note that lifetimes may differ as long as they pass borrow checking, see [below](#return-type-coercion) for specifics on the return type. -- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. +- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. +- The restrictions, caused by interactions with other features, are followed. See below for details, the restrictions mostly concern caller context and callee signatures. If any of these checks fail a compiler error is issued. 
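Note on the patch above (the stack-frame check): the rule about borrows of local variables can be illustrated with a small sketch. It is written for stable Rust, so plain calls stand in for `become`, and the function names `double`, `bump_then_double`, and `forward` are hypothetical. The comments describe how the checks listed in the hunk would presumably apply if the calls were written with `become`; this is a reading of the RFC text, not output from an actual implementation.

```rust
// Hypothetical sketch on stable Rust: plain calls stand in for `become`.

/// Doubles the value behind the reference.
fn double(x: &u32) -> u32 {
    *x * 2
}

/// Same signature as `double`, but it passes a borrow of a local variable.
fn bump_then_double(x: &u32) -> u32 {
    let local = *x + 1;
    // Fine as a normal call: `local` lives until the call returns.
    // Written as `become double(&local)` this would presumably be rejected,
    // because the caller's frame (and `local` with it) is given up before
    // `double` runs.
    double(&local)
}

/// Same signature as `double`, forwarding only the caller's own argument.
fn forward(x: &u32) -> u32 {
    // No local is borrowed, so this call shape fits the checks in the hunk.
    double(x)
}

fn main() {
    println!("{}", bump_then_double(&1));
    println!("{}", forward(&5));
}
```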
From b7c69e4a21a5d9552a744fb02bcbab16d455b351 Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 23 May 2023 13:18:34 +0200 Subject: [PATCH 79/92] move mutability mismatch to future --- text/0000-explicit-tail-calls.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 0016998dfd7..7ce67124a31 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -346,10 +346,6 @@ This feature will have interactions with other features that depend on stack fra See below for specifics on interations with other features. -## Mismatches in Mutability - -Mismatches in mutability (like `&T` <-> `&mut T`) for arguments and return type of the function signatures are **not** supported. This support requires a guarantee that mutability has no effect on ABI. - ## Coercions of the Tail Called Function's Return Type [return-type-coercion]: #return-type-coercion @@ -768,6 +764,10 @@ It should be possible to automatically pad the arguments of static tail calls, s In the future, a calling convention could be added to allow `become` to be used with functions that have mismatched function signatures. This approach is close to the alternative of [adding a marker to the function declaration](#attribute-on-function-declaration). Same as the alternative, a requirement needs to be added that backends provide a calling convention that support tail calling. +## Mismatches in Mutability + +Mismatches in mutability (like `&T` <-> `&mut T`) for arguments and return type of the function signatures are currently not supported as they are different types. However, this mismatch could be supported if there is a guarantee that mutability has no effect on ABI. For more details, see [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1193897615). 
+ ## Functional Programming This might be wishful thinking but if TCE is supported there could be further language extensions to make Rust From 23106d2e678363ee4d4d6c6f16e535539dcd3e65 Mon Sep 17 00:00:00 2001 From: phi-go Date: Tue, 23 May 2023 14:49:03 +0200 Subject: [PATCH 80/92] add fn tokens --- text/0000-explicit-tail-calls.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 7ce67124a31..49440968b77 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -738,11 +738,11 @@ It seems possible to keep the restriction on exactly matching function signature arguments to pad out the differences. For example: ```rust -foo(a: u32, b: u32) { +fn foo(a: u32, b: u32) { // uses `a` and `b` } -bar(a: u32, _b: u32) { +fn bar(a: u32, _b: u32) { // only uses `a` } ``` @@ -751,7 +751,7 @@ Maybe it is useful to provide a macro or attribute that inserts missing argument ```rust #[pad_args(foo)] -bar(a: u32) { +fn bar(a: u32) { // ... } ``` From 2946d6346048cd9737f4a7d73fd5bada0589ea44 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 24 May 2023 11:05:46 +0200 Subject: [PATCH 81/92] performance guarantee --- text/0000-explicit-tail-calls.md | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 49440968b77..6d2293b9f9a 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -331,12 +331,12 @@ These checks are: - The `become` keyword is only used in place of `return`. The intent is to reuse the semantics of a `return` signifying "the end of a function". See the section on [tail-call-elimination](#tail-call-elimination) for examples. - The argument to `become` is a function (or method) call, that exactly matches the function signature and calling convention of the callee. The intent is to ensure a matching ABI. 
Note that lifetimes may differ as long as they pass borrow checking, see [below](#return-type-coercion) for specifics on the return type. -- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. +- The stack frame of the calling function is reused, this also implies that the function is never returned to. The required checks to ensure this is possible are: no borrows of local variables are passed to the called function (passing local variables by copy/move is ok since that doesn't require the local variable to continue existing after the call), and no further cleanup is necessary. These checks can be done by using the borrow checker as already described in the [section](#difference) showing the difference between `return` and `become` above. - The restrictions, caused by interactions with other features, are followed. See below for details, the restrictions mostly concern caller context and callee signatures. -If any of these checks fail a compiler error is issued. +If any of these checks fail a compiler error is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising an ICE if this is not the case. -One additional check must be done, if the backend cannot guarantee that TCE will be performed an ICE is issued. It is also suggested to ensure that the invariants provided by the pre-requisites are maintained during compilation, raising an ICE if this is not the case. 
+One additional check must be done: if the backend cannot guarantee that TCE will be performed, an ICE is issued. To be specific, the backend is required to guarantee that: "A tail call will not cause unbounded stack growth if it is part of a recursive cycle in the call graph". The type of the expression `become ` is `!` (the never type, see [here](https://doc.rust-lang.org/std/primitive.never.html)). This is consistent with other control flow constructs such as `return`, which also have the type of `!`. @@ -344,7 +344,7 @@ Note that as `become` is a keyword reserved for exactly the use-case described i This feature will have interactions with other features that depend on stack frames, for example, debugging and backtraces. See [drawbacks](#drawbacks) for further discussion. -See below for specifics on interations with other features. +See below for specifics on interactions with other features. ## Coercions of the Tail Called Function's Return Type [return-type-coercion]: #return-type-coercion @@ -688,7 +688,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What parts of the design do you expect to resolve through the RFC process before this gets merged? - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. Currently, the RFC specifies that an ICE should be issued if a backend cannot guarantee that TCE will be performed. - - Another point that needs to be decided is if TCE is supported by a backend what exactly should be guaranteed? While the guarantee that there is no stack growth should be necessary, should performance (as in transforming `call` instructions into `jmp`) also be guaranteed? Note that a backend that guarantees performance should do so **always** otherwise the main intent of this RFC seems to be lost. 
- Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. However, this lint might be confusing and noisy. Decide on if this lint or others should be added. - Should a lint be added for functions that are marked to be a tail call or use become. See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? @@ -710,6 +709,8 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - Closures are **not** supported see [here](#closures). - Can async functions be supported? - Async functions are **not** supported see [here](#async). +- Should "performance" be guaranteed by the backends? + - "Performance" is **not** guaranteed by the backends, see [here](#performance-guarantee). # Future possibilities [future-possibilities]: #future-possibilities @@ -768,6 +769,14 @@ In the future, a calling convention could be added to allow `become` to be used Mismatches in mutability (like `&T` <-> `&mut T`) for arguments and return type of the function signatures are currently not supported as they are different types. However, this mismatch could be supported if there is a guarantee that mutability has no effect on ABI. For more details, see [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1193897615). +## Performance Guarantee + +First of all, performance is ambiguous. As the stand in we use the requirement that no new stack frame is created for a tail call. 
The reason for this choice is that creating a new stack frame can be a large part of computation time in hot loops that do calls; this is a code construct that can likely be optimized with tail calls. + +Can the requirement to not create new stack frames when using tail calls be imposed on backends? The answer seems to be no, even for LLVM. LLVM provides some [guarantees](https://llvm.org/docs/LangRef.html#call-instruction) for tail calls; however, none of them ensures that no new stack frame is created (as of 24-05-2023). + +If it turns out that, in practice, the no-new-stack-frame requirement is not already met, it might be worthwhile to revisit this performance requirement. + ## Functional Programming This might be wishful thinking but if TCE is supported there could be further language extensions to make Rust From cdac5c208443fa34c8a481f5e8a23ed8b3cebc32 Mon Sep 17 00:00:00 2001 From: phi-go Date: Wed, 24 May 2023 12:11:55 +0200 Subject: [PATCH 82/92] lint unresolved question to future possibilites --- text/0000-explicit-tail-calls.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 6d2293b9f9a..66f89049b9e 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -688,8 +688,6 @@ https://github.com/carbon-language/carbon-lang/issues/1761#issuecomment-11986720 - What parts of the design do you expect to resolve through the RFC process before this gets merged? - One point that needs to be decided is if TCE should be a feature that needs to be required from all backends or if it can be optional. Currently, the RFC specifies that an ICE should be issued if a backend cannot guarantee that TCE will be performed. - - Migration guidance, it might be interesting to provide a lint that indicates that a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. 
However, this lint might be confusing and noisy. Decide on if this lint or others should be added. - - Should a lint be added for functions that are marked to be a tail call or use become. See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824), as well as, the clippy and rustfmt changes of an initial [implementation](https://github.com/semtexzv/rust/commit/29f430976542011d53e149650f8e6c7221545207#diff-6c8f5168858fed7066e1b6c8badaca8b4a033d0204007b3e3025bf7dd33fffcb) (2022). - What parts of the design do you expect to resolve through the implementation of this feature before stabilization? - Are all calling-convention used by Rust available for TCE with the proposed restrictions on function signatures? - Is there some way to reduce the impact on debugging and other features? @@ -732,6 +730,13 @@ is not a reason to accept the current or a future RFC; such notes should be in the section on motivation or rationale in this or subsequent RFCs. The section merely provides additional information. --> +## Lints + +The functionality introduced by this RFC also has possible pitfalls, so it is likely worthwhile to provide lints that warn of these issues. See the discussion [here](https://github.com/rust-lang/rfcs/pull/3407#discussion_r1159822824) for possible lints. + +Additionally, there can be another class of lints, those that guide migration to using `become`. +For example, provide a lint that indicates if a trivial transformation from `return` to `become` can be done for function calls where all requisites are already fulfilled. Note that this lint might be confusing and noisy. 
+ ## Helpers [helpers]: #helpers From c7e4ea7695d261a8a196cad10f725d135588130c Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 25 May 2023 10:46:26 +0200 Subject: [PATCH 83/92] expand motivation section --- text/0000-explicit-tail-calls.md | 54 +++++++++++++++++++++++++++----- 1 file changed, 46 insertions(+), 8 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 66f89049b9e..a9a5b153358 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -13,8 +13,8 @@ If this guarantee can not be provided by the compiler a compile time error is ge [motivation]: #motivation Tail call elimination (TCE) allows stack frames to be reused. While TCE via tail call optimization (TCO) is already supported by Rust, as is normal for optimizations, TCO will only be applied if the compiler expects an improvement by doing so. -However, the compiler can't have ideal analysis and thus will not always be correct in judging if a optimization should be applied. -This RFC, shows an approach how TCE can be guaranteed. +However, the compiler can't have ideal analysis and thus will not always be correct in judging if an optimization should be applied. +This RFC shows how TCE can be guaranteed in Rust. The guarantee for TCE is interesting for two general goals. One goal is to do function calls without growing the stack, this mainly has semantic implications as recursive algorithms can overflow the stack without this guarantee. @@ -27,12 +27,50 @@ programming style. For the second goal, TCO can have the intended effect, however, there is no guarantee. This can result in unexpected slow-downs, for example, as can be seen in this [issue](https://github.com/rust-lang/rust/issues/102952). 
Some specific use cases that are supported by this feature are new ways to encode state machines and jump tables, -allowing code to be written in a continuation-passing style, using recursive algorithms without the danger of -overflowing the stack or guaranteeing significantly faster interpreters/emulators. One common example of the -usefulness of tail calls in C is improving the performance of Protobuf parsing as described in this -[blog post](https://blog.reverberate.org/2021/04/21/musttail-efficient-interpreters.html), -this approach would then also be possible in Rust. - +code written in a continuation-passing style, ensuring recursive algorithms do not +overflow the stack, and guaranteeing good code generation for interpreters. For a language like Rust that considers performance-oriented uses to be in scope, it is important to support these kinds of programs. + +## Examples from the C/C++ ecosystem + +(This section is based on this [comment](https://github.com/rust-lang/rfcs/pull/3407#issuecomment-1562094439), all credit goes to @traviscross.) + +The C/C++ ecosystem already has access to guaranteed TCE via Clang's [`musttail`](https://clang.llvm.org/docs/AttributeReference.html#musttail) attribute and GCC/Clang's [computed goto](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html). Based on the assumption that code which uses `musttail` or computed gotos would also use `become` in Rust, we can gauge the impact of this feature by collecting a list of example programs that would not be replicable in Rust without this RFC. + +The list of programs is generated as follows: +- GitHub was searched for [uses of `musttail`](https://github.com/search?q=%2Fclang%3A%3Amusttail%7C__attribute__%5C%28%5C%28musttail%5C%29%5C%29%2F&type=code) and [uses of computed goto](https://github.com/search?q=%2Fgoto+%5C*%5Ba-zA-Z%28%5D%2F&type=code). GitHub's search only returns five pages, so this is only a sampling. 
+- The most popular projects are picked and each result is checked to confirm that `musttail` or computed gotos are used. +- Additionally, for `musttail`, which was only introduced in Clang 13, projects that have comments which indicate the desire to use `musttail` once legacy compiler support can be dropped are included as well. (Of which, there are two: FreeRADIUS and Pyston). + +The resulting list of notable projects using [`musttail`](https://clang.llvm.org/docs/AttributeReference.html#musttail): + +- [Protobuf](https://github.com/protocolbuffers/protobuf/blob/755f572a6b68518bde2773d215026659fa1a69a5/src/google/protobuf/port_def.inc#L337) +- [Julia](https://github.com/JuliaLang/julia/blob/aea56a9d9547cff43c3bcfb3dac0fff91bd53793/src/llvm-multiversioning.cpp#L696) +- [Swift](https://github.com/apple/swift/blob/670f5d24577d2196730f08762f2e70be10363cf3/stdlib/public/SwiftShims/swift/shims/Visibility.h#L112) +- [Zig](https://github.com/ziglang/zig/blob/5744ceedb8ea4b3e5906175033f634b17287f3ca/lib/zig.h#L110) +- [GHC](https://github.com/ghc/ghc/blob/994bda563604461ffb8454d6e298b0310520bcc8/rts/include/Stg.h#L372) +- [Firefly](https://github.com/GetFirefly/firefly/blob/8e89bc7ec33cb8ffa9a60283c8dcb7ff62ead5fa/compiler/driver/src/compiler/passes/ssa_to_mlir/builder/function.rs#L1388) (a BEAM/Erlang implementation) +- [MLton](https://github.com/MLton/mlton/blob/d082c4a36110321b00dc099858bb640c4d2d2c24/mlton/codegen/llvm-codegen/llvm-codegen.fun#L1405) (a Standard ML compiler) +- [FreeRADIUS](https://github.com/FreeRADIUS/freeradius-server/blob/fb281257fb86aa83547d5dacecebc12271d091ab/src/lib/util/lst.c#L560) (a RADIUS implementation) +- [Skia](https://github.com/google/skia/blob/bac819cdc94a0a9fc4b3954f2ea5eec4150be103/src/opts/SkRasterPipeline_opts.h#L1205) (a graphics library from Google) +- Example [BPF code](https://blog.cloudflare.com/assembly-within-bpf-tail-calls-on-x86-and-arm/) from a Cloudflare blog post +- 
[RSM](https://github.com/rsms/rsm/blob/d539fd5f09876700c0c38758f2b4354df433dd1c/src/rsmimpl.h#L115) (a virtual computer in the form of a virtual machine) +- [Tails](https://github.com/snej/tails/blob/d3b14fcce18c542211bc1fd37e378f667fdee42f/src/core/platform.hh#L52) (a Forth-like interpreter) +- [Jasmin](https://github.com/asoffer/jasmin/blob/f035ef0752c09846331c8deb2109e4ebfce83200/jasmin/internal/attributes.h#L13) (a stack-based byte-code interpreter) +- [CHERIoT RTOS](https://github.com/microsoft/cheriot-rtos/blob/3e6811279fedd0195e105eb3b7ac77db93d67ec5/sdk/core/allocator/alloc.h#L1460) (a realtime operating system with memory safety) +- [Pyston](https://github.com/pyston/pyston/blob/6103fc013e9dd726efca9100a22be1ac08c58591/pyston/aot/aot_gen.py#L276) (a performance-optimizing JIT for Python) +- [upb](https://github.com/classicvalues/upb/blob/2effcce774ce05d08af635ba02b1733873e73757/upb/port_def.inc#L177) (a small protobuf implementation in C) +- Robbepop's [wasm3](https://github.com/Robbepop/wasm3/blob/1a6ca56ee1250d95363424cc3a60f8fd14f24fa7/source/m3_config_platforms.h#L86) ("the fastest WebASsembly interpreter") + + +The resulting list of notable projects using [computed goto](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html): + +- The [Linux](https://github.com/torvalds/linux/blob/933174ae28ba72ab8de5b35cb7c98fc211235096/kernel/bpf/core.c#L1678) kernel +- [PostgreSQL](https://github.com/postgres/postgres/blob/5c2c59ba0b5f723b067a6fa8bf8452d41fbb2125/src/backend/executor/execExprInterp.c#L119) +- [CPython](https://github.com/python/cpython/blob/41768a2bd3a8f57e6ce4e4ae9cab083b69817ec1/Python/ceval_macros.h#L76) +- [MicroPython](https://github.com/ksekimoto/micropython/blob/cd36298b9a8aec0872b439e6b302565f631c594d/py/vm.c#L219) (a lean Python implementation) +- [Godot](https://github.com/godotengine/godot/blob/4c677c88e918e22ad696f225d189124444f9665e/modules/gdscript/gdscript_vm.cpp#L392) (a 2D/3D game engine) +- 
[Ruby](https://github.com/ruby/ruby/blob/31b28b31fa5a0452cb9d5f7eee88eebfebe5b4d1/regexec.c#L2171) (they use it in their [regex](https://github.com/ruby/ruby/blob/31b28b31fa5a0452cb9d5f7eee88eebfebe5b4d1/regexec.c#L2171) engine as well as in their [interpreter](https://github.com/ruby/ruby/blob/31b28b31fa5a0452cb9d5f7eee88eebfebe5b4d1/vm_exec.h#L98)) +- [HHVM](https://github.com/facebook/hhvm/blob/7b0dc442a81861ee65a2fc09afe51adf89faea70/hphp/runtime/vm/bytecode.cpp#L5690) (a PHP implementation from Facebook) # Guide-level explanation [guide-level-explanation]: #guide-level-explanation From b4b3db41459beff7c2db3733d1d801e98b6bac6f Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 25 May 2023 10:54:10 +0200 Subject: [PATCH 84/92] update wasm3 description Co-authored-by: Robin Freyler --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index a9a5b153358..788ca3149b1 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -59,7 +59,7 @@ The resulting list of notable projects using [`musttail`](https://clang.llvm.org - [CHERIoT RTOS](https://github.com/microsoft/cheriot-rtos/blob/3e6811279fedd0195e105eb3b7ac77db93d67ec5/sdk/core/allocator/alloc.h#L1460) (a realtime operating system with memory safety) - [Pyston](https://github.com/pyston/pyston/blob/6103fc013e9dd726efca9100a22be1ac08c58591/pyston/aot/aot_gen.py#L276) (a performance-optimizing JIT for Python) - [upb](https://github.com/classicvalues/upb/blob/2effcce774ce05d08af635ba02b1733873e73757/upb/port_def.inc#L177) (a small protobuf implementation in C) -- Robbepop's [wasm3](https://github.com/Robbepop/wasm3/blob/1a6ca56ee1250d95363424cc3a60f8fd14f24fa7/source/m3_config_platforms.h#L86) ("the fastest WebASsembly interpreter") +- [wasm3](https://github.com/Robbepop/wasm3/blob/1a6ca56ee1250d95363424cc3a60f8fd14f24fa7/source/m3_config_platforms.h#L86) ("the 
self-proclaimed fastest WebAssembly interpreter") The resulting list of notable projects using [computed goto](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html): From e548a962791e020c3f92efb8b33cf127ffb077bf Mon Sep 17 00:00:00 2001 From: phi-go Date: Thu, 25 May 2023 16:31:28 +0200 Subject: [PATCH 85/92] Update wasm3 link Co-authored-by: Robin Freyler --- text/0000-explicit-tail-calls.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index 788ca3149b1..b9334b622b9 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -59,7 +59,7 @@ The resulting list of notable projects using [`musttail`](https://clang.llvm.org - [CHERIoT RTOS](https://github.com/microsoft/cheriot-rtos/blob/3e6811279fedd0195e105eb3b7ac77db93d67ec5/sdk/core/allocator/alloc.h#L1460) (a realtime operating system with memory safety) - [Pyston](https://github.com/pyston/pyston/blob/6103fc013e9dd726efca9100a22be1ac08c58591/pyston/aot/aot_gen.py#L276) (a performance-optimizing JIT for Python) - [upb](https://github.com/classicvalues/upb/blob/2effcce774ce05d08af635ba02b1733873e73757/upb/port_def.inc#L177) (a small protobuf implementation in C) -- [wasm3](https://github.com/Robbepop/wasm3/blob/1a6ca56ee1250d95363424cc3a60f8fd14f24fa7/source/m3_config_platforms.h#L86) ("the self-proclaimed fastest WebAssembly interpreter") +- [wasm3](https://github.com/wasm3/wasm3/blob/1a6ca56ee1250d95363424cc3a60f8fd14f24fa7/source/m3_config_platforms.h#L86) ("the self-proclaimed fastest WebAssembly interpreter") The resulting list of notable projects using [computed goto](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html): From d38ba99211b1969cd4203b8526f246bc4466c89c Mon Sep 17 00:00:00 2001 From: phi-go Date: Fri, 26 May 2023 09:33:41 +0200 Subject: [PATCH 86/92] update motivation examples split up internal use and use for code generation --- text/0000-explicit-tail-calls.md | 24 
+++++++++++++++--------- 1 file changed, 15 insertions(+), 9 deletions(-) diff --git a/text/0000-explicit-tail-calls.md b/text/0000-explicit-tail-calls.md index b9334b622b9..a2a3dda2534 100644 --- a/text/0000-explicit-tail-calls.md +++ b/text/0000-explicit-tail-calls.md @@ -40,28 +40,22 @@ The list of programs is generated as follows: - GitHub was searched for [uses of `musttail`](https://github.com/search?q=%2Fclang%3A%3Amusttail%7C__attribute__%5C%28%5C%28musttail%5C%29%5C%29%2F&type=code) and [uses of computed goto](https://github.com/search?q=%2Fgoto+%5C*%5Ba-zA-Z%28%5D%2F&type=code). GitHub's search only returns five pages, so this is only a sampling. - The most popular projects are picked and each result is checked to confirm that `musttail` or computed gotos are used. - Additionally, for `musttail`, which was only introduced in Clang 13, projects that have comments which indicate the desire to use `musttail` once legacy compiler support can be dropped are included as well. (Of which, there are two: FreeRADIUS and Pyston). +- Some projects use `musttail` (either Clang's or LLVM's) for code generation only, these are placed in a separate section. It is noted which of these projects expose guaranteed TCE to user code. (One project, Swift, uses it both internally and for code generation.) 
The resulting list of notable projects using [`musttail`](https://clang.llvm.org/docs/AttributeReference.html#musttail): - [Protobuf](https://github.com/protocolbuffers/protobuf/blob/755f572a6b68518bde2773d215026659fa1a69a5/src/google/protobuf/port_def.inc#L337) -- [Julia](https://github.com/JuliaLang/julia/blob/aea56a9d9547cff43c3bcfb3dac0fff91bd53793/src/llvm-multiversioning.cpp#L696) - [Swift](https://github.com/apple/swift/blob/670f5d24577d2196730f08762f2e70be10363cf3/stdlib/public/SwiftShims/swift/shims/Visibility.h#L112) -- [Zig](https://github.com/ziglang/zig/blob/5744ceedb8ea4b3e5906175033f634b17287f3ca/lib/zig.h#L110) -- [GHC](https://github.com/ghc/ghc/blob/994bda563604461ffb8454d6e298b0310520bcc8/rts/include/Stg.h#L372) -- [Firefly](https://github.com/GetFirefly/firefly/blob/8e89bc7ec33cb8ffa9a60283c8dcb7ff62ead5fa/compiler/driver/src/compiler/passes/ssa_to_mlir/builder/function.rs#L1388) (a BEAM/Erlang implementation) -- [MLton](https://github.com/MLton/mlton/blob/d082c4a36110321b00dc099858bb640c4d2d2c24/mlton/codegen/llvm-codegen/llvm-codegen.fun#L1405) (a Standard ML compiler) -- [FreeRADIUS](https://github.com/FreeRADIUS/freeradius-server/blob/fb281257fb86aa83547d5dacecebc12271d091ab/src/lib/util/lst.c#L560) (a RADIUS implementation) - [Skia](https://github.com/google/skia/blob/bac819cdc94a0a9fc4b3954f2ea5eec4150be103/src/opts/SkRasterPipeline_opts.h#L1205) (a graphics library from Google) +- [CHERIoT RTOS](https://github.com/microsoft/cheriot-rtos/blob/3e6811279fedd0195e105eb3b7ac77db93d67ec5/sdk/core/allocator/alloc.h#L1460) (a realtime operating system with memory safety) +- [FreeRADIUS](https://github.com/FreeRADIUS/freeradius-server/blob/fb281257fb86aa83547d5dacecebc12271d091ab/src/lib/util/lst.c#L560) (a RADIUS implementation) (_planning to use_) - Example [BPF code](https://blog.cloudflare.com/assembly-within-bpf-tail-calls-on-x86-and-arm/) from a Cloudflare blog post - 
[RSM](https://github.com/rsms/rsm/blob/d539fd5f09876700c0c38758f2b4354df433dd1c/src/rsmimpl.h#L115) (a virtual computer in the form of a virtual machine) - [Tails](https://github.com/snej/tails/blob/d3b14fcce18c542211bc1fd37e378f667fdee42f/src/core/platform.hh#L52) (a Forth-like interpreter) - [Jasmin](https://github.com/asoffer/jasmin/blob/f035ef0752c09846331c8deb2109e4ebfce83200/jasmin/internal/attributes.h#L13) (a stack-based byte-code interpreter) -- [CHERIoT RTOS](https://github.com/microsoft/cheriot-rtos/blob/3e6811279fedd0195e105eb3b7ac77db93d67ec5/sdk/core/allocator/alloc.h#L1460) (a realtime operating system with memory safety) -- [Pyston](https://github.com/pyston/pyston/blob/6103fc013e9dd726efca9100a22be1ac08c58591/pyston/aot/aot_gen.py#L276) (a performance-optimizing JIT for Python) - [upb](https://github.com/classicvalues/upb/blob/2effcce774ce05d08af635ba02b1733873e73757/upb/port_def.inc#L177) (a small protobuf implementation in C) - [wasm3](https://github.com/wasm3/wasm3/blob/1a6ca56ee1250d95363424cc3a60f8fd14f24fa7/source/m3_config_platforms.h#L86) ("the self-proclaimed fastest WebAssembly interpreter") - The resulting list of notable projects using [computed goto](https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html): - The [Linux](https://github.com/torvalds/linux/blob/933174ae28ba72ab8de5b35cb7c98fc211235096/kernel/bpf/core.c#L1678) kernel @@ -72,6 +66,18 @@ The resulting list of notable projects using [computed goto](https://gcc.gnu.org - [Ruby](https://github.com/ruby/ruby/blob/31b28b31fa5a0452cb9d5f7eee88eebfebe5b4d1/regexec.c#L2171) (they use it in their [regex](https://github.com/ruby/ruby/blob/31b28b31fa5a0452cb9d5f7eee88eebfebe5b4d1/regexec.c#L2171) engine as well as in their [interpreter](https://github.com/ruby/ruby/blob/31b28b31fa5a0452cb9d5f7eee88eebfebe5b4d1/vm_exec.h#L98)) - [HHVM](https://github.com/facebook/hhvm/blob/7b0dc442a81861ee65a2fc09afe51adf89faea70/hphp/runtime/vm/bytecode.cpp#L5690) (a PHP implementation from Facebook) 
+The resulting list of notable projects using [`musttail`](https://clang.llvm.org/docs/AttributeReference.html#musttail) for code generation: + +- [Swift](https://github.com/apple/swift/blob/ba67156608763a58fc0dbddbc9d1ccce2dc05c02/lib/IRGen/IRGenModule.cpp#L583) +- [Zig](https://github.com/ziglang/zig/blob/5744ceedb8ea4b3e5906175033f634b17287f3ca/lib/zig.h#L110) (+ guaranteed [TCE exposed](https://ziglang.org/documentation/master/#call) to user code) +- [GHC](https://github.com/ghc/ghc/blob/994bda563604461ffb8454d6e298b0310520bcc8/rts/include/Stg.h#L372) (+ guaranteed [TCE exposed](https://wiki.haskell.org/Tail_recursion) to user code) +- [Clang](https://github.com/llvm/llvm-project/blob/59ad9c3f38c285e988072d100931bcbfb24196fb/clang/lib/CodeGen/CGCall.cpp#L544) (+ guaranteed [TCE exposed](https://clang.llvm.org/docs/AttributeReference.html#musttail) to user code) +- [Julia](https://github.com/JuliaLang/julia/blob/aea56a9d9547cff43c3bcfb3dac0fff91bd53793/src/llvm-multiversioning.cpp#L696) +- [Firefly](https://github.com/GetFirefly/firefly/blob/8e89bc7ec33cb8ffa9a60283c8dcb7ff62ead5fa/compiler/driver/src/compiler/passes/ssa_to_mlir/builder/function.rs#L1388) (a BEAM/Erlang implementation) (+ guaranteed [TCE exposed](https://www.erlang.org/doc/reference_manual/functions.html#tail-recursion) to user code) +- [MLton](https://github.com/MLton/mlton/blob/d082c4a36110321b00dc099858bb640c4d2d2c24/mlton/codegen/llvm-codegen/llvm-codegen.fun#L1405) (a Standard ML compiler) (+ guaranteed TCE exposed to user code) +- [Pyston](https://github.com/pyston/pyston/blob/6103fc013e9dd726efca9100a22be1ac08c58591/pyston/aot/aot_gen.py#L276) (a performance-optimizing JIT for Python) (_planning to use_) + + # Guide-level explanation [guide-level-explanation]: #guide-level-explanation