-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Constrained Naked Functions #2972
Changes from all commits
a3c6633
a580606
9122563
92b909a
7ecd561
08adbc0
3537016
7a02895
c1e2bbd
3e7940e
d4a418b
4137ed6
8e40788
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,131 @@ | ||
- Feature Name: `constrained_naked` | ||
- Start Date: 2020-08-06 | ||
- RFC PR: [rust-lang/rfcs#2972](https://github.com/rust-lang/rfcs/pull/2972) | ||
- Rust Issue: [rust-lang/rust#90957](https://github.com/rust-lang/rust/issues/90957) | ||
|
||
# Summary | ||
This document attempts to increase the utility of [naked functions](https://github.com/rust-lang/rfcs/blob/master/text/1201-naked-fns.md) by constraining their use and increasing their defined invariants. | ||
|
||
# Motivation | ||
|
||
Naked functions have long been a feature of compilers. These functions are typically defined as normal functions in every regard, except that the compiler does not emit the function prologue and epilogue. Rust's early attempt to support this feature ([RFC 1201](https://github.com/rust-lang/rfcs/blob/master/text/1201-naked-fns.md)) mostly copied the existing compiler behaviors. | ||
|
||
However, naked functions are often avoided in practice because their behavior is not well defined. The root cause of this problem is that naked functions are defined by negation: they are functions which lack a prologue and epilogue. Unfortunately, functions that lack a prologue and epilogue present a number of complicated problems that the compiler needs to solve and developers need to work around. And there is a long history of compilers and developers getting this wrong. | ||
|
||
This document seeks to define naked functions in a much more constrained, positivistic way. In doing so, naked functions can become more useful. | ||
|
||
# Naked function definition | ||
|
||
A naked function has a defined calling convention and a body which contains only assembly code which can rely upon the defined calling convention. | ||
|
||
A naked function is identified by the `#[naked]` attribute and: | ||
1. should specify a calling convention besides `extern "Rust"`. | ||
1. should define only FFI-safe arguments and return types. | ||
1. must not specify the `#[inline]` or `#[inline(*)]` attribute. | ||
1. must have a body which contains only a single `asm!()` statement which: | ||
1. may be wrapped in an `unsafe` block. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why "may"? I would expect all invocations of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @bstrie All invocations of |
||
1. must not contain any operands except `const` or `sym`. | ||
1. must contain the `noreturn` option. | ||
bjorn3 marked this conversation as resolved.
Show resolved
Hide resolved
|
||
1. must not contain any other options except `att_syntax`. | ||
1. must ensure that the calling convention is followed or the function is `unsafe`. | ||
|
||
In exchange for the above constraints, the compiler commits to: | ||
1. produce a clear error if any of the above requirements are violated. | ||
1. produce a clear warning if any of the above suggestions are not heeded. | ||
1. disable the unused argument lint for the function (implicit `#[allow(unused_variables)]`). | ||
1. never inline the function (implicit `#[inline(never)]`). | ||
1. emit no additional instructions to the function body before the `asm!()` statement. | ||
|
||
As a (weaker) corollary to the last compiler commitment, the initial state of all registers in the `asm!()` statement conform to the specified calling convention. | ||
|
||
# Explanation | ||
|
||
Since a naked function has no prologue, any naive attempt to use the stack can produce invalid code. This certainly includes local variables. But this can also include attempts to reference function arguments which may be placed on the stack. This is why a naked function may only contain a single `asm!()` statement. | ||
|
||
Further, since many platforms store the return address on the stack, it is the responsibility of the `asm!()` statement to return in the appropriate way. This is why the `options(noreturn)` option is required. | ||
|
||
Any attempt to use function arguments, even as operands, may cause stack access or modification. Likewise, any register operands may cause the compiler to attempt to preserve registers on the stack. Since the function has no prologue, this is problematic. To avoid this problem, we simply refuse to allow the use of any function arguments in Rust. | ||
|
||
If this were the end of the story, naked functions would not be very useful. In order to re-enable access to the function arguments, the compiler ensures that the initial state of the registers in the `asm!()` statement conform to the function's calling convention. This allows hand-crafted assembly access to the function arguments through the calling convention. Since the `extern "Rust"` calling convention is undefined, its use is discouraged and an alternative, well-defined calling convention should be specified. Likewise, since the `asm!()` statement can access the function arguments through the calling convention, the arguments themselves should be FFI safe to ensure that they can be reliably accessed from assembly. | ||
|
||
Because naked functions depend upon the calling convention so heavily, inlining of these functions would make code generation extremely difficult. Therefore, we disallow inlining. | ||
|
||
Since the `const` and `sym` operands modify neither the stack nor the registers, their use is permitted. | ||
|
||
## Examples | ||
|
||
This function adds three to a number and returns the result: | ||
|
||
```rust | ||
const THREE: usize = 3; | ||
|
||
#[naked] | ||
pub extern "sysv64" fn add_n(number: usize) -> usize { | ||
unsafe { | ||
asm!( | ||
"add rdi, {}" | ||
"mov rax, rdi" | ||
"ret", | ||
const THREE, | ||
options(noreturn) | ||
); | ||
} | ||
} | ||
``` | ||
|
||
The calling convention is defined as `extern "sysv64"`, therefore we know that the input is in the `rdi` register and the return value is in the `rax` register. The `asm!()` statement contains `options(noreturn)` and therefore we handle the return directly through the `ret` instruction. We can provide a `const` operand since it modifies neither registers nor stack. Since we have strong guarantees about the state of the registers, we can mark this function as safe and wrap the `asm!()` statement in an `unsafe` block. | ||
|
||
# Drawbacks | ||
|
||
Implementing this will break compatibility of existing uses of the nightly `#[naked]` attribute. All of these uses likely depend on undefined behavior. If this is a problem, we could simply use a different attribute. | ||
|
||
This definition may be overly strict. There is certainly some code that would work without this. The counter argument is that this code relies on undefined behavior and is probably not worth preserving. It might also be possible to reasonably ease the constraints over time. | ||
|
||
This proposal requires the use of assembly where it is theoretically possible to use Rust in a naked function. However, practically the use of Rust in naked functions is nearly impossible and relies on extensively undefined behavior. | ||
|
||
Adopting this definition changes the invariants of `asm!()`. Currently, all registers not supplied as operands to `asm!()` contain undefined values. This proposal changes this to define that the initial register state is unmodified from the function call. This is an even stronger commitment than merely guaranteeing calling convention conformance, which some may dislike. However, the change to permit defined initial register state applies **only** to the use of `asm!()` as the body of a naked function. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is it worth making this difference in behavior explicit in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @bstrie I'm willing to defer to @Amanieu on this one. However, I believe this phenomenon emerges already from the commitment (already enforced by a lint) that a naked function must contain only a single The important point here is that the compiler behavior is well defined and any failure of that behavior is a bug. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yea this is a bit of a wierd edge case.
Perhaps we want to use a different macro for naked function bodies? Or something more exotic? I'm honestly not sure. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not confident that we can invent something better than we have now -- particularly as features are added to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Would it make sense for naked functions to use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @bstrie No. Naked functions additionally provide mangling, debuginfo and docs. |
||
|
||
Refusing to allow argument operands means that architectures that have multiple calling conventions (i.e. x86_64 SystemV vs Windows) cannot share function bodies. This could be remedied with a future improvement. | ||
|
||
# Alternatives | ||
|
||
## Do nothing | ||
|
||
We could do nothing and let naked functions work as they currently do. However, this is likely to be a source of a long stream of difficult compiler bugs and therefore there is no clear path to stabilization. Further, because of the lack of clear constraints, naked functions today are hard to use correctly. And when the developer fails to get every detail right, the result can be hard to debug. | ||
|
||
## Remove naked functions | ||
|
||
Another possibility is to simply remove support for naked functions altogether. This does solve the undefined behavior problem. But it forces the developer to pursue other options. Most notably `global_asm!()` or using an external assembler. | ||
|
||
It would be possible to use `global_asm!()` to define functions with existing constraints. However, there is not currently a clear path to stabilization for `global_asm!()` since it is a thin wrapper around LLVM functionality. Further, `global_asm!()` does not provide features like namespacing and documentation. Nor can you use `global_asm!()` to provide `const` or `sym` operands, which are very useful. | ||
|
||
Alternatively, developers could use an external assembler and link in the result. This approach is similar to `global_asm!()` but offloads the problem to external tooling such as the `cc` crate. It has the same drawbacks as `global_asm!()` and also puts additional requirements on compilers of the software. | ||
|
||
# Prior art | ||
|
||
All languages represented here follow the weaker definition of naked functions: | ||
|
||
| | supported | | ||
|-------------------|-----------| | ||
| | | | ||
| C/C\+\+ \(GCC\) | x | | ||
| C/C\+\+ \(Clang\) | x | | ||
| C/C\+\+ \(MSVC\) | x | | ||
| C/C\+\+ \(ICC\) | | | ||
| D | x | | ||
| Go | | | ||
| Nim | x | | ||
| Rust | x | | ||
| Zig | x | | ||
|
||
# Unresolved questions | ||
|
||
All outstanding questions have been resolved. | ||
|
||
# Future possibilities | ||
|
||
It would be possible to define new calling conventions that can be used with naked functions. | ||
|
||
A previous version of this document defined an `extern "custom"` calling convention. It was observed in conversation that calling conventions are really a *type* and that it could be useful to have calling conventions as part of the type system. In the interest of moving forward with constrained naked functions, it is best to limit the scope of this RFC and defer this (very good) conversation to a future RFC. As a simple workaround, naked functions which do not conform to their specified calling convention should be marked as unsafe and the caller requirements should be documented in the safety section of the documentation per standard convention. | ||
|
||
It may also be possible to loosen the definition of a naked function in a future RFC. For example, it might be possible to allow the use of some additional, possibly new, operands to the `asm!()` block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why "should" instead of "must not"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bstrie An earlier version of this document did have
must not
. It was removed because someone (IIRC @amluto) objected that there are useful naked functions that can guarantee compliance to an undefined calling convention. I mention this explicitly in the pull request commit message.