-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC for unsafe blocks in unsafe fn #2585
Changes from all commits
cbbd0ce
51ca5b4
77a9992
aa85ef7
5d46a2d
5571c39
2550a0b
153d286
8cb1b3f
949590b
ec454f2
1b3e1dd
2e47c41
814301e
9838e5d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,210 @@ | ||
- Feature Name: unsafe_block_in_unsafe_fn | ||
- Start Date: 2018-11-04 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
No longer treat the body of an `unsafe fn` as being an `unsafe` block. To avoid | ||
a breaking change, this is a warning now and may become an error in a future | ||
edition. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
Marking a function as `unsafe` is one of Rust's key protections against | ||
undefined behavior: Even if the programmer does not read the documentation, | ||
calling an `unsafe` function (or performing another unsafe operation) outside an | ||
`unsafe` block will lead to a compile error, hopefully followed by reading the | ||
documentation. | ||
|
||
However, we currently entirely lose this protection when writing an `unsafe fn`: | ||
If I, say, accidentally call `offset` instead of `wrapping_offset`, or if I | ||
dereference a raw pointer thinking it is a reference, this happens without any | ||
further notice when I am writing an `unsafe fn` because the body of an `unsafe | ||
fn` is treated as an `unsafe` block. | ||
|
||
For example, notice how | ||
[this PR](https://github.com/rust-lang/rust/pull/55043/files) significantly | ||
increased the amount of code in the thread spawning function that is considered | ||
to be inside an `unsafe` block. | ||
|
||
The original justification for this behavior (according to my understanding) was | ||
that calling this function is anyway unsafe, so there is no harm done in | ||
allowing *it* to perform unsafe operations. And indeed the current situation | ||
*does* provide the guarantee that a program without `unsafe` cannot be UB. | ||
However, this neglects the other aspect of `unsafe` that I described above: To | ||
make the programmer aware that they are treading dangerous ground even when they | ||
may not realize they are doing so. | ||
|
||
In fact, this double role of `unsafe` in `unsafe fn` (making it both unsafe to | ||
call and enabling it to call other unsafe operations) conflates the two *dual* | ||
roles that `unsafe` plays in Rust. On the one hand, there are places that | ||
*define* a proof obligation, these make things "unsafe to call/do" (e.g., the | ||
language definition says that dereferencing a raw pointer requires it not to be | ||
dangling). On the other hand, there are places that *discharge* the proof | ||
obligation, these are "unsafe blocks of code" (e.g., unsafe code that | ||
dereferences a raw pointer has to locally argue why it cannot be dangling). | ||
|
||
`unsafe {}` blocks are about *discharging* obligations, but `unsafe fn` are | ||
about *defining* obligations. The fact that the body of an `unsafe fn` is also | ||
implicitly treated like a block has made it hard to realize this duality | ||
[even for experienced Rust developers][unsafe-dual]. (Completing the picture, | ||
`unsafe Trait` also defines an obligation, that is discharged by `unsafe impl`. | ||
Curiously, `unsafe trait` does *not* implicitly make all bodies of default | ||
functions defined inside this trait unsafe blocks, which is somewhat | ||
inconsistent with `unsafe fn` when viewed through this lens.) | ||
|
||
[unsafe-dual]: https://github.com/rust-lang/rfcs/pull/2585#issuecomment-577852430 | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
The `unsafe` keyword in Rust serves two related purposes. | ||
|
||
When you perform an "unsafe to call" operation, like dereferencing a raw pointer | ||
or calling an `unsafe fn`, you must enclose that code in an `unsafe {}` block. | ||
The purpose of this is to acknowledge that the operation you are performing here | ||
has *not* been checked by the compiler, you are responsible yourself for | ||
upholding Rust's safety guarantees. Generally, unsafe operations come with | ||
detailed documentation for the conditions that must be met when this operation | ||
is executed; it is up to you to check that all these conditions are indeed met. | ||
|
||
When you are writing a function that itself has additional conditions to ensure | ||
safety (say, it accesses some data without making some necessary bounds checks, | ||
or it takes some raw pointers as arguments and performs memory operations based | ||
on them), then you should mark this as an `unsafe fn` and it is up to you to | ||
document the conditions that must be met for the arguments. This use of the | ||
`unsafe` keyword makes your function itself "unsafe to call". | ||
|
||
The same duality can be observed in traits: `unsafe trait` is like `unsafe fn`; | ||
it makes implementing this trait an "unsafe to call" operation and it is up to | ||
whoever defines the trait to precisely document what is unsafe about it. | ||
`unsafe impl` is like `unsafe {}`, it acknowledges that there are extra | ||
requirements here that are not checked by the compiler and that the programmer | ||
is responsible to uphold. | ||
|
||
For this reason, "unsafe to call" operations inside an `unsafe fn` must be | ||
contained inside an `unsafe {}` block like everywhere else. The author of these | ||
functions has to ensure that the requirements of the operation are upheld. To | ||
this end, the author may of course assume that the caller of the `unsafe fn` in | ||
turn uphold their own requirements. | ||
|
||
For backwards compatibility reasons, this unsafety check inside `unsafe fn` is | ||
controlled by a lint, `unsafe_op_in_unsafe_fn`. By setting | ||
`#[deny(unsafe_op_in_unsafe_fn)]`, the compiler is as strict about unsafe | ||
operations inside `unsafe fn` as it is everywhere else. | ||
|
||
This lint is allow-by-default initially, and will be warn-by-default across all | ||
editions eventually. In future editions, it may become deny-by-default, or even | ||
a hard error. | ||
|
||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
The new `unsafe_op_in_unsafe_fn` lint triggers when an unsafe operation is used | ||
inside an `unsafe fn` but outside `unsafe {}` blocks. So, the following will | ||
emit a warning: | ||
|
||
```rust | ||
#[warn(unsafe_op_in_unsafe_fn)] | ||
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T { | ||
x.get_unchecked(i) | ||
} | ||
``` | ||
|
||
Moreover, if and only if the `unsafe_op_in_unsafe_fn` lint is not `allow`ed, we | ||
no longer warn that an `unsafe` block is unnecessary when it is nested | ||
immediately inside an `unsafe fn`. So, the following compiles without any | ||
warning: | ||
|
||
```rust | ||
#[warn(unsafe_op_in_unsafe_fn)] | ||
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T { | ||
unsafe { x.get_unchecked(i) } | ||
} | ||
``` | ||
|
||
However, nested `unsafe` blocks are still redundant, so this warns: | ||
|
||
```rust | ||
#[warn(unsafe_op_in_unsafe_fn)] | ||
unsafe fn get_unchecked<T>(x: &[T], i: usize) -> &T { | ||
unsafe { unsafe { x.get_unchecked(i) } } | ||
} | ||
``` | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
Many `unsafe fn` are actually rather short (no more than 3 lines) and will end | ||
up just being one large `unsafe` block. This change would make such functions | ||
less ergonomic to write, they would likely become | ||
|
||
```rust | ||
unsafe fn foo(...) -> ... { unsafe { | ||
// Code goes here | ||
} } | ||
``` | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Other possible drawbacks to list:
unsafe fn frobnicate(x: T, y: U, ...) -> R {
unsafe {
... // Actual code.
}
} and then nothing has been gained. I don't know what the risk of this is, but worth mentioning. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I added (a variant of) 1. For 2., I think something has been gained: It is not possible to incrementally improve this function's There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I'm not entirely sure 2. is a drawback or not; I usually try to write the section as what someone else might think is a potential drawback (but not necessarily me) -- i.e. this is the section where I try to bring out my inner Dr. Phil / empathy =P There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The drawbacks section now says
That should cover 2, right? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @RalfJung the concern is actually slightly different here; it is that people will just go and write one big There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Isn't that already possible today? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @mark-i-m yes; sure -- the concern is that the change we do here might not have any noticable effect cause people could be lazy and... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wouldn't it be good that people can explicitly choose to not write a big |
||
# Rationale and alternatives | ||
[rationale-and-alternatives]: #rationale-and-alternatives | ||
|
||
To achieve the goals laid out in the motivation section, the proposed approach | ||
is least invasive in the sense that it avoids introducing new keywords, and | ||
instead relies on the existing lint mechanism to perform the transition. | ||
|
||
One alternative always is to not do anything, and live with the current | ||
situation. | ||
|
||
We could avoid using `unsafe` for dual purpose, and instead have `unsafe_to_call | ||
fn` for functions that are "unsafe to call" but do not implicitly have an | ||
`unsafe {}` block in their body. For consistency, we might want `unsafe_to_impl | ||
trait` for traits, though the behavior would be the same as `unsafe trait`. | ||
|
||
We could introduce named proof obligations (proposed by @Centril) such that the | ||
compiler can be be told (to some extend) if the assumptions made by the `unsafe | ||
fn` are sufficient to discharge the requirements of the unsafe operations. | ||
|
||
We could restrict this requirement to use `unsafe` blocks in `unsafe fn` to | ||
those `unsafe fn` that contain at least one `unsafe` block, meaning short | ||
`unsafe fn` would keep compiling like they do now. | ||
|
||
And of course, the lint name is subject to bikeshedding. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
The only other language that I am aware of that has a notion of `unsafe` blocks | ||
and `unsafe` functions is C#. It | ||
[looks like](https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/unsafe) | ||
there, unsafe operations can be freely used inside an `unsafe` function even | ||
without a further `unsafe` block. However, based on @Ixrec's experience, | ||
`unsafe` plays hardly any role in the C# ecosystem and they do not have a | ||
culture of thinking about this in terms of proof obligations. | ||
|
||
# Unresolved questions | ||
[unresolved-questions]: #unresolved-questions | ||
|
||
What is the timeline for adding the lint, and cranking up its default level? | ||
Should the default level depend on the edition? | ||
|
||
Should we ever make this deny-by-default or even a hard error, in a future | ||
edition? | ||
|
||
Should we require `cargo fix` to be able to do *something* about this warning | ||
before making it even warn-by-default? (We certainly need to do something | ||
before making it deny-by-default or a hard error in a future edition.) `cargo | ||
fix` could add big `unsafe {}` blocks around the entire body of every `unsafe | ||
fn`. That would not improve the amount of care that is taken for unsafety in | ||
the fixed code, but it would provide a way to the incrementally improve the big | ||
functions, and new functions written later would have the appropriate amount of | ||
care applied to them from the start. Potentially, `rustfmt` could be taught to | ||
format `unsafe` blocks that wrap the entire function body in a way that avoids | ||
double-indent. "function bodies as expressions" would enable a format like | ||
`unsafe fn foo() = unsafe { body }`. | ||
|
||
It is not entirely clear if having the behavior of one lint depend on another | ||
will work (though most likely, it will). If it does not, we should try to find | ||
some other mechanism to opt-in to the new treatment of `unsafe fn` bodies. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is curious at all. Or at least, I think its very easy to justify: The methods of an
unsafe trait
are not themselvesunsafe fn
. So it would be unsound to make their default method definitions implicitly haveunsafe
bodies.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rationale: We have an
unsafe
keyword associated to an item with a body (trait
,impl
, orfn
in this case), should the body be logically surrounded by aunsafe
block? (all items in thetrait
/impl
or all statements in afn
). Currently the answer differs betweentrait
/impl
andfn
, and that's curious.If we apply your logic to
unsafe fn
, we shouldn't be wrapping the body of anunsafe fn
with anunsafe
block, because that would be unsound (anunsafe
fn is requires the caller to enforce invariants, and says nothing about the callee).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this depends on the particular definition of "unsound".
unsafe
is still required to cause any UB. And the boy of afn
inside anunsafe impl
already does not to be manually reviewed to avoid UB elsewhere (in callers that rely on the unsafe part of the contract).I agree with you for our conventional definition of "unsound", though.