-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: No (opsem) Magic Boxes #3712
base: master
Are you sure you want to change the base?
Conversation
This comment was marked as resolved.
This comment was marked as resolved.
Clarify the constraint o the invariant in footnote Co-authored-by: Jacob Lifshay <programmerjake@gmail.com>
It feels odd that one of the clear options is left out: why not expose a Like, I definitely agree that I agree that whatever happens shouldn't be specific to the |
I, at least, fully expect us to eventually have some way of writing alignment-obeying raw pointers in Rust in some way. If nothing else, (Transmuting between EDIT: added a word to try to communicate that I wasn't expecting this RFC to include such a type. |
That's my hope for the future as well, but to avoid the RFC becoming too cluttered, I am refraining from defining such a type in this RFC. |
Is there a list of optimisations that depend on noalias being emitted for Box’es? |
The RFC seems pretty clear that noalias hasn't really provided many benefits compared to being an extra burden to uphold for implementers, but maybe it is worth seeing if there are any sources that can provide a bit more detail on that. |
text/3712-box-yesalias.md
Outdated
* A pointer with an address that is not well-aligned for `T` (or in the case of a DST, the `align_of_val_raw` of the value), or | ||
* A pointer with an address that offsetting that address (as though by `.wrapping_byte_offset`) by `size_of_val_raw` bytes would wrap arround the address space | ||
|
||
The [`alloc::boxed::Box<T>`] type shall be laid out as though a `repr(transparent)` struct containing a field of type `WellFormed<T>`. The behaviour of doing a typed copy as type [`alloc::boxed::Box<T>`] shall be the same as though a typed copy of the struct `#[repr(transparent)] struct Box<T>(WellFormed<T>);`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The urlo definition is probably good.
It's defined in the opsem, but I don't know if we have a very good written record of that other than spread arround zulip threads and github issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a fan of this. I think that people moving from Vec<T>
to Box<[T]>
having to deal with drastically-different soundness rules is a giant footgun, and getting rid of the special [ST]B behaviour here sounds good to me.
My general take: The two "endpoints" here are
From what I can tell, we current orient If I could go back in time, I think I would favor end-user abstractions and offer different types (e.g., |
The RFC would benefit from some attempt to quantify the impact on performance, though our lack of standardized runtime benchmarks makes that hard. |
@rust-lang/opsem: We were curious in our discussions, does this RFC represent an existing T-opsem consensus? |
It does not represent any FCP done by T-opsem, which is why I've included them here. The claims I make, including those about the impact on the operation semantics, are included in the request for comment and consensus.
I recall some perf PR's (using the default rustc-perf suite) being done to determine the impact, which showed negligible impact. I can probably pull them up at some point in the RFC's lifecycle.
|
|
||
(Note that we do not define this type in the public standard library interface, though an implementation of the standard library could define the type locally) | ||
|
||
The following are not valid values of type `WellFormed<T>`, and a typed copy that would produce such a value is undefined behaviour: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Reference has been adjusted a while ago to state validity invariants positively, i..e by listing what must be true, instead of listing what must not be false. IMO that's more understandable, and the RFC should be updated to also do that.
There are patterns of using a custom per-
It's "every LLVM optimization that looks at alias information". The question is how much that matters in practice, which is hard to evaluate.
As Connor said, not in any formal sense. Several opsem members have expressed repeatedly that they want to see My own position is that I love how this simplifies the model and Miri, I am just slightly concerned about this being an irreversible loss of optimization potential that we might regret later. Absence of evidence of optimization benefit is not evidence of absence. Our benchmarks likely just don't have a lot of functions that take Is there a way we can query the ecosystem for functions taking |
- While the easiest alternative is to do nothing and maintain the status quo, as mentioned this has suprisingly implications both for the operational semantics of Rust | ||
- Alternative 2: Introduce a new type `AlisableBox<T>` which has the same interface as `Box<T>` but lacks the opsem implications that `Box<T>` has. | ||
- This also does not help remove the impact on the opsem model that the current `Box<T>` has, though provides an ergonomically equivalent option for `unsafe` code. | ||
- Alternative 3: We maintain the behaviour only for the unparameterized `Box<T>` type using the `Global` allocator, and remove it for `Box<T,A>` (for any allocator other than `A`), allowing unsafe code to use `Box<T, CustomAllocator>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is actually the status quo, since rust-lang/rust#122018
Just to follow up on some of the discussion, it wasn't immediately clear to me that types similar to I would still love if there's more data showing the lack of returns on |
…C++ counterpointer `std::unique_ptr`, in the prior art section
FTR, speaking right now as one of the main developers of lccc, my opinion is that the best way to mitigate any loss of future optimization potential is to just be far more granular with
I mentioned |
And more than them just not having them, IIRC someone tried to implement |
would still love if there's more data showing the lack of returns on noalias optimisations, since it feels wrong that something with a lot of history and usage isn't helping that much
It helps for references. I suspect people added it for Box because "why not".
|
Co-authored-by: Ralf Jung <post@ralfj.de>
Just catching up with things. I wanted to share that for automatic differentiation (and likely GPU Kernels written in pure Rust) noalias is very relevant. At the llvm-dev meeting I've been presenting 5 benchmark that showed Performance Penalties in 4/5 Benchmarks ranging from 30% - 10x by removing noalias from the Rust code. Removing noalias from the C++ version of the benchmark also had large performance impacts (up to 4x), so it's not a pure Rust problem, and noalias is just generally very valuable for AutoDiff and GPU support. The only benchmark not suffering didn't had enough pointers/references for noalias to be valuable. For now I have to admit that I have written all benchmarks using slices or arrays and not box, but both autodiff and GPU offload work on llvm-ir, and there are logical reasons about why noalias has such a big performance impact. So I don't see any reason why the penalty for noalias should be smaller for box. I will try to get at least some of these Benchmarks ported to use a Box instead of slices to prove my point, but for now I can share the existing slice/array versions that I used for my talk: EnzymeAD/Enzyme#1797 That being said, both autodiff and gpu-offloading are of course only experimental project goals, and noalias on box is of course less relevant than noalias on say slices. However, I think especially efficient GPU support is valuable beyond the scientific Computing/HPC/ML crowd and of interest for a lot of people. It might still be worth dropping it if it causes too many compiler issues, I haven't read up on the whole discussion yet. But we should at least be aware then that it likely will have relevant performance impact for these targets. If desired, I can also reach out to some of the LLVM people involved in GPU programming, they surely can provide more numbers/examples on noalias usage there. Edit: Iirc Nikita also pointed out that people often pass Box behind a reference, which would keep noalias, so that might be enough for our cases (I will need to check the IR). |
The idea is that box doesn’t get passed around that much: you usually just pass a reference, and mutable references have (This is distinct from passing a box behind a reference / a reference to a box, which would have |
Agree, but in our case the existence of 2+ Box types inside of a function (with at least one write to it) would often force us to cache the data behind all Boxes for autodiff (instead of just the one we modified), so it's not only about passing it to other functions. Vectors also don't have noalias iirc so I think also any combination of 2+ vec/box types existing in one function could then cause unnecessary copies. I want to be clear that the average IR from Rust is still miles ahead of the average IR we get from C++ (since people rarely use restrict), but the difference would be that people manually could fix C++ by adding restrict, whereas Rust users wouldn't be able to add noalias to Rust's builtin types. It might still be worth to sacrifice performance in some niche cases for easier correctness, I'll try to write up in more detail under which cases lacking noalias would have an impact for the HPC/etc. community so people can decide if it's worth it. I just wanted to share this as an FYI, while I'm working out more details on it. |
Within a function you already don't get noalias, due to limitations of llvm and what code rustc emits. |
Are you talking about sret preventing noalias? Yes, we're following the work on the llvm side too, e.g. https://discourse.llvm.org/t/restrict-noalias-for-struct-members/78568, so we hope to increase the number of locations where llvm can use noalias info. I however don't have precise insights on how many of these llvm noalias improvements will later be useable by Rust and where we e.g. already allow unsafe code to alias and thus won't be able to add noalias. |
No, he's talking about the fact that syntactically in LLVM IR
This can't be happening today since, as Connor said, Would be good to know which parts of your comments are based on concrete data and which are speculative. |
That's a good point that I had not considered. Would be good to get some data on the GPU side here, too. |
They could delegate to an inner function taking (Like how autovectorization works better on |
Just to clarify, LLVM does have something called |
Yeah |
I think we should just do this, and make I agree with Niko's post above that ideally we'd have made Thus I think it's better that we not have a big footgun of code that's sound with @rfcbot fcp merge |
Team member @scottmcm has proposed to merge this. The next step is review by the rest of the tagged team members:
No concerns currently listed. Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
@rustbot labels +I-lang-nominated
Is this footgun actually big, though? It gave me a lot of pause about doing this when @RalfJung said:
If that's true, maybe we should consider whether we can in fact go the other way somehow. As a rule, I would like to make things less magical. But here -- I don't know -- I wonder whether it's an improvement to lose this optimization and to, in some situations, have to suggest that people write inner functions like this: // Without the inner function, this would take a long long time with
// `-Zbox-noalias=no`.
pub fn f_box(mut x: Box<[u64; 1]>, y: Box<[u64; 1]>) -> u64 {
fn inner(x: &mut [u64; 1], y: &[u64; 1]) -> u64 {
let mut z: u64 = 0;
for i in 0..const { 2u64.pow(40) } {
x[0] = i;
z = z.wrapping_add(y[0]);
}
z
}
inner(&mut *x, &*y)
} That's its own kind of new footgun that we'd be adding. It's particularly kind of odd here since there's still of course the library invariant that a The stated motivation for this RFC leans a lot on the So I just wonder whether there's a strong enough reason to do this, given (quoting @RalfJung again):
|
What is this based on? I'm not aware of any checker that we have for the noalias rules that would enable a systematic search. |
It is based on the fact that, to my knowledge, so far we have not discussed even an artificial example of code that violates It is often claimed that examples like this are UB under let mut v = Vec::with_capacity(10);
v.push(0);
let ptr0 = &raw mut v[0];
let mut v = v; // move the Vec
v.push(1);
let ptr1 = &raw mut v[1];
ptr::swap(ptr0, ptr1); That's just not true. Stacked Borrows and Tree Borrows rule out patterns like this, but We don't have any way to do systematic testing. But not even having a single realistic example of such UB does make me wonder how much of a problem it would be. IMO we should be asking: what is the support for the claim that adding |
So, here is such an example: unsafe fn f(mut v: Vec<i32>, ptr: *mut i32) {
*ptr = 0;
v[0] = 0;
}
let mut v = vec![0];
let ptr = &raw mut v[0];
f(v, ptr); This is UB because inside OTOH, also note that we could currently not even compile this to a |
There’s always an option of having a |
That would have to be a weaker |
FTR, I don't like this argument - whether or not its true, it has no bearing on what is undefined behaviour in Rust. None of the proposed aliasing models for Rust that I've seen are "Exactly Part of the point of this RFC is removing special cases in the memory model, so IMO it's completely against the proposal to add even more special-cased rules to SB or TB to handle something closer to what |
Rust doesn't yet have an aliasing model, only several WIP proposals. If there's some good benefit from having |
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
For the remainder of this section, let `WellFormed<T>` designate a type for *exposition purposes only* defined as follows: | ||
```rust | ||
#[repr(transparent)] | ||
struct WellFormed<T: ?Sized>(core::ptr::NonNull<T>); | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any differences semantically between what is proposed for Box<_>
in this RFC and what would be true of MaybeDangling<Box<_>>
today?
That is, since we already accepted RFC 3336, if there is no daylight between these, then it should say that normatively, and it perhaps could even lean into that for defining the semantics.
cc @RalfJung
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We haven't spelled out whether MaybeDangling<Box<_>>
would allow pointers too close to the end of the address space, but it seems reasonable to say "no" to that. In that case the answer to your question is yes, this RFC proposes to weaken Box
so that its validity requirements are equivalent to MaybeDangling<Box<_>>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any differences semantically between what is proposed for
Box<_>
in this RFC and what would be true ofMaybeDangling<Box<_>>
today?
Restating Ralf's answer to this, MaybeDangling
only removes aliasing invariants, but preserves the normal validity invariants of the type. This RFC proposes to remove the aliasing invariants from Box<_>
as a whole, so it would naturally leave it in an identical state to MaybeDangling<Box<_>>
.
However, this also spells out that validity invariant in full, which we cannot rely on existing types yet to do.
In the case of allocators[^3], without special handling of them in the language as well, the protectors assigned to `Box<T>` were violated by (almost) any non-trivial allocator that provides the storage itself (without delegating to something like `malloc` or `alloc::alloc::alloc`). This is because the allocators access the same memory that the `Box` stores to mark it as deallocated and available again. In an extreme example, the same memory could even be yielded back to another `Allocator::allocate` call. Solving this requires special casing `Allocator`s, which is a heavily unresolved discussion, only applying the special opsem behaviour to `Global`, which is opaque via the global allocator functions, or forgoing custom allocators for `Box` entirely (thus depriving anyone needing to use a custom allocator from the user-visible language features `Box` alone provides). With the exception of the former, which is desired for other optimization reasons though [heavily debated and not resolved](https://github.com/rust-lang/unsafe-code-guidelines/issues/442), these solutions are merely solving the symptom, not the problem. | ||
|
||
Any `unsafe` code that may want to temporarily maintain aliased `Box<T>`s for various reasons (such as low-level copy operations), or may want to use something like `ManuallyDrop<Box<T>>`, is put into an interesting position: While they can use a user-defined smart pointer, this requires both care on the part of the smart pointer implementor, but also affects the ergonomics and expressiveness of that code, as `Box<T>` has many special language features that surface at the syntax level, which cannot be replicated today by a user-defined type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problems described in these motivation items are also solved by MaybeDangling
, no?
Either way, the motivation should be adjusted to describe this. That is, it's confusing for the motivation to be written as though we did not already cover this ground and accept RFC 3336. If that RFC did solve the problem, but the idea is that the present proposal solves it better for Box<_>
somehow, then that should be described here. Or alternatively, if there's some way in which RFC 3336 did not solve the problem, then that should be detailed specifically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only the last one. unless we say that in order to use a custom allocator with Box
, you have always have to wrap the box in MaybeDangling
(in which case, I'm not sure how to create one in the first instance).
|
||
In the case of `ManuallyDrop<T>`, because `Box<T>` asserts aliasing validity on a typed copy, and is invalidated on drop, it introduces unique behaviour - `ManuallyDrop<Box<T>>` *cannot* be moved after calling `ManuallyDrop::drop` *even* to trusted code known not to access or double-drop the `Box`. No other type in the language or library has the same behaviour[^2], as primitive references do not have any behaviour on drop (let alone behaviour that includes invalidating themselves), and only `Box<T>`, references, and aggregates containing references are retagged on move. There are proposed solutions on the `ManuallyDrop<T>` type, such as treating specially by supressing retags on move, but this is a novel idea (as `ManuallyDrop<T>` asserts non-aliasing validity invariants on move), and it would interfere with retags of references without justification. The proposed complexity is only required because of `Box<T>`. | ||
|
||
In the case of allocators[^3], without special handling of them in the language as well, the protectors assigned to `Box<T>` were violated by (almost) any non-trivial allocator that provides the storage itself (without delegating to something like `malloc` or `alloc::alloc::alloc`). This is because the allocators access the same memory that the `Box` stores to mark it as deallocated and available again. In an extreme example, the same memory could even be yielded back to another `Allocator::allocate` call. Solving this requires special casing `Allocator`s, which is a heavily unresolved discussion, only applying the special opsem behaviour to `Global`, which is opaque via the global allocator functions, or forgoing custom allocators for `Box` entirely (thus depriving anyone needing to use a custom allocator from the user-visible language features `Box` alone provides). With the exception of the former, which is desired for other optimization reasons though [heavily debated and not resolved](https://github.com/rust-lang/unsafe-code-guidelines/issues/442), these solutions are merely solving the symptom, not the problem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't properly explain why "(almost) any non-trivial allocator" violates noalias
.
There is a class of non-trivial custom allocators that violates Stacked Borrows, specifically if it accesses "metadata memory" that is stored outside the region returned by the allocator, using the pointer that was passed in to deallocate
. If that's what you mean, it should be stated explicitly. Is that really "almost any non-trivial allocator"? That seems like a strong claim. It took a while for Miri to run into this issue.
With Tree Borrows, at least some of these cases are not UB any more, since Tree Borrows supports the &Header
pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took a while for Miri to run into this issue.
Most people likely don't use custom allocators with miri.
Summary
Currently, the operational semantics of the type
alloc::boxed::Box<T>
is in dispute, but the compiler adds llvmnoalias
to it. To support it, the current operational semantics models have the type use a special form of theUnique
(Stacked Borrows) orActive
(Tree Borrows) tag, which has aliasing implications, validity implications, and also presents some unique complications in the model and in improvements to the type (e.g. Custom Allocators). We propose that, for the purposes of the runtime semantics of Rust,Box
is treated as no more special than a user-defined smart pointer you can write today1. In particular, it is given similar behaviour on a typed copy to a raw pointer.Rendered
Footnotes
We maintain some trivial validity invariants (such as alignment and address space limits) that a user cannot define, but these invariants only depend upon the value of the
Box
itself, rather than on memory. ↩