Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Pointer metadata & VTable #2580
RFC: Pointer metadata & VTable #2580
Changes from 23 commits
0bb757e
d4acf9b
8de354c
24a2ff8
dd3de2f
637647f
99fbed9
74e185c
65418d0
ef9ee32
f6d95d7
e1aa0df
18f1fef
3dd1484
8ede231
fb6ebad
bf80163
2edbe38
bb5e09a
bc2a2e7
7be2349
538b9aa
e47aa6e
29c7547
1e5622f
04ba25f
f94f627
236e657
7b0175e
3ff39f0
50b567b
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so... I'm assuming the compiler implements
Which means theoretically we could make
Vtable
generic overT
allowing thedrop_in_place
method to take a raw pointer with the correct pointee type?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These impls would be accurate in current Rust, but what I had in mind instead was that the compiler would automatically generate impls, similar to what it does for the
std::marker::Unsize
trait. As far as the standard library is concerned these impls would be "magic", not based on specialization.Regardless, yes, making
VTable
generic with a type parameter for the trait object type is possible.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having to list
Unpin
here is concerning... does that mean that if we get another auto trait in libstd we'll want to add it here but cannot due to backwards compatibility?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the trait bounds there in the first place? I can understand
Copy
and maybeSend + Sync
, since you copy the metadata when you copy the pointer. But why does metadata need to beOrd
orHash
? If some function really needs those, it could always useT: Pointee<Metadata: Ord>
or similar.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've explained the rationale of this list in #2580 (comment) and
Unpin
in #2580 (comment).@comex Because the following works on stable all the way back to 1.0:
Without introducing
?DynSized
, theT: ?Sized
have to match all types including slices and trait pointers andextern type
and custom DST pointers. The fat-pointer comparison is always implemented as(a.ptr, a.meta) < (b.ptr, b.meta)
without requiringT::Metadata: PartialOrd
inis_before
. This means such bound must be implicitly satisfied everywhere, and thus included in the list.Every other bound are included for the same reason that
*const T
and/or&T
unconditionally implements that trait.@RalfJung we could tweak the auto trait rule so that it needs to check
T::Metadata
first before impl for*const T
,*mut T
,&T
and&mut T
, assuming auto-trait is not stabilized beforePointee
is implemented.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it?
This RFC proposes that this trait is automatically implemented for all types, so "manual" implementations of it cannot exist. (Unless some day the language grows types that cannot be used behind a pointer, but I can’t imagine how that would be useful. And even then, I assume we’d disallow manual impls.)
And it’s only those impls that would potentially be broken by adding a new bound, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really? I would expect it to be implemented as
a.ptr < b.ptr
, ignoring the metadata. That seems like a more reasonable implementation?I see. That leaves no room for custom DST though. I am aware those are out-of-scope, but they should still be possible to add in a future-compatible way I assume.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that’s a good point. I suppose when custom DSTs are added, then we may need to freeze those bounds.
Indeed, it looks like fat-pointer comparison is represented specifically in MIR, and lowered to LLVM integer-comparison instructions here:
https://github.com/rust-lang/rust/blob/1e2a73867/src/librustc_codegen_ssa/mir/rvalue.rs#L630-L642
It doesn’t call
PartialOrd
methods. In fact it’s the other way around:PartialOrd
uses the<
operator which is special-cased for raw pointers to not call the trait:https://github.com/rust-lang/rust/blob/1e2a73867/src/libcore/ptr/mod.rs#L2930
So maybe custom DSTs would also need language-level changes in this area.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is_before
is a very good example for why it needs to be comparable etc. But I'm missing why it needs to beSend
andSync
(I mean if the DST type is Send+Sync then the meta must be Send+Sync, too. But if not it also doesn't need to have that bounds I think).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Today we have
impl<T: ?Sized> !Send for *mut T {}
(similarlySync
) in libcore, without aT: Send
bound. For a wide pointer to beSend
, the metadata needs to beSend
as well.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's better to actually introduce thin pointers instead of using this trait alias.
The problem is that this trait alias doesn't necessary play well with structs containing DSTs (which might again be "composed" DSTs). It should be possible to have a thin pointer to such a struct and get access to all of it's fields without needing any unsafe code. Except the DST field which would if referenced just return a thin reference to the DST. For many thinks it would make writing unsafe code around all kinds of DST types much easier if we actually had thin pointers.
For example we could have something like (scatch):
The reason why it's a lang item is because it has a bit special deref handling, mainly:
struct Foo { f1: usize, f2: usize, dst: [u8] }
let foo: &Thin<Foo> = ...
,&foo.f1
is of type&usize
&foo.f2
is of type&usize
&foo.dst
is of type&Thin<[u8]>
This (or something similar) has some benefits:
&Thin<[T]>
,&mut Thin<[T]>
,*const Thin<[T]>
,*mut Thin<[T]>
*raw
or similar) which we maybe might addfrom_raw_parts
more type save by accepting*const Thin<T>
instead of an*const ()
By keeping the type instead of using
*const ()
we should need less complex unsafe code to e.g. implement a ThinBox for trait objects shown above we would "just" get the thin pointer to the DST field and then pass it with the metadata tofrom_raw_parts
.Sure this doesn't really eliminate the data alignment fixing done in
data_ptr
it just moves it from the user written library into the implementation offrom_raw_parts
(aThin<dyn Trait>
would not be guaranteed to be correctly aligned as we simply can't know the alignment without knowing the type so thefrom_raw_parts
impl. would need to do whatdata_ptr
does).Still moving this tricky bits from user libraries to the core/std library seems to be a good idea I think and the improved type safety around
from_raw_parts
would be good too as (I think) in many case we don't handle completely erased types but just the erasure of the exact metadata used.Through we probably still would need to have a way to create a
*const Thin<dyn Trait>
from a*const ()
for some ffi aspects.(Note that a
&Thin<dyn Trait>
is kinda pointless but a&Thin<DSTContainingDynTrait>
isn't as you still could access other fields safely).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An additional problem I found is that you can't have unaligned references in rust. It's UB.
So
&Thin<dyn Trait>
would need to "encapsulate" the unalignedness while not doing so for everything for which we know the alignment i.e. another reason whyThin
would be a lang item. Keeping with how rust tends to do thinks it might be more appropriate to makeThin
a "special" syntax. But then we would need to have thinks like&thin X
,&mut thin X
, or similar. But having aThin<T>
does just work very egonomical with the rest of rust.(Solved below)
Another problem I found is that
Thin<dyn Trait>
orThin<[u8]>
as a unknown layout in the sense that:Thin<dyn Trait>
(Thin<dyn Trait>
is on the outside well but "unknown" aligned), as we can't move it in memory anyway due to now knowing the size we can just report an alignment of 1, but between the start ofThin<dyn Trait>
and the data ofdyn Trait
there is some unknown padding to "fix" the alignment of the internal field.But
Layout::for_value(&Thin<[u8]>)
does still need to return a value.So either we return a "not so useful" value like alignment 1 and size of
usize::MAX
, which I don't like or we need to directly have Thin pointers instead of a Thin wrapper around theT: Pointee
type.Like
ThinRef<'a,T>
,ThinMut<'a,T>
,ThinConstPtr
,ThinMutPtr
. (as e.g.#[...] struct ThinMutPtr<T>(ptr: *mut (), PhantomData<T>)
)Which is slightly less elegant but would still work otherwise like described for
Thin
above.This also would fix the inner alignment problem much easier as e.g.
ThinRef<'a, T>
could be implemented as#[...] struct ThinRef<'a, T> { ptr: *const (), marker: PhantomData<&'a T> }
which works just fine for a not-necessary-correctly alignedThinRef<'a, dyn Trait>
.The only real problem would be if we point to not-allocated memory. But we are guaranteed to point to allocated memory.
The only potential wrong thing is the alignment if the pointer is to a DST field. In which case we might point into the padding in front of the field instead of the field directly. Which is fine and expected (see the
data_ptr
example and what I wrote previously about it).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
Thin
trait alias proposed here describes an existing property of some existing types and the existing pointer/reference types to those.*const u32
is a thin pointer,*const str
is not. Thereforeu32: Thin
holds,str: Thin
does not.In today’s Rust a struct that contains a DST field is itself also a DST and therefore pointers to it are wide, not thin.
Introducing thin pointers to DSTs (structs or not) is an entirely new language feature that is not part of this RFC. If you feel that feature should be pursued at the language level, consider writing a separate RFC for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This RFCs would enable a third-party library to implement a type like
ThinBox
that stores the metadata in the heap allocation and keeps theBox
-like pointer thin.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have in the end 2 kinds of DSTs one which I refereed to as composed DST's which have any number of non DST fields and a tailing DST fields and what I will refer to as fundamental DSTs which are currently
[T]
anddyn Trait
.The problem is that without thin pointers to DSTs which allow access to non DST fields any library
ThinBox
implementation would need to either create a fat-pointer or use a lot of unsafe code to access non DST fields.This makes it harder to implement all kind of custom DST types like e.g. slice with the length prefixing the slice or similar.
Mainly a lot of unsafe code is needed which with thin pointers to DST can be moved into the
from_raw_parts
implementation in core/std.Also semantically the
from_raw_parts
functions(s) does take a thin pointer to an DST and meta data and creates a potential non-thin pointer. But instead we represent it as taking a untyped pointer and metadata making the functions more unsafe then they need to be IMHO (as you could put in a "wrong" thinned pointer.While we seem to clearly disagree on this, thin pointers to DST are IMHO a fundamental part of the API around handling pointer metadata as they are the direct representation of fat-pointers without metadata.
But then a new RFC could also try to amend any non yet stabilized RFC so I guess putting this into a separate RFC is worth a try.
My goal is to long term make it possible to handle/write DST types without needing to know about all kinds of unsafe easy to get wrong without noticing it details like alignment fixing making custom DSTs available to any rust programmer.
I will see if I can find time to write a RFC for this in the next week or so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One potential improvement I see with
DynMetadata
is that it does two levels of erasure: 1. the erasure fromdyn
and 2. the erasure from a specificdyn Trait
todyn *
.If we would keep the specific
dyn Trait
we might in the future add trait specific vtable based methods.(This would mean
struct DynMetadata<T:?Sized>
e.g.DynMetadata<dyn Debug>
)But I'm not sure if this is worth the effort and added complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was discussed before (when DynMetadata still was VTable.)
One of the major arguments against it was the potential increased compiler time. (Me guessing: Which given that this is a auto trait might be non neglibale?? Maybe?)
I wonder if we have a way to erase the exact type of DynMetadata having just
impl DynMetadata
instead. But I guess this would require (existential)impl Trait
features we currently don't have to be doable and nicely usable.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve added an unresolved question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure
&()
is valid?The empty tuple
()
is a zero-sized type. Which means it can not be allocated in the classical sense which also implies that a&()
with an arbitrary address can't be created in a safe way in rust and as such using&()
as such the compiler should be able to assume (it currently doesn't) that all&()
have the same compiler predetermined address. Or did I miss something?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is certainly not true today, and it would be hard to make true. When you take a reference to the first field of a
((), i32)
, you'll get a reference that actually points to that memory where the pair is stored.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved, see comments below it turns out responding by EMail doesn't work with "sub-threads".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is private field, so its exact type is an implementation detail that’s not too important for the purpose of this RFC. I’ve changed it to
NonNull
.(I’m not sure whether making all
&ZeroSizeType
have a fixed address is realistic or what advantage that would bring, but that’s out of scope for this RFC.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No drawbacks section...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I came up short trying to think of a reason not to do this at all (as opposed to doing it differently). Suggestions welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would that avoid forcing this? Can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without a type parameter,
x.size()
withx: &'static VTable
necessarily executes the same code for any vtable. With a type parameter,x: &'static VTable<dyn Foo>
andx: &'static VTable<dyn Bar>
are different types and could execute different code. (For example, do table lookup with different offsets.) However, keeping the offset ofsize
the same within all vtables might be desirable regardless of this API.