-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validity of wide pointer metadata #166
Comments
Is there consensus about the validity invariant of the vtable pointer ? E.g. if the vtable pointer can be null, then there doesn't need to be a vtable at all. If the vtable is a reference, then whether there needs to be a valid vtable at all would depend on whether validity of references is transitive. Or do we want to be more strict about vtables ? |
Nit: The issue title talks about references, but the text (and |
Food for thought example: creating a As long as the nightly-only impl<T: ?Sized> Weak<T> {
pub fn new() -> Weak<T>
where <T as Pointee>::Metadata: Default
{
unimplemented!()
}
} (obvious disadvantage: unnecessary complexity confusing beginners in 99% of use-cases) Equally, it might be reasonable to eventually (with arbitrary self types) change the slice API so that impl<T> [T] {
pub const len(self: *const Self) -> usize {
unimplemented!()
}
} ...which in this case could actually be accessed via let w: Weak<[u8]> = Weak::new_unsized(42);
assert_eq!(w.as_raw().len(), 42); Heck, you could even imagine a As for how this would interact with trait objects, the compiler can probably implement |
Good point! That actually explains part of my confusion. I updated text and title. |
I just realized that with the new title, this discussion would now also apply to fat raw pointers. But IIRC we agreed that fat raw pointers do not have to have valid metadata. So actually maybe this thread should be about references? |
Similar to #72 (comment), I wonder if it would be worth to write up a (small) RFC that says that fat raw pointers (including |
@RalfJung So like Do we assume the metadata for all wide pointers is always safely castable to |
Do you have a concrete application in mind? |
Exactly. And if we decide we are fine with uninitialized integers, then so would be wide raw pointer metadata.
I guess we should clarify that everything we are saying right now is only for wide pointer types that already exist (slices and trait objects). I do not want to unnecessarily constrain custom DST designs. |
Having a hard time coming up with one, other than trait objects where it's plausible you'd want to call a method on the trait for a potentially invalid pointer. Basically, that'd allow a call like this to work: |
Dereferencing the raw pointer will require the vtable to be valid. The question here is about invariants that are maintained even when the pointer is just "passed around". |
Note the "dyn Foo" - I'm not talking about an example where we're dereferencing the pointer itself. Rather, I'm talking about a potential new feature. |
It seems reasonable to say that a pointer/reference to Note that by dereference I mean the |
It seems like rustc actually marks the vtable in wide raw pointers as non-null. That is quite surprising to me. Is that really what we want? |
Turns out for slices however, safe code can create raw slices with invalid metadata: 123456789 as *const [(); usize::max_value()] as *const [()] as *const [u64] What a mess. Maybe the size condition is just pat of the safety invariant for slices? Then at least we'd uniformly say that metadata has to be valid for raw pointers. |
I'm surprised by mere suggestion of any restriction on the size part of a raw slice (apart from things that apply to all I don't even see why it would have to be part of the safety invariant. |
Maybe we can consider the metadata as part of the pointee instead of as part of the pointer ? That is, the validity of What a |
Aren't references always assumed dereferenceable |
That would depend on what the validity of references requires (#77 ) (EDIT: and/or the aliasing model). It could require that, and that would allow us to emit For wide pointers, we cannot, in general, emit fn foo(x: &[T]) { ... } How would we emit In cases like: static X: [i32; 4] = [0, 1, 2, 3];
pub unsafe fn foo() -> /* dereferenceable(16) */ &'static [i32] { &X } it is always correct to emit EDIT: Arguably, LLVM could add a new attribute that allowed expressing this, similar to how the |
This is probably my mistake. I learned about the size restriction for slices and was like "sounds like a validity invariant to me". And then when metadata validity was expanded to raw pointers I naturally also expanded this. But you are right, the more natural way to treat this is to fold this into the requirement that
I meant that for safe slices only. We agree it is part of the safety invariant for
At least for now, I'd like to be as strict as we can, and demand them to be dereferencable for |
@RalfJung Ok, I see.
Yes. |
I must say I'm really surprised by this discussion. there's no good reason I can think of why this shouldn't be valid: https://play.rust-lang.org/?gist=50e814066bcb1f4e9f7fc8c2ca71c6c0 I really hope that unsafe raw pointers will stay that way, if we start requiring validity then what makes them raw pointers? |
Notice that wide raw pointers never were "raw" in that sense. Since Rust 1.0, the compiler assumed that vtables are non-null. So, there's nothing to "leave as-is" here. But we might suggest in an RFC to change them to be "fully raw".
That's not really an accurate summary. By "pointers", do you mean everything that has a pointer value or just specifically raw pointers? I am assuming the latter but it is not clear. And even then, you need to make sure that there are no references anywhere that are in conflict with what you are doing with your raw pointers. That is more than just making sure they are "valid" -- "valid" is a technical term in these discussions (also see our glossary).
Just to make sure we are talking about the same thing, the validity requirement is for the metadata of a wide raw pointer only. And (repeating myself) that metadata never was "raw", so it cannot stay that way either. We started requiring validity for that before Rust 1.0. What you are suggesting is to change them to be that way.
The good reason that has been raised in this thread is using raw pointers as method receivers in object-safe traits. |
I'm sorry if I wasn't clear enough (wrote it in the middle of the night).
Yes, I understand that. But I still think a lot of the ecosystem(and the FFI/unsafe one) doesn't know/understand this.
Yes, I talked about just raw pointers. and again everything "you need to make sure" is only at dereference time.
You're right, I shouldn't have said "stay this way" I was more thinking about letting my brain model of unsafe stay the way it is :P
Can these even be useful without dereferencing the raw pointer? You keep saying "only the metadata need to be valid". so how can I create a null/dangling pointer with a valid metadata? :) |
You can't. ;) But the reason I said that is that thin raw pointers are unaffected. I doubt there is a "lot of ecosystem" that has to handle wide raw pointers -- and in FFI they should never occur as their representation is unstable. Right now a raw trait object pointer consists of a data pointer and a vtable pointer but that is not a stable guarantee. Code relying on the data layout of wide pointers (or references) is just as incorrect as code relying on the data layout of So what I am saying is, the code you are talking about has some more fundamental problems than NULL vtables. Someone needs to push through an RFC that stabilizes the representation of wide pointers; before that happens I think there is very little motivation for talking about their validity requirements.
That's fair. ;) But sometimes models are wrong and need to be adjusted to reality... that said, I had a similar reaction to you first, which is why I am generally in favor of not making any validity requirements for raw pointers. But that would make calling a method on a raw |
So let's remove the possiblity of raw pointers to dyn traits. Why do we need them in the language? if you can't construct "invalid" pointers and you can't do pointer arithmetic(which also makes no sense in a pointer pointing to a trait and not a struct) on them then let's leave references only :).
IIRC wide pointers as in slices do have a stable memory layout, (struct Slice {ptr: *const T, len: usize}).
My only question now is what is the usefullness of raw in summary:
Now I might be overstating it and the vtable always points to a static location which means it has no life time to it etc, but it still feels wrong to me. |
You can create a null trait object by first creating a null pointer to a sized type, then casting or coercing to the unsized pointer. |
You can have "invalid" pointers as far as the data ptr is concerned; just the vtable has to be valid. There's still plenty of use-cases for that. In particular this means a general So, "defeating the borrow checker" as you called it, works fine with raw trait object pointers.
You are wrong. Generally, in terms of data layout, the only things you can rely on are things that are explicitly documented. There is no implicit stabilization of layout. |
the UCG docs currently document that such a layout is the real layout, but that is an unstable implementation detail and subject to change etc etc. |
In this case the vtable pointer would be valid and non-null, though. |
Yes, I meant that in reply to "how can I create a null/dangling pointer with a valid metadata?" |
When the UCGs get RFC'ed, this layout becomes guaranteed for the cases mentioned in the document at least (e.g. no multi-trait objects). |
Accepted RFC#2580 Pointer Metadata is currently written such that all Specifically, it provides the API pub fn metadata<T: ?Sized>(ptr: *const T) -> <T as Pointee>::Metadata;
pub struct DynMetadata<DynTrait: ?Sized> { ... }
impl<DynTrait> DynMetadata<DynTrait> {
pub fn size(self) -> usize;
pub fn align(self) -> usize;
pub fn layout(self) -> alloc::Layout;
} This API does only mandate a safety invariant on pointer metadata, but it is at least interesting in the context of this discussion to note that it applies to raw pointers. |
That sounds like it could still make a bunch of existing code unsound since "raw pointers do not have a safety invariant" seems like a reasonable assumption people would make. |
Status update: the safety invariant (not validity invariant) for wide raw pointer metadata are pretty much fixed at this point:
However the validity invariants can of course be more liberal than that. |
With rust-lang/rust#124220, Miri now enforces the validity invariant on dyn trait pointers and references as: must point to a vtable for the right trait. My current inclination is that we should use very liberal validity invariants for metadata of wide raw pointers, which matches the pointer itself: it has to be initialized, and that's it. This means we have to get rid of the niche in wide raw ptr vtable pointers. Meanwhile, for references we should go as strong as possible: for dyn Trait, require the vtable to be for the right trait; for slices, require the full size of the type (unsized slice tail plus the statically sized prefix) to fit into |
I have opened #516 to track the remaining question here, which is about vtable pointers. |
The discussions at #76 and #77 are about the validity invariant that all references have to maintain in general; this issue here is specifically about the validity invariant of the metadata component of any fat pointer -- which do not have to be references!
Rc<[u8]>
is also a fat pointer. The expected metadata depends on the type of the "unsized tail".For slices, it seems clear that the invariant will be the same as that for
usize
. But fordyn Trait
pointers, what should we require about the vtable? My guess would have been at least as much as an&[usize; 3]
(a vtable has at least 3 slots: size, alignment and drop function, not necessarily in that order). But in this forum thread, the valid question came up whyWeak::new
does not work fordyn Trait
, and indeed it could if we could "fake" the fat pointer component (though doing that generically seems hard).And then there is the question how this interacts with eventual custom/user-defined DSTs.
The text was updated successfully, but these errors were encountered: