Implement unsized unions #47650
src/librustc_typeck/check/wfcheck.rs
@@ -141,7 +141,7 @@ impl<'a, 'gcx> CheckTypeWellFormedVisitor<'a, 'gcx> {
                 self.check_variances_for_type_defn(item, ast_generics);
             }
             hir::ItemUnion(ref struct_def, ref ast_generics) => {
-                self.check_type_defn(item, true, |fcx| {
+                self.check_type_defn(item, false, |fcx| {
What's this argument and can it be removed from check_type_defn's definition (i.e. are there any non-false calls)?
The argument is called all_sized and is currently true for enums.
   |
   = help: the trait `std::marker::Sized` is not implemented for `T`
   = help: consider adding a `where T: std::marker::Sized` bound
-  = note: no field of a union may have a dynamically sized type
+  = note: only the last field of a union may have a dynamically sized type
This doesn't seem right. union doesn't have ordered fields. Maybe require that at most one field is unsized?
It's not really right, I just did it this way for now to avoid having to refactor things. I can make it more general if I must :)
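To illustrate the point about unions not having ordered fields, here is a minimal sketch using a sized union on stable Rust. `IntOrBytes` and `reinterpret` are hypothetical names made up for this example, not anything from the PR. All fields of a `#[repr(C)]` union overlap at the start of the value, so there is no meaningful "last" field in the layout sense.

```rust
use std::mem::size_of;

// Hypothetical union for illustration only; it does not appear in the PR.
#[repr(C)]
union IntOrBytes {
    int: u32,
    bytes: [u8; 4],
}

/// Writes through one field and reads the same storage back through the other.
fn reinterpret(value: u32) -> [u8; 4] {
    let u = IntOrBytes { int: value };
    // SAFETY: all four bytes were initialised by writing `int`,
    // and every bit pattern is a valid `[u8; 4]`.
    unsafe { u.bytes }
}

/// The union is as large as its largest field, not the sum of its fields.
fn union_size() -> usize {
    size_of::<IntOrBytes>()
}
```

Here `reinterpret(x)` returns the same bytes as `x.to_ne_bytes()`, and `union_size()` is 4 rather than 8, whichever field is declared first.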
Is there a specific reason why this should only be allowed for a single field? Just today I wanted to create a union with two unsized fields. I'm working with postgresql where this "varlena" format is how they represent variable-length data. This code is an exact translation of the C code I'm interacting with (of course the extern type is a 0-sized array there). I want to build |
Wait, is |
@eddyb the metadata is created during unsizing, and the actual value of the field being unsized is not needed – only the type. Once unsized, the metadata cannot change, and the unsized union field cannot be written to. There is no way of knowing which field type is active, so the return value of There are currently no types in Rust for which this isn't true, but there may be in the future, and there would have to be some way of disallowing calls to Single-field unions like |
Interesting question @eddyb. As @main-- asked, what happens if we support unions with ≥2 DST fields?
union U {
    slice: [u8],
    trobj: dyn Debug,
}
The problem is how do we represent the metadata:
I think supporting unions with ≥2 DST fields requires an RFC to define the proper semantics, and is thus best avoided in this PR.
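For context on why the metadata question is awkward, here is a small sketch on stable Rust showing that a slice pointer and a trait-object pointer both carry one extra word, but the word means different things (an element count versus a vtable pointer). The helper name `pointer_words` is made up for this example.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Number of machine words occupied by a `&T`.
fn pointer_words<T: ?Sized>() -> usize {
    size_of::<&T>() / size_of::<usize>()
}

/// For a slice, the metadata word is the element count.
fn slice_metadata(s: &[u8]) -> usize {
    s.len()
}

/// For a trait object, the metadata word is a vtable pointer; there is no
/// length to read, only behaviour reachable through the vtable.
fn trait_object_debug(d: &dyn Debug) -> String {
    format!("{:?}", d)
}
```

On current compilers `pointer_words::<[u8]>()` and `pointer_words::<dyn Debug>()` are both 2, while `pointer_words::<u8>()` is 1. A union with both a `[u8]` field and a `dyn Debug` field would need to decide which kind of metadata its own wide pointer holds.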
@mikeyhew Okay so what if I have this:
union OneOrManyBytes {
    one: u8,
    many: [u8]
}
size_of_val(&OneOrManyBytes { one: 0 })
Can I Just never create this? Does that mean a |
Given that rust-lang/rfcs#1897 wants to prohibit unsized unions altogether that does seem a little strange. At least to me it was simple and obvious/intuitive that an unsized enum would represent its metadata as a union of the metadata of its members. Clearly, the unsized metadata of a sized type is empty (union). I understand that this probably warrants an RFC though. I don't think a union can ever logically be DynSized, given that you can never know in advance which variant is active. Even if only one field is unsized, the size metadata is garbage if a sized variant is active - I feel like the concept is just ill-defined like that.
So the only way to get something useful is to borrow one of the fields, which would extract the metadata appropriate for that field, right? This definitely needs cc @nikomatsakis I like the idea of having a union of metadata but it should fit with custom DSTs.
It could be DynSized if the metadata is a struct, i.e. you provide the metadata for all fields whether or not they are active, e.g.
union U {
    a: [u8],
    b: [u16],
}
=>
struct U::Meta {
    a: usize,
    b: usize,
}
then |
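A rough sketch of how a size could be derived from such a per-field metadata struct. Everything here is hypothetical: `UMeta` is a plain stand-in for `U::Meta`, and the sizing rule (largest field, rounded up to the union's alignment, as for sized unions) is an assumption for illustration, not something the thread settles.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for `U::Meta`: one element count per unsized field.
struct UMeta {
    a: usize, // element count for `a: [u8]`
    b: usize, // element count for `b: [u16]`
}

/// Assumed rule: the union is as large as its largest field,
/// rounded up to the strictest field alignment.
fn size_of_u(meta: &UMeta) -> usize {
    let align = align_of::<u8>().max(align_of::<u16>());
    let largest = (meta.a * size_of::<u8>()).max(meta.b * size_of::<u16>());
    // Round up to a multiple of `align`.
    (largest + align - 1) / align * align
}
```

With this rule the size depends only on the metadata, never on which field is active: `size_of_u(&UMeta { a: 5, b: 1 })` gives 6 and `size_of_u(&UMeta { a: 2, b: 4 })` gives 8.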
@kennytm How do you initialize those metadata fields? Especially when trait objects are involved. |
@eddyb I think it can only be initialized through unsize coercion; or require that there is a type ( |
@kennytm Oh so the unsize coercion would have to do multiple fields in parallel or not be allowed at all, that makes a bit more sense. |
FWIW, rust-lang/rfcs#1897 doesn't try to prohibit unsized unions (they are already "prohibited" aka not supported), it just doesn't try to introduce them (they are listed in future directions). |
I agree, single-field unions are all we need for |
Regarding unions with metadata for each unsized field:
Theoretically you could unsize one field, and then unsize the other – after the first coercion, you'd have pointer metadata for one field, and after the second coercion, metadata for both. |
@mikeyhew Right, by "in parallel" I meant independently, which means you could allow multiple at once or one at a time, but you'd still be able to unsize more than one field (kind of neat, huh?). |
@eddyb oh 😄, I thought you meant "at the same time" |
@mikeyhew I did mean that, originally, but the important bit is that they're independent. |
Huh. Re-reading this thread, I'm feeling a bit lost. I certainly agree that the concept of an "unsized union" with multiple fields has a lot of question marks.
I don't get it. =) How would you unsize multiple fields independently? Would they have to have compatible metadata? |
@nikomatsakis The context is choosing |
@nikomatsakis Just clarifying #47650 (comment) since the thread seems to start from my comment. Unsizing will never modify data. It just generates the necessary metadata from the original sized type, and reinterpret-casts the data into the DST. So unsizing a
The actual representation of (I don't think we should support this in this PR.) |
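The claim that unsizing never touches the data can be checked on stable Rust with an ordinary array-to-slice coercion. This is only a sketch of the general mechanism, not of unions; `unsize` is a name made up for the example.

```rust
/// Array-to-slice unsizing: the compiler attaches the length (known from the
/// type `[u8; 4]`) as metadata and leaves the bytes where they are.
fn unsize(arr: &[u8; 4]) -> &[u8] {
    arr
}

/// Returns true if the unsized view points at the very same storage.
fn same_address(arr: &[u8; 4]) -> bool {
    unsize(arr).as_ptr() == arr.as_ptr()
}
```

For any `arr: [u8; 4]`, `same_address(&arr)` holds and `unsize(&arr).len()` is 4: the metadata comes from the type alone, and the contents are never read.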
Hmm, maybe I was thinking about it wrong. I guess that unsizing doesn't require that the actual data be valid, at least not currently. That is, But this may not be true once we hit custom dst, right? I guess it depends on just how the trait looks. |
Wait. With the custom DST proposal, |
I think by this you mean: unless it can compute the size without a reference to the actual data, i.e., purely from the metadata...right? If so, that is precisely what I was trying to get at, yeah. =) |
This is now on the unsafe code guidelines agenda: https://internals.rust-lang.org/t/proposal-reboot-the-unsafe-code-guidelines-team-as-a-working-group/7307 |
@pietroalbini thanks for the ping, and sorry for letting this slide. I'm still interested in finishing this, if it is desired
I wasn't at the meeting, so I'm not sure what specific issues were discussed around unsized unions or how unions are dropped. But would it be OK to implement unsized unions anyway, with the intention of keeping them unstable at least until the details have been figured out? |
The issue is that we may not want to make any restrictions for the bit patterns that are valid for a union. If we adopt that policy, then unsized unions with thin pointers do not work as they have to always be able to determine the size by dereferencing the union. This is not related to dropping.
I have no idea who's even in charge of such decisions. ;) They certainly have to be unstable at first. Is there an RFC for this feature? The unions RFC does not mention "unsized" at all. |
I don't think that this PR puts us into any corners that we can't get out of. My guess is that unions with thin-pointer DST fields would be completely unsized by default, since as you said you have to dereference the union to read the metadata and get the size/alignment of the union field. This PR does not support thin-pointer DSTs, but it doesn't rule them out either, it just supports the dynamically-sized types that currently exist in Rust, where the size and alignment are determined from the pointer metadata. |
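As a concrete picture of "size and alignment are determined from the pointer metadata" for the DSTs that exist today, here is a small sketch using `size_of_val` and `align_of_val` on a trait object. `layout_of` is a hypothetical helper name.

```rust
use std::fmt::Debug;
use std::mem::{align_of_val, size_of_val};

/// Returns (size, alignment) of the value behind a trait object.
/// Both numbers come from the vtable stored in the pointer metadata,
/// so the pointed-to bytes are not inspected.
fn layout_of(value: &dyn Debug) -> (usize, usize) {
    (size_of_val(value), align_of_val(value))
}
```

For example, `layout_of(&0u8)` is `(1, 1)` and `layout_of(&[0u16; 3])` is `(6, align_of::<u16>())`. A thin-pointer DST would not fit this pattern, since its size would live with the data rather than in the pointer.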
Interesting. I don't think this alternative came up in the discussions so far. But wouldn't that make them fairly useless? |
@RalfJung by themselves, yeah. But AFAIK there is no safe way to get the size of such a union, so that's what it would have to be. What were the other options that came up? |
The idea so far was that if DSTs work in unions, then they'd fully work. So This is in conflict with the idea that a union is just a bag of bytes and makes no assumptions about its contents being valid. |
And keep in mind, with custom DST you could still implement |
OK, I see. It looks like the conflict lies in one statement saying that the vtable will always be valid, and the other saying that no fields are guaranteed to be valid. That works for normal trait objects, because the vtable is stored in the pointer metadata, but not for thin trait objects, where the vtable is stored with the actual data. I always thought that at least one union field had to be active (and therefore valid). Is there an advantage to that not being the case? |
This is a discussion we are currently having in another thread. |
Ping from triage @RalfJung! What's the outcome of that discussion? |
I don't think there is an outcome yet... Someone will have to make an RFC for settling this, I think. |
Should this PR be closed in the meantime then? |
I think that's for the lang team to decide. |
Ping from triage @rust-lang/lang will someone have time to review this PR ? |
r? @eddyb
This allows unions to have at most one unsized field/variant, in order to allow ManuallyDrop<T> where T: ?Sized (see #47034). For now, I made it so that the unsized field has to be the last field, which makes the implementation simpler because there are a lot of similarities to structs.
cc #32836
Questions for reviewers:
Sized bound on ManuallyDrop in this PR? That would be insta-stable, so would require an FCP.