-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unnamed fields of struct and union type #2102
Conversation
Also discuss the drawbacks of that alternative.
Possible alternative: Use struct Foo { x: i32, y: i32 }
struct Bar { _: Foo, z: i32 }
fn main() {
let bar = Bar { x: 1, y: 2, z: 3 };
bar.x = bar.y * 2 + bar.z;
} |
@retep998 There are two aspects to that, and I'd like to unpack them and discuss them separately: the use of First, the use of Second, the separate declarations of types. Since the inner fields appear directly within the containing type, it seems to me an advantage to require declaring them inline within that containing type, so that you can easily find and reason about all of those fields in one place. Allowing a separate structure declaration, and then expanding the fields of that structure declaration into another structure, seems more likely to lead to confusion and action-at-a-distance. On top of that, it would require several additional semantic considerations. What happens if I can certainly document this as an alternative within the RFC. But could you explain whether this is an alternative you would prefer, and if so, why? Or are you primarily mentioning it for the sake of completeness, to make sure we've fully explored the design space? |
RE type Foo struct {
x int
y int
}
type Bar struct {
Foo
z int
} Interestingly, Go does not consider it an error to have overlapping field names-- it just uses the top-most field with the given name. |
Well the big advantage that I see it having is the fact that it removes all questions about the behavior of types declared inline. Questions regarding visibility, attributes, rustdoc, trait implementations, and so on are rendered moot because the type is defined normally instead of inline. Also, your RFC combines two concepts, anonymous fields and anonymous structs. It is entirely possible in C to have an anonymous inline struct with a named field, or to have a named inline struct with a named field, whereas this RFC only allows for anonymous inline structs with an anonymous field. My alternative touches on exclusively anonymous fields allowing it to be more flexible in the ways it can be used, and could be combined with a future proposal for anonymous structs/unions. Furthermore, for someone actually using such a type, they would not be able to observe any difference between a struct declaring an anonymous field of an anonymous type and an anonymous field of a named type, unless they looked in rustdoc and saw that the field's type had a name. In either case they'd still be able to do |
I am wondering if having to give a name to the field is really such a big problem that the entire language needs to get a new concept for the benefit of a really small fraction of the code written in Rust. Not being able to see the structure of a type (as in, what overlaps and what does not) is a huge footgun; mostly the compiler should do enough tracking to prevent mistakes but I still wouldn't be surprised if, in unsafe code, people would end up writing to Is there any situation besides C(++) FFI where using this new feature would be a good idea? I don't think so. This is a misfeature of C, in my opinion, and we should have really good reasons to copy that misfeature. Maybe FFI is a good enough reason, but even then I would prefer the RFC to explicitly say that we are not adding this because we think it is a good idea to have the feature on its own merrits. Or maybe, actually, the feature has its own merrits, but then the RFC should state them. |
So my use-case is for Binder (the kerne API, header at https://github.com/torvalds/linux/blob/master/include/uapi/linux/android/binder.h). This includes both nested unnamed fields (which this proposal covers) and nested named fields. An example of nested unnamed field : struct somestruct {
// A nested unnamed field
union {
binder_uintptr_t pad_binder;
__u32 fd;
};
// A nested named field
union {
__u32 handle;
binder_uintptr_t ptr;
} target;
}; Currently, when running bindgen, it will generate types like So big 👍 for me. Regarding @retep998 's proposal, I don't think that's such a good idea as my main goal is actually to "hide" those (unnamed) structs from the documentation. As for @RalfJung 's concern, I agree that this is a misfeature and probably shouldn't be used outside of providing FFI/mirroring an existing API. I guess we could make it a warning to use this feature without |
I agree with @RalfJung : this would be a bad feature for general Rust structs. As FFI is one of Rust's goals though, some way of doing this for FFI is desirable. Therefore I agree with @roblabla and think we should restrict this feature to |
There already exists "some way of doing this for FFI" - give each union field a name (if you're using Variadics are basically glued-on to allow accessing the relevant C ABI. |
That's basically inherent associated fields (#1546). We could have: #[repr(C)]
pub union _Bindgen_somestruct_Field_0 {
pad_binder: binder_uintptr_t,
fd: u32
}
#[repr(C)]
pub union _Bindgen_somestruct_Field_target {
handle: u32,
ptr: binder_uintptr_t,
}
#[repr(C)]
pub struct somestruct {
_Field_0: _Bindgen_somestruct_Field_0,
target: _Bindgen_somestruct_Field_target
};
impl somestruct {
unsafe pad_binder: self._Field_0.pad_binder;
unsafe fd: self._Field_0.fd;
}
fn foo() {
let somestruct = unsafe { somestruct {
fd: 2,
target
}};
} Anonymous types are just annoying because you can't put them in structs, can't impl them, etc. |
@roblabla It's a good thing we have |
I've tried very intentionally to separate those two, focusing exclusively on one concept (unnamed fields), and not the other (anonymous types), by ensuring that the type and aggregate is never visible or accessible, and is used solely for structure layout and field naming. I intended that as the biggest reduction in scope from the pre-RFC to this RFC. Standard C doesn't allow you to have a inline struct while naming the struct type; that's a Microsoft extension. See the mention of I also feel like inlining a named structure's fields into another structure has more of an "action at a distance" feeling; you could even do that with a type in another module, which seems quite problematic. Inlining the declaration seems much clearer to me. That also produces clearer rustdoc results that reflect the resulting structure. We could teach rustdoc to inline the fields and their documentation, but that makes the documentation not match the code. I'd rather have the code match the underlying interface. |
@RalfJung The primary value in this feature is for FFI, and I emphasized that in many places of the RFC. In particular:
@est31 Restricting this to |
I think we'd need to allow all non-Rust reprs rather than only for |
@sfackler I think the intent was that you must have |
To some extent, I intended that as a feature, for simplicity's sake. Don't think of this as an "anonymous type", think of this as a way of laying out the fields of a structure. I think this feature should be as simple as possible to handle the specific issue in question. We may want to, later, have an "anonymous type mechanism". In the pre-RFC, that proved wildly controversial. I'd like to focus on the narrowest possible thing that solves the FFI issue here. |
One of Rust's major strengths is the exceptional support for interfacing to C; that's a critical part of the story for replacing C code with Rust code. This is a pattern that occurs pervasively in Windows APIs, POSIX APIs, Linux APIs, and many other C APIs. It's so common that C11 standardized it. I also don't see this as a completely "new concept"; this is just a way of describing how the fields of a structure are laid out. The way C programmers think of this is that the uses of
And this:
(In fact, I should probably put that diagram into the RFC.) |
That's why I was suggesting it! I don't think this feature should be had outside of support for C. |
@est31 No, |
I don't have any immediate use for this feature other than FFI, but it is very much needed for FFI. Kernel-user interfaces on both Unix and Windows use the C equivalent extensively, which means something that provides this functionality is unavoidable. The absence of this feature is the reason I gave up on trying to get I'm going to work through the process of mapping the OSX struct ip6_hdr {
union {
struct ip6_hdrctl {
u_int32_t ip6_un1_flow; /* 20 bits of flow-ID */
u_int16_t ip6_un1_plen; /* payload length */
u_int8_t ip6_un1_nxt; /* next header */
u_int8_t ip6_un1_hlim; /* hop limit */
} ip6_un1;
u_int8_t ip6_un2_vfc; /* 4 bits version, top 4 bits class */
} ip6_ctlun;
struct in6_addr ip6_src; /* source address */
struct in6_addr ip6_dst; /* destination address */
} __attribute__((__packed__));
#define ip6_vfc ip6_ctlun.ip6_un2_vfc
#define ip6_flow ip6_ctlun.ip6_un1.ip6_un1_flow
#define ip6_plen ip6_ctlun.ip6_un1.ip6_un1_plen
#define ip6_nxt ip6_ctlun.ip6_un1.ip6_un1_nxt
#define ip6_hlim ip6_ctlun.ip6_un1.ip6_un1_hlim
#define ip6_hops ip6_ctlun.ip6_un1.ip6_un1_hlim (Yes, this is what you usually see in a C system header when this feature is needed on the Rust side. Most of the interfaces in question, at least on Unix, were specified long before C11 added an anonymous-struct-field feature.) (It is possible that you are supposed to be able to refer to the #[repr(C,packed)]
struct ip6_hdr {
union {
struct {
ip6_flow : u32, // 20 bits of flow-ID
ip6_plen : u16, // payload length
ip6_nxt : u8, // next header
ip6_hlim : u8, // hop limit
},
ip6_vfc : u8, // 4 bits version, top 4 bits class
},
ip6_src : in6_addr, // source address
ip6_dst : in6_addr, // destination address
} Nice and mechanical. I like it. 😉 This case answers the question "does this need to work with repr(packed)?" in the affirmative, incidentally. |
Oh, also, this quick-and-dirty check...
... indicates that there are something like 120 different structures in the OSX system headers for which this feature would be useful. (Some of them may be kernel-internal.) |
@joshtriplett Good diagram, do put it into the RFC. 😀 |
Hang on, does Rust have bitfields? Because that IPv6 header really ought to have been defined in C as struct ip6_hdr {
uint32_t ip6_ver : 4;
uint32_t ip6_tclass : 8;
uint32_t ip6_flow : 20;
uint16_t ip6_plen;
uint8_t ip6_nxt;
uint8_t ip6_hlim;
in6_addr ip6_src;
in6_addr ip6_dst;
} __attribute__((__packed__)); and this isn't just academic: I am certain that there is C out there that expects to be able to write to both I still want the feature proposed here, it's just I picked a bad example. 🤦♂️ |
@zackw Rust doesn't have bitfields yet; that's also being worked on. Perhaps you could post, as an example, two different versions of |
@joshtriplett OSX
The type |
Unclear by the RFC, but is this allowed for enum variants? e.g. would:
be allowed? I think it should, but the RFC doesn't clarify. |
@clarcharr I see no reason why that couldn't be allowed. However whether it should be allowed is another question, which can be resolved later. This feature is primarily useful for FFI, where Rust enums are very rarely used, so that specific case might not be worth supporting. |
@retep998 Considering how |
Also consider the case of offloading variants' fields to another type: That is very useful outside of FFI. |
I'd like to avoid introducing them to enums right away, precisely because they're less likely to be useful for FFI, and there's no parallel to C there. I'm open to doing so, but my inclination would be to prioritize that lower than the initial minimal step. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
I like that this proposal is confined to repr(C), because while it's conceptually and syntactically pretty unfortunate it's also one of those eat-your-vegetables things that we need to serve FFI purposes. I do want to make sure that we don't see this as being a stepping stone to more general support elsewhere though (IOW that the comparison with variadic argument lists remains apt), because I'd really hate to deal with this sort of complication in Rust code in general. |
The final comment period is now complete. |
Huzzah! 🎉 This RFC has been merged! Tracking issue: rust-lang/rust#49804 |
Rendered
Tracking issue
Allow unnamed fields of union and struct type, contained within structs and unions, respectively; the fields they contain appear directly within the containing structure, with the use of union and struct determining which fields have non-overlapping storage (making them usable at the same time). This allows grouping and laying out fields in arbitrary ways, to match C data structures used in FFI. The C11 standard allows this, and C compilers have allowed it for decades as an extension. This proposal allows Rust to represent such types using the same names as the C structures, without interposing artificial field names that will confuse users of well-established interfaces from existing platforms.
This was originally proposed in a pre-RFC, but that proposal also included a more general mechanism for anonymous types for named fields; that more general mechanism proved controversial, and people rightfully observed that we should add such types to the type system in general, not just within structures. I would like to see a solution for that as well. But based on the discussion from that pre-RFC, in this RFC I've separated out and proposed a minimal change that allows us to represent this common FFI pattern in Rust.
For more information about unnamed fields in C, see the GCC documentation.
CC @retep998 (Windows APIs use this extensively), @zackw (
siginfo_t
and similar structures), @roblabla (Android APIs), @aturon