repr(C) on MSVC targets does not always match MSVC type layout when ZST are involved #81996
Comments
@mahkoh can you reproduce this without |
@jyn514 Yes |
repr(C) has served two purposes:
The first purpose is advertised both in the reference and in stdlib code e.g. in The second purpose is also advertised in the reference. However, these purposes are not compatible as shown above. |
The layout algorithm is not the same across all targets, it is always supposed to be whatever the C ABI mandates on that particular target |
Does this reproduce with clang? |
Let's make sure the windows notification group is aware of this: @rustbot ping windows |
Hey Windows Group! This bug has been identified as a good "Windows candidate". cc @arlosi @danielframpton @gdr-at-ms @kennykerr @luqmana @lzybkr @nico-abram @retep998 @rylev @sivadeilra |
Assigning P-high as discussed as part of the Prioritization Working Group procedure and removing I-prioritize. |
So to summarize:
|
One more thing:
#[repr(C, packed(1))]
struct X {
c: u8,
m: std::arch::x86_64::__m128,
}
#[no_mangle]
pub fn f() -> usize {
// Expected 32
// Actual 17
std::mem::size_of::<X>()
} |
That is, if we assume |
Yikes, I think a lot of folks do assume that, yeah. |
Basically all types that have a This particular problem could be fixed by adding such an annotation to __m128 on MSVC targets. The definition of __m128 is broken anyway because it has to be 16 byte aligned on MSVC targets but is defined as 4 byte aligned in the stdlib. So in the end this is not an inherent problem of the layout algorithm because you're not supposed to be able to write this anyway. |
C/C++ do not permit zero-sized structs or arrays. Aggregates (structures and classes) that have no members still have a non-zero size. MSVC, Clang, and GCC all have different extensions to control the behavior of zero-sized arrays. It is unfortunate that In the near term, perhaps the best thing is to add a new diagnostic, which is "you asked for |
GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers. (Except that Clang tries to emulate MSVC when compiling for
That seems to be the most pragmatic solution. Alternatively one could deprecate repr(C) completely and replace it by repr(NativeC) and repr(stable).
Maybe only warn on msvc targets. |
In C++ they have size 1:
Actually the nomicon says:
So it seems that this (unsound) behavior is even documented. |
repr(C) is about C compatibility not about C++ compatibility.
Otherwise empty structs in msvc have size at least 4 bytes not 1 byte in C mode. |
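To make the discrepancy concrete, here is a minimal sketch of what rustc computes for the two kinds of type discussed above. The names `Empty` and `ZeroLenArray` and the two helper functions are made up for illustration. The behavior shown is the one described in this thread (rustc gives both types size 0), and it says nothing about what MSVC computes for the equivalent C declarations.

```rust
use std::mem::{align_of, size_of};

/// A `repr(C)` struct with no fields (hypothetical name).
#[repr(C)]
pub struct Empty;

/// A `repr(C)` struct whose only field is a zero-length array
/// (hypothetical name).
#[repr(C)]
pub struct ZeroLenArray {
    pub data: [u32; 0],
}

/// Returns `(size, align)` as computed by rustc for `Empty`.
pub fn empty_layout() -> (usize, usize) {
    (size_of::<Empty>(), align_of::<Empty>())
}

/// Returns `(size, align)` as computed by rustc for `ZeroLenArray`.
/// The zero-length array contributes no bytes but still carries the
/// alignment of its element type.
pub fn zero_len_array_layout() -> (usize, usize) {
    (size_of::<ZeroLenArray>(), align_of::<ZeroLenArray>())
}
```

Both types are zero-sized in Rust, whereas the comments above report a non-zero size for empty structs under MSVC, which is exactly the mismatch this issue is about.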
These are non-standard extensions, and they deviate from the C/C++ specification. If I understand the desire for compatibility with the de-facto standard behavior of these compilers, from a practical point of view. At the same time, these are areas where they do deviate from the language standard. Figuring out the best solution for Rust will require some careful consideration, and the short-term solution of "make it work like Clang / MSVC / GCC" should not be chosen without due consideration. |
Fix UB from misalignment and provenance widening in `std::sys::windows` This fixes two types of UB: 1. Reading past the end of a reference in types like `&c::REPARSE_DATA_BUFFER` (see rust-lang/unsafe-code-guidelines#256). This is fixed by using `addr_of!`. I think there are probably a couple more cases where we do this for other structures, and will look into it in a bit. 2. Failing to ensure that a `[u8; N]` on the stack is sufficiently aligned to convert to a `REPARSE_DATA_BUFFER`. ~~This was done by introducing a new `AlignedAs` struct that allows aligning one type to the alignment of another type. I expect there are other places where we have this issue too, or I wouldn't introduce this type, but will get to them after this lands.~~ ~~Worth noting, it *is* implemented in a way that can cause problems depending on how we fix rust-lang#81996, but this would be caught by the test I added (and presumably if we decide to fix that in a way that would break this code, we'd also introduce a `#[repr(simple)]` or `#[repr(linear)]` as a replacement for this usage of `#[repr(C)]`).~~ Edit: None of that is still in the code, I just went with a `Align8` since that's all we'll need for almost everything we want to call. These are more or less "potential UB" since it's likely at the moment everything works fine, although the alignment not causing issues might just be down to luck (and x86 being forgiving). ~~NB: I've only ensured this check builds, but will run tests soon.~~ All tests pass, including stage2 compiler tests. r? `@ChrisDenton`
(prepare for more accidental closures of this issue when that commit lands on the beta and stable branches -- at least that's how things went in the past) |
… MSVC abi fix. Tried fixing marker type declarations to be zero sized after MSVC abi. This commit assumes that this issue will get resolved to "#[repr(C)] structs can't be zero-sized": rust-lang/rust#81996
This issue would probably benefit from being split up into one issue per problem following this summary as well as this comment. |
Looking at it, it actually seems to be largely two issues -- enums with too big discriminants, and then everything around size 0. I have opened #124403 for the enums, so this issue is now about the case of structs / unions with no fields / zero-sized fields. |
In general it probably makes sense to consider this together with #100743, in the wider question of -- how do we deal with the differences between MSVC's layout algorithm and the one everyone else uses. There seem to be two differences that surfaced so far: how to deal with structs/unions where all fields have size 0, and how to deal with explicitly aligned types occurring as (potentially nested) fields of packed types.
Option 1 seems unlikely as there's a lot of code we'd have to exclude to be sure that we are in the fragment where all our current targets agree. Option 1 is also extremely unsatisfying for the aligned-field-in-packed-struct case as it leaves a gap in what can be expressed with repr(C) types in Rust. Option 3 seems closer to the original goal of |
Even ISO C23 doesn't require implementations to support structs/unions without fields; the C grammar requires at least one field. So an ABI may rightly not define the ABI for a struct/union without fields, not even define that they are explicitly disallowed. This seems to be the case for the Windows ABIs and for ARM ABIs. Similarly, zero-sized arrays aren't allowed by C. So, I think that it makes sense to, by default, assume a target doesn't support those types with
Instead of making the definition of these a hard error, we could make non-FFI-safe. (I am suggesting that
If the ABI doesn't specify what to do (i.e. doesn't disallow them), we could compute the layout using both rules, and if the layouts match, then accept the type. If they differ, definitely don't. But I think the best thing to do is to ask the target maintainer to specify what to do in the target spec, make this documentation a requirement for adding a new target, and add the info to existing target definitions. |
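The "compute both layouts and accept only if they match" idea can be sketched with a toy layout calculator. Everything here is hypothetical: the `Layout` type, the function names, and in particular `layout_rule_b`, which is an assumed stand-in for a rule set that never produces size 0 and is not a description of MSVC's actual algorithm.

```rust
/// Size and alignment of a type, in bytes (toy model, not rustc's `Layout`).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Layout {
    pub size: usize,
    pub align: usize,
}

/// Rounds `n` up to the next multiple of `align` (`align` must be non-zero).
fn round_up(n: usize, align: usize) -> usize {
    (n + align - 1) / align * align
}

/// Rule A: lay out fields in declaration order at their natural alignment.
/// A struct whose fields are all zero-sized ends up with size 0.
pub fn layout_rule_a(fields: &[Layout]) -> Layout {
    let mut size = 0;
    let mut align = 1;
    for f in fields {
        size = round_up(size, f.align) + f.size;
        align = align.max(f.align);
    }
    Layout { size: round_up(size, align), align }
}

/// Rule B (ASSUMPTION, for illustration only): same as rule A, except that
/// a struct that would have size 0 gets a size equal to its alignment.
pub fn layout_rule_b(fields: &[Layout]) -> Layout {
    let l = layout_rule_a(fields);
    if l.size == 0 {
        Layout { size: l.align, ..l }
    } else {
        l
    }
}

/// Accepts the type only if both rule sets agree on its layout.
pub fn agreed_layout(fields: &[Layout]) -> Option<Layout> {
    let a = layout_rule_a(fields);
    let b = layout_rule_b(fields);
    if a == b { Some(a) } else { None }
}
```

Under these assumed rules, a struct of a `u8` followed by a `u32` is accepted with the same layout either way, while a field-less struct or one holding only a zero-length array is rejected because the two rule sets disagree.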
I assume the FFI safety lint will already trigger on these cases. However, that lint runs on generic code (pre-monomorphization) so it will necessarily miss some cases.
Most importantly though, the lint does not absolve us from specifying what our layout and ABI are for all repr(C) types, so it does not really help for this issue.
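A small sketch of why a pre-monomorphization check has to miss cases: whether a generic `repr(C)` newtype is zero-sized depends entirely on the type argument. `Wrapper` and `wrapper_size` are hypothetical names used only for this illustration.

```rust
use std::mem::size_of;

/// A generic `repr(C)` newtype (hypothetical name). Looking at this
/// definition alone, nothing tells us whether an instance is zero-sized.
#[repr(C)]
pub struct Wrapper<T>(pub T);

/// Returns the size rustc computes for `Wrapper<T>` once `T` is known.
pub fn wrapper_size<T>() -> usize {
    size_of::<Wrapper<T>>()
}
```

`wrapper_size::<u32>()` is 4, while `wrapper_size::<[u32; 0]>()` and `wrapper_size::<()>()` are both 0, so only a check that sees the concrete instantiation can tell the cases apart.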
On 28 March 2025 at 21:45:22 UTC, Brian Smith ***@***.***> wrote:
…briansmith left a comment (rust-lang/rust#81996)
Even ISO C23 doesn't require implementations to support structs/unions without fields; the C grammar requires at least one field. So an ABI may rightly not define the ABI for a struct/union without fields, not even define that they are explicitly disallowed. This seems to be the case for the Windows ABIs and for ARM ABIs.
Similarly, zero-sized arrays aren't allowed by C.
So, I think that it makes sense to, by default, assume a target doesn't support those types with `#[repr(C)]`.
> Make enough things hard errors so that what is allowed is consistent between GCC and MSVC. For aligned-field-in-packed, this is what we tried to do, but we #100743 (comment) at that. For zero-sized cases, this would require disallowing a bunch of things, like repr(C) unit structs and repr(C) generic newtypes wrapping an arbitrary T (since the newtype could be wrapping an empty array).
Instead of making the definition of these a hard error, we could make non-FFI-safe. (I am suggesting that `#[repr(C)]` ZSTs be considered to be not FFI-safe for targets that don't define their ABI, but I am not suggesting changing anything for non-`#[repr(C)]` ZST types.)
> Compute layout using GCC/clang rules, but emit a lint for cases where that differs from MSVC.
> Compute layout using the rules of the dominant C compiler for a target, i.e., use MSVC rules for MSVC targets.
If the ABI doesn't specify what to do (i.e. doesn't disallow them), we could compute the layout using both rules, and if the layouts match, then accept the type. If they differ, definitely don't. But I think the best thing to do is to ask the target maintainer to specify what to do in the target spec, make this documentation a requirement for adding a new target, and add the info to existing target definitions.
|
You don't have to use the existing lint to reject FFI-unsafe types. In fact, there are a lot of reasons, which you frequently mention, for not relying on the lint. Instead, a new, better check for FFI-unsafe types is needed. |
So IIUC, your proposed resolution to this issue is
What is not clear to me is whether you think a lint with false negatives is sufficient to resolve this issue or not. Personally I think a solution involving false negatives is not properly resolving this issue. So for cases where That said, in the OP we see MSVC generate code with empty arrays, don't we? So, there is a layout we could reasonably be expected to produce. It's just not the one we produce today. |
No, I would instead expect To clarify further my position, I think that
Note that the ABI might define a representation for zero-sized arrays but not field-less structs, or vice-versa. The idea that these are somehow similar constructs is a Rust-ism that doesn't naturally apply to C.
I don't distinguish between "lints" and other kinds of type checks, and I don't know enough to suggest how they should be implemented. I am just suggesting the checks that should exist, independently of how they are implemented |
Note that the examples above don't even involve any field-less structs, they involve 0-length arrays. And apparently those are allowed by MSVC, there's a link in the issue description. So I don't understand why you are focusing on fieldless structs. Other than that, warnings have been suggested above, so -- not sure what new point you are intending to bring to this conversation. |
Hi Ralf, I am used to your unnecessarily negative responses to my comments so that one isn't surprising. You might try searching this page for "array" in my message so you can see that I am not "focusing" on fieldless structs.
Likewise. |
I am genuinely puzzled by your comments, I am sorry if that comes across as negative. This is a huge thread, and lints were discussed multiple times before. I was trying to understand if you are suggesting something new or supporting some previously voiced position.
I was referring specifically to the 2nd bullet about windows -- sorry, I should have made that more clear. I'm sometimes replying in haste when wading through my backlog. Maybe I should stop engaging in some threads entirely so that I can spend proper time on the remaining ones and be more likely to be productive there. In that bullet, you mostly spoke about "fieldless structs"; there was one mention of "fieldless arrays" which looked like a typo since the immediately next sentence just spoke about "structs" again and "fieldless array" does not make sense (I think? arrays don't usually have "fields"). Now you edited this adding more mentions of "arrays", without leaving a note that there was an edit, which is disingenuous given that I already replied. In the edited version you are claiming the windows ABI does not have zero-sized arrays, but in the OP we have C code using zero-sized arrays with MSVC. I am still puzzled by your comments.
I mean, the Rust compiler does, so when proposing changes to the Rust compiler it'd be good to use terms that make sense in this context. Otherwise how should anyone understand what you are saying? But, I think I am getting the clue, you don't like the way I am probing your comments to try and understand them. I am not sure what to do about that. Asking a question does not mean I am saying you are wrong, or being negative. Don't try to read things between the lines, there's nothing there, what I want to express is just literally the question that I wrote. |
There are two ways of defining "the Win64 ABI:" what the Microsoft Learn documentation says, and what MSVC does. I believe @briansmith's position is that the prose documentation (which in the MSVC case, is never referred to as a specification) should be considered authoritative, and anything not defined in said documentation should not be considered defined for FFI. The more typically expected understanding is that MSVC provides a stable ABI for any C signature that it allows you to write, and that what the compiler does is the source of truth, not the documentation. |
Also see this summary below.
Consider
and the corresponding MSVC output: https://godbolt.org/z/csv4qc
The behavior of MSVC is described here as far as it is known to me: https://github.com/mahkoh/repr-c/blob/a04e931b67eed500aea672587492bd7335ea549d/repc/impl/src/builder/msvc.rs#L215-L236