repr(C) on MSVC targets does not always match MSVC type layout when ZST are involved #81996

Open
mahkoh opened this issue Feb 11, 2021 · 129 comments · Fixed by #101171
Labels
A-FFI Area: Foreign function interface (FFI) A-repr Area: the `#[repr(stuff)]` attribute C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-windows Operating system: Windows O-windows-msvc Toolchain: MSVC, Operating system: Windows P-medium Medium priority T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@mahkoh
Contributor

mahkoh commented Feb 11, 2021

Also see this summary below.

Consider

#![allow(dead_code)]

use std::mem;

#[no_mangle]
pub fn sizeof_empty_struct_1() -> usize {
    #[repr(C)]
    struct EmptyS1 {
        f: [i64; 0],
    }

    // Expected: 4
    // Actual: 0
    mem::size_of::<EmptyS1>()
}

#[no_mangle]
pub fn sizeof_empty_struct_2() -> usize {
    #[repr(C, align(8))]
    struct X {
        i: i32,
    }

    #[repr(C)]
    struct EmptyS2 {
        x: [X; 0],
    }

    // Expected: 8
    // Actual: 0
    mem::size_of::<EmptyS2>()
}

#[no_mangle]
pub fn sizeof_enum() -> usize {
    #[repr(C)]
    enum E {
        A = 1111111111111111111
    }

    // Expected: 4
    // Actual: 8
    mem::size_of::<E>()
}

#[no_mangle]
pub fn sizeof_empty_union_1() -> usize {
    #[repr(C)]
    union EmptyU1 {
        f: [i8; 0],
    }

    // Expected: 1
    // Actual: 0
    mem::size_of::<EmptyU1>()
}

#[no_mangle]
pub fn sizeof_empty_union_2() -> usize {
    #[repr(C)]
    union EmptyU2 {
        f: [i64; 0],
    }

    // Expected: 8
    // Actual: 0
    mem::size_of::<EmptyU2>()
}

and the corresponding MSVC output: https://godbolt.org/z/csv4qc

The behavior of MSVC is described here as far as it is known to me: https://github.com/mahkoh/repr-c/blob/a04e931b67eed500aea672587492bd7335ea549d/repc/impl/src/builder/msvc.rs#L215-L236

@mahkoh mahkoh added the C-bug Category: This is a bug. label Feb 11, 2021
@jyn514
Member

jyn514 commented Feb 11, 2021

@mahkoh can you reproduce this without no_mangle? no_mangle should really require unsafe, it can cause unsoundness if it overlaps with another linker symbol.

@mahkoh
Contributor Author

mahkoh commented Feb 11, 2021

@jyn514 Yes

@mahkoh
Contributor Author

mahkoh commented Feb 11, 2021

repr(C) has served two purposes:

  • A representation with a known layout algorithm which is the same across all targets
  • A representation that is compatible with C types

The first purpose is advertised both in the reference and in stdlib code e.g. in Layout. It is probably used in many other places.

The second purpose is also advertised in the reference.

However, these purposes are not compatible as shown above.
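To make the first purpose concrete, here is a minimal sketch of the target-independent algorithm as the `std::alloc::Layout` documentation describes it: fields are placed in declaration order, each padded to its own alignment, and the total is rounded up to the overall alignment. The `Sample` type and the `sequential_layout` helper are hypothetical names used only for illustration. Note that this algorithm yields size 0 for a struct whose only field is `[i64; 0]`, which is the value rustc reports above.

```rust
use std::alloc::{Layout, LayoutError};
use std::mem;

// Hypothetical example type, not taken from the issue.
#[allow(dead_code)]
#[repr(C)]
struct Sample {
    a: u8,
    b: u32,
    c: u16,
}

/// Place fields in declaration order, padding each one to its alignment,
/// then round the total size up to the overall alignment.
fn sequential_layout(fields: &[Layout]) -> Result<(Layout, Vec<usize>), LayoutError> {
    let mut layout = Layout::from_size_align(0, 1)?;
    let mut offsets = Vec::with_capacity(fields.len());
    for &field in fields {
        let (extended, offset) = layout.extend(field)?;
        layout = extended;
        offsets.push(offset);
    }
    Ok((layout.pad_to_align(), offsets))
}

fn main() {
    let fields = [
        Layout::new::<u8>(),
        Layout::new::<u32>(),
        Layout::new::<u16>(),
    ];
    let (layout, offsets) = sequential_layout(&fields).expect("layout overflow");

    // The hand-computed layout agrees with what rustc picks for repr(C).
    assert_eq!(layout.size(), mem::size_of::<Sample>());
    assert_eq!(layout.align(), mem::align_of::<Sample>());
    assert_eq!(
        offsets,
        [
            mem::offset_of!(Sample, a),
            mem::offset_of!(Sample, b),
            mem::offset_of!(Sample, c),
        ]
    );

    // A single zero-length array field gives size 0 under this algorithm.
    let (zst, _) = sequential_layout(&[Layout::new::<[i64; 0]>()]).expect("layout overflow");
    assert_eq!(zst.size(), 0);
    println!("size = {}, align = {}", layout.size(), layout.align());
}
```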

@mahkoh mahkoh changed the title repr(C) unsound on MSVC targets repr(C) is unsound on MSVC targets Feb 11, 2021
@jonas-schievink
Contributor

The layout algorithm is not the same across all targets, it is always supposed to be whatever the C ABI mandates on that particular target

@jonas-schievink
Contributor

Does this reproduce with clang?

@jonas-schievink jonas-schievink added A-FFI Area: Foreign function interface (FFI) I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness O-windows-msvc Toolchain: MSVC, Operating system: Windows labels Feb 11, 2021
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 11, 2021
@mahkoh
Contributor Author

mahkoh commented Feb 11, 2021

The layout algorithm is not the same across all targets, it is always supposed to be whatever the C ABI mandates on that particular target

The layout algorithms used by the C compilers are not the same. But repr(C) is advertised with a specific layout algorithm that is the same across all targets. Namely in these places

Does this reproduce with clang?

Clang contains many bugs in its MSVC-compatible layout algorithm. It should not be used as a reference:

The output of clang is incompatible with both MSVC and rustc: https://github.com/llvm/llvm-project/blob/661f9e2a92302b1c7140528423fdbfc133a68b41/clang/lib/AST/RecordLayoutBuilder.cpp#L3076-L3087

@rylev
Member

rylev commented Feb 12, 2021

Let's make sure the windows notification group is aware of this:

@rustbot ping windows

@rustbot
Collaborator

rustbot commented Feb 12, 2021

Hey Windows Group! This bug has been identified as a good "Windows candidate".
In case it's useful, here are some instructions for tackling these sorts of
bugs. Maybe take a look?
Thanks! <3

cc @arlosi @danielframpton @gdr-at-ms @kennykerr @luqmana @lzybkr @nico-abram @retep998 @rylev @sivadeilra

@rustbot rustbot added the O-windows Operating system: Windows label Feb 12, 2021
@rylev
Member

rylev commented Feb 12, 2021

Assigning P-high as discussed as part of the Prioritization Working Group procedure and removing I-prioritize.

@rylev rylev added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Feb 12, 2021
@Lokathor
Contributor

Lokathor commented Feb 12, 2021

So to summarize:

  • repr(C) is inaccurate in the presence of Zero-sized Types.
  • A repr(C) enum with a tag value in excess of the normal tag maximum will (inaccurately) use a larger tag rather than error.
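As a rough illustration of the second bullet, the sketch below compares a `repr(C)` enum whose discriminants fit in a C `int` with an enum whose tag width is spelled out explicitly. The type names are hypothetical, and the first assertion assumes a typical desktop target where C enums default to `int` (some embedded targets use shorter enums).

```rust
use std::ffi::c_int;
use std::mem;

// Discriminants fit in a C `int`, so the tag is expected to be int-sized.
#[allow(dead_code)]
#[repr(C)]
enum Small {
    A = 1,
    B = 2,
}

// Spelling out the tag type avoids relying on how repr(C) handles
// discriminants that do not fit in an `int`.
#[allow(dead_code)]
#[repr(i64)]
enum Wide {
    A = 1111111111111111111,
}

fn main() {
    // Assumption: the target's C ABI uses `int` for ordinary enums.
    assert_eq!(mem::size_of::<Small>(), mem::size_of::<c_int>());
    // An explicit `repr(i64)` always stores an 8-byte tag.
    assert_eq!(mem::size_of::<Wide>(), 8);
    println!(
        "Small = {} bytes, Wide = {} bytes",
        mem::size_of::<Small>(),
        mem::size_of::<Wide>()
    );
}
```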

@mahkoh
Contributor Author

mahkoh commented Feb 12, 2021

One more thing

#[repr(C, packed(1))]
struct X {
    c: u8,
    m: std::arch::x86_64::__m128,
}

#[no_mangle]
pub fn f() -> usize {
    // Expected 32
    // Actual 17
    std::mem::size_of::<X>()
}

@mahkoh
Contributor Author

mahkoh commented Feb 12, 2021

That is, if we assume repr(packed(1)) to have the same intended effect as #pragma pack(1).
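For comparison, here is a small sketch of what `repr(packed(1))` does in Rust when no field carries a required alignment. It substitutes a plain `u32` for `__m128` so that it stays portable; the type names are hypothetical. For such plain fields the result (alignment 1, no padding) is what one would also expect from `#pragma pack(1)`, and the discrepancy reported above only appears once a field has an MSVC required alignment.

```rust
use std::mem;

// Natural repr(C) layout: `m` is padded up to its own alignment.
#[allow(dead_code)]
#[repr(C)]
struct Natural {
    c: u8,
    m: u32,
}

// packed(1) lowers every field's alignment to 1, so no padding remains.
#[allow(dead_code)]
#[repr(C, packed(1))]
struct Packed {
    c: u8,
    m: u32,
}

fn main() {
    assert_eq!(mem::align_of::<Packed>(), 1);
    assert_eq!(mem::size_of::<Packed>(), 5);
    assert_eq!(mem::offset_of!(Packed, m), 1);
    // The unpacked version is never smaller than the packed one.
    assert!(mem::size_of::<Natural>() >= mem::size_of::<Packed>());
    println!(
        "natural = {} bytes, packed = {} bytes",
        mem::size_of::<Natural>(),
        mem::size_of::<Packed>()
    );
}
```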

@Lokathor
Contributor

Yikes, I think a lot of folks do assume that, yeah.

@mahkoh
Contributor Author

mahkoh commented Feb 12, 2021

Basically all types that have a __declspec(align) annotation (such as __m128) in the MSVC stdlib but no such annotation in Rust are broken because MSVC implements the concept of required alignments which are unaffected by #pragma pack annotations.

This particular problem could be fixed by adding such an annotation to __m128 on MSVC targets. The definition of __m128 is broken anyway because it has to be 16 byte aligned on MSVC targets but is defined as 4 byte aligned in the stdlib.

So in the end this is not an inherent problem of the layout algorithm because you're not supposed to be able to write this anyway.

@sivadeilra

C/C++ do not permit zero-sized structs or arrays. Aggregates (structures and classes) that have no members still have a non-zero size. MSVC, Clang, and GCC all have different extensions to control the behavior of zero-sized arrays.

It is unfortunate that #[repr(C)] means two things: C ABI compatible, and sequential layout. Maybe a new #[repr(stable)] could be added, which would request sequential layout but would not require interop with standard C ABI.

In the near term, perhaps the best thing is to add a new diagnostic, which is "you asked for #[repr(C)], but you have ZSTs in here, and that might be a problem."
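Until such a diagnostic exists, user code can approximate it by hand. The following is only a sketch of that idea, not a compiler feature: `is_zst` and the example types are hypothetical names, and the check is a plain `size_of` comparison evaluated at compile time.

```rust
use std::marker::PhantomData;
use std::mem;

// Hypothetical repr(C) types that end up zero-sized in Rust today.
#[repr(C)]
struct Unit;

#[allow(dead_code)]
#[repr(C)]
struct EmptyArray {
    f: [i64; 0],
}

#[allow(dead_code)]
#[repr(C)]
struct Marker {
    _p: PhantomData<u64>,
}

// A header followed by a zero-length array is not itself a ZST.
#[allow(dead_code)]
#[repr(C)]
struct Header {
    len: u32,
    data: [u8; 0],
}

/// True if `T` occupies no bytes according to rustc.
const fn is_zst<T>() -> bool {
    mem::size_of::<T>() == 0
}

// Compile-time guard: the build fails if `Header` ever becomes zero-sized.
const _: () = assert!(!is_zst::<Header>(), "repr(C) type crossing FFI must not be a ZST");

fn main() {
    assert!(is_zst::<Unit>());
    assert!(is_zst::<EmptyArray>());
    assert!(is_zst::<Marker>());
    assert!(!is_zst::<Header>());
    // Size is zero, but the alignment of the array element is still inherited.
    assert_eq!(mem::align_of::<EmptyArray>(), mem::align_of::<i64>());
    println!("Header is {} bytes", mem::size_of::<Header>());
}
```

A guard like this only covers concrete types that the author remembers to list, which is one reason a built-in diagnostic would be preferable.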

@mahkoh
Contributor Author

mahkoh commented Feb 12, 2021

C/C++ do not permit zero-sized structs or arrays. Aggregates (structures and classes) that have no members still have a non-zero size.

GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers. (Except that Clang tries to emulate MSVC when compiling for *-msvc targets.) The current implementation of repr(C) seems to be correct in all cases accepted by rustc except on msvc targets.

Maybe a new #[repr(stable)] could be added, which would request sequential layout but would not require interop with standard C ABI.

That seems to be the most pragmatic solution. Alternatively one could deprecate repr(C) completely and replace it by repr(NativeC) and repr(stable).

In the near term, perhaps the best thing is to add a new diagnostic, which is "you asked for #[repr(C)], but you have ZSTs in here, and that might be a problem."

Maybe only warn on msvc targets.

@ghost

ghost commented Feb 12, 2021

GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers.

In C++ they have size 1:

$ clang -std=c++20 -xc++ - -o /dev/null -c <<<'extern "C" { struct EmptyStruct {}; union EmptyUnion {}; }'
<stdin>:1:14: warning: empty struct has size 0 in C, size 1 in C++ [-Wextern-c-compat]
extern "C" { struct EmptyStruct {}; union EmptyUnion {}; }
             ^
<stdin>:1:37: warning: empty union has size 0 in C, size 1 in C++ [-Wextern-c-compat]
extern "C" { struct EmptyStruct {}; union EmptyUnion {}; }
                                    ^
2 warnings generated.

Actually the nomicon says:

ZSTs are still zero-sized, even though this is not a standard behavior in C, and is explicitly contrary to the behavior of an empty type in C++, which says they should still consume a byte of space.

So it seems that this (unsound) behavior is even documented.

@mahkoh
Contributor Author

mahkoh commented Feb 12, 2021

In C++ they have size 1:

repr(C) is about C compatibility not about C++ compatibility.

So it seems that this (unsound) behavior is even documented.

Otherwise empty structs in msvc have size at least 4 bytes not 1 byte in C mode.

@sivadeilra

GCC and Clang do in fact accept even completely empty structs and unions and such types have size 0 when compiled with these compilers.

These are non-standard extensions, and they deviate from the C/C++ specification.

If #[repr(C)] means "has a representation that is equivalent to that generated by a conformant C compiler", then Rust's current behavior is fine. If #[repr(C)] means #[repr(C_with_msvc_and_clang_extensions)], then that is something different.

I understand the desire for compatibility with the de-facto standard behavior of these compilers, from a practical point of view. At the same time, these are areas where they do deviate from the language standard. Figuring out the best solution for Rust will require some careful consideration, and the short-term solution of "make it work like Clang / MSVC / GCC" should not be chosen without due consideration.

@mahkoh mahkoh closed this as completed Feb 13, 2021
@mahkoh mahkoh reopened this Feb 13, 2021
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Aug 31, 2022
Fix UB from misalignment and provenance widening in `std::sys::windows`

This fixes two types of UB:

1. Reading past the end of a reference in types like `&c::REPARSE_DATA_BUFFER` (see rust-lang/unsafe-code-guidelines#256). This is fixed by using `addr_of!`. I think there are probably a couple more cases where we do this for other structures, and will look into it in a bit.

2. Failing to ensure that a `[u8; N]` on the stack is sufficiently aligned to convert to a `REPARSE_DATA_BUFFER`. ~~This was done by introducing a new `AlignedAs` struct that allows aligning one type to the alignment of another type. I expect there are other places where we have this issue too, or I wouldn't introduce this type, but will get to them after this lands.~~

    ~~Worth noting, it *is* implemented in a way that can cause problems depending on how we fix rust-lang#81996, but this would be caught by the test I added (and presumably if we decide to fix that in a way that would break this code, we'd also introduce a `#[repr(simple)]` or `#[repr(linear)]` as a replacement for this usage of `#[repr(C)]`).~~

    Edit: None of that is still in the code, I just went with a `Align8` since that's all we'll need for almost everything we want to call.

These are more or less "potential UB" since it's likely at the moment everything works fine, although the alignment not causing issues might just be down to luck (and x86 being forgiving).

~~NB: I've only ensured this check builds, but will run tests soon.~~ All tests pass, including stage2 compiler tests.

r? `@ChrisDenton`
@bors bors closed this as completed in 0ed046f Aug 31, 2022
@ChrisDenton

This comment was marked as resolved.

@ChrisDenton ChrisDenton reopened this Aug 31, 2022
@RalfJung
Member

(prepare for more accidental closures of this issue when that commit lands on the beta and stable branches -- at least that's how things went in the past)

rodrimati1992 added a commit to rodrimati1992/abi_stable_crates that referenced this issue Nov 22, 2022
… MSVC abi fix.

Tried fixing marker type declarations to be zero sized after MSVC abi.

This commit assumes that this issue will get resolved to "#[repr(C)] structs can't be zero-sized":
rust-lang/rust#81996
@RalfJung
Member

This issue would probably benefit from being split up into one issue per problem following this summary as well as this comment.

@RalfJung
Member

Looking at it, it actually seems to be largely two issues -- enums with too big discriminants, and then everything around size 0. I have opened #124403 for the enums, so this issue is now about the case of structs / unions with no fields / zero-sized fields.

@RalfJung
Member

RalfJung commented Apr 26, 2024

In general it probably makes sense to consider this together with #100743, in the wider question of -- how do we deal with the differences between MSVC's layout algorithm and the one everyone else uses. There seem to be two differences that surfaced so far: how to deal with structs/unions where all fields have size 0, and how to deal with explicitly aligned types occurring as (potentially nested) fields of packed types.

  1. Make enough things hard errors so that what is allowed is consistent between GCC and MSVC. For aligned-field-in-packed, this is what we tried to do, but we failed at that. For zero-sized cases, this would require disallowing a bunch of things, like repr(C) unit structs and repr(C) generic newtypes wrapping an arbitrary T (since the newtype could be wrapping an empty array). We could also do these checks during monomorphization, then we can allow newtypes, but monomorphization-time checks are generally not great.
  2. Compute layout using GCC/clang rules, but emit a lint for cases where that differs from MSVC. This would have to be either best-effort or monomorphization-time though, due to the problems mentioned above with generics. (We have precedent for a monomorphization-time lint: the lint that detects "big copies".)
  3. Compute layout using the rules of the dominant C compiler for a target, i.e., use MSVC rules for MSVC targets. This could still be combined with a lint for cases where layout differs from what it would be on GCC.

Option 1 seems unlikely as there's a lot of code we'd have to exclude to be sure that we are in the fragment where all our current targets agree. Option 1 is also extremely unsatisfying for the aligned-field-in-packed-struct case as it leaves a gap in what can be expressed with repr(C) types in Rust.

Option 3 seems closer to the original goal of repr(C) to me, in particular given that repr(crabi) is in the works for defining a deterministic cross-platform layout algorithm. But option 3 would be a breaking change for certain types which could make it impossible.
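The generics problem behind options 1 and 2 can be sketched in user code. In the example below (hypothetical names throughout), whether `Wrapper<T>` is zero-sized is only known once `T` is chosen, so the assertion lives in an associated constant that is evaluated per instantiation. This is an illustration of a monomorphization-time check, not of how the compiler would implement one.

```rust
use std::mem;

// A generic repr(C) newtype: its size depends entirely on `T`.
#[repr(C)]
struct Wrapper<T>(T);

impl<T> Wrapper<T> {
    // Evaluated separately for every concrete `T` that is actually used.
    const NOT_ZERO_SIZED: () = assert!(
        mem::size_of::<Self>() != 0,
        "Wrapper<T> must not be zero-sized"
    );

    fn new(value: T) -> Self {
        // Referencing the constant forces its evaluation for this `T`.
        let () = Self::NOT_ZERO_SIZED;
        Wrapper(value)
    }
}

fn main() {
    let w = Wrapper::new(7u32);
    assert_eq!(mem::size_of_val(&w), 4);
    assert_eq!(w.0, 7);

    // Uncommenting the next line makes the build fail, but only because
    // this particular instantiation is code-generated:
    // let _ = Wrapper::new([0u8; 0]);
}
```

The error surfaces only where `Wrapper::new` is instantiated with an offending type, which matches the concern above that monomorphization-time checks are late and easy to miss.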

@briansmith
Contributor

Even ISO C23 doesn't require implementations to support structs/unions without fields; the C grammar requires at least one field. So an ABI may rightly not define the ABI for a struct/union without fields, not even define that they are explicitly disallowed. This seems to be the case for the Windows ABIs and for ARM ABIs.

Similarly, zero-sized arrays aren't allowed by C.

So, I think that it makes sense to, by default, assume a target doesn't support those types with #[repr(C)].

Make enough things hard errors so that what is allowed is consistent between GCC and MSVC. For aligned-field-in-packed, this is what we tried to do, but we failed at that (#100743 (comment)). For zero-sized cases, this would require disallowing a bunch of things, like repr(C) unit structs and repr(C) generic newtypes wrapping an arbitrary T (since the newtype could be wrapping an empty array).

Instead of making the definition of these a hard error, we could make them non-FFI-safe. (I am suggesting that #[repr(C)] ZSTs be considered not FFI-safe for targets that don't define their ABI, but I am not suggesting changing anything for non-#[repr(C)] ZST types.)

Compute layout using GCC/clang rules, but emit a lint for cases where that differs from MSVC.
Compute layout using the rules of the dominant C compiler for a target, i.e., use MSVC rules for MSVC targets.

If the ABI doesn't specify what to do (i.e. doesn't disallow them), we could compute the layout using both rules, and if the layouts match, then accept the type. If they differ, definitely don't. But I think the best thing to do is to ask the target maintainer to specify what to do in the target spec, make this documentation a requirement for adding a new target, and add the info to existing target definitions.

@RalfJung
Member

RalfJung commented Mar 29, 2025 via email

@briansmith
Contributor

I assume the FFI safety lint will already trigger on these cases. However, that lint runs on generic code (pre-monomorphization) so it will necessarily miss some cases. Most importantly though, the lint does not absolve us from specifying what our layout and ABI are for all repr(C) types, so it does not really help for this issue.

You don't have to use the existing lint to reject FFI-unsafe types. In fact, there are a lot of reasons, which you frequently mention, for not relying on the lint. Instead, a new, better check for FFI-unsafe types is needed.

@RalfJung
Member

So IIUC, your proposed resolution to this issue is

  • repr(C) actually means "linear layout using the following specific layout algorithm".
  • We have a lint to detect where that does not match the native C layout on the system (or cannot even be expressed natively).

What is not clear to me is whether you think a lint with false negatives is sufficient to resolve this issue or not.

Personally I think a solution involving false negatives is not properly resolving this issue. So for cases where repr(C) is used on a type that does not exist in the native C toolchain I think we should reliably inform the user -- we can debate whether this should be a hard error or a deny-by-default lint.


That said, in the OP we see MSVC generate code with empty arrays, don't we? So, there is a layout we could reasonably be expected to produce. It's just not the one we produce today.

@briansmith
Contributor

briansmith commented Mar 31, 2025

repr(C) actually means "linear layout using the following specific layout algorithm".

No, I would instead expect repr(C) to produce a result that matches the ABI spec, which usually is defined in reference to C.

To further clarify my position, I think that

  • The default assumption should be that #[repr(C)] cannot be used on field-less structs or empty array types, since ISO C doesn't support them.
  • If the ABI does specify a representation for a field-less struct or zero-sized array (Windows ABIs don't, and most ABIs I have seen don't) then go ahead and support it in a way that matches the ABI. But in most cases a field-less C/C++-compatible struct will not be a ZST, so you might warn if something looks like a ZST but repr(C) makes it not zero-sized ("warning, this field-less struct isn't zero-sized").
  • If we already support something and that doesn't match the ABI, or that support makes assumptions beyond what the ABI says, and/or C compilers disagree for a target, then we should probably at least warn the programmer about the mismatch ("warning: this is not compatible with the ABI specification" or "warning: this is not compatible with clang" or whatever).

Note that the ABI might define a representation for zero-sized arrays but not field-less structs, or vice-versa. The idea that these are somehow similar constructs is a Rust-ism that doesn't naturally apply to C.

We have a lint to detect where that does not match the native C layout on the system (or cannot even be expressed natively).

I don't distinguish between "lints" and other kinds of type checks, and I don't know enough to suggest how they should be implemented. I am just suggesting the checks that should exist, independently of how they are implemented.

@RalfJung
Member

RalfJung commented Apr 1, 2025

Note that the examples above don't even involve any field-less structs, they involve 0-length arrays. And apparently those are allowed by MSVC, there's a link in the issue description. So I don't understand why you are focusing on fieldless structs.

Other than that, warnings have been suggested above, so -- not sure what new point you are intending to bring to this conversation.

@briansmith
Contributor

Hi Ralf, I am used to your unnecessarily negative responses to my comments so that one isn't surprising. You might try searching this page for "array" in my message so you can see that I am not "focusing" on fieldless structs.

Other than that, warnings have been suggested above, so -- not sure what new point you are intending to bring to this conversation.

Likewise.

@RalfJung
Member

RalfJung commented Apr 1, 2025

I am genuinely puzzled by your comments, I am sorry if that comes across as negative. This is a huge thread, and lints were discussed multiple times before. I was trying to understand if you are suggesting something new or supporting some previously voiced position.

You might try searching this page for "array" in my message so you can see that I am not "focusing" on fieldless structs.

I was referring specifically to the 2nd bullet about windows -- sorry, I should have made that more clear. I'm sometimes replying in haste when wading through my backlog. Maybe I should stop engaging in some threads entirely so that I can spend proper time on the remaining ones and be more likely to be productive there.

In that bullet, you mostly spoke about "fieldless structs"; there was one mention of "fieldless arrays" which looked like a typo since the immediately next sentence just spoke about "structs" again and "fieldless array" does not make sense (I think? arrays don't usually have "fields").

Now you edited this adding more mentions of "arrays", without leaving a note that there was an edit, which is disingenuous given that I already replied.

In the edited version you are claiming the windows ABI does not have zero-sized arrays, but in the OP we have C code using zero-sized arrays with MSVC. I am still puzzled by your comments.

I don't distinguish between "lints" and other kinds of type checks

I mean, the Rust compiler does, so when proposing changes to the Rust compiler it'd be good to use terms that make sense in this context. Otherwise how should anyone understand what you are saying?
Also, my key question for you was in the next line which you did not reply to at all:
"What is not clear to me is whether you think a lint with false negatives is sufficient to resolve this issue or not."
That's entirely unrelated to whether it's called a "lint" or not, the key is you did not make it clear whether the check should in your opinion be best-effort or must be fully reliable to realize your vision for how this should be handled (where the latter requires a post-monomorphization check which has a long list of downsides).

But, I think I am getting the clue, you don't like the way I am probing your comments to try and understand them. I am not sure what to do about that. Asking a question does not mean I am saying you are wrong, or being negative. Don't try to read things between the lines, there's nothing there, what I want to express is just literally the question that I wrote.

@CAD97
Contributor

CAD97 commented Apr 1, 2025

There are two ways of defining "the Win64 ABI:" what the Microsoft Learn documentation says, and what MSVC does.

I believe @briansmith's position is that the prose documentation (which in the MSVC case, is never referred to as a specification) should be considered authoritative, and anything not defined in said documentation should not be considered defined for FFI.

The more typically expected understanding is that MSVC provides a stable ABI for any C signature that it allows you to write, and that what the compiler does is the source of truth, not the documentation.
