Clang generates incorrect code for unions of types with padding #76017

ivafanas · 2023-12-20T06:46:52Z

Reproducer:

#include <cstdio>
#include <cstdint>

struct my_struct_1 {
  float a;
  float b;
};

struct my_struct_2 {
  float x;
  double y;
};

union my_union {
  my_struct_1 s1;
  my_struct_2 s2;
};

my_union my_func() {
  my_union u;
  u.s1 = my_struct_1{100.f, 200.f};
  return u;
}

int main() {
  my_union u = my_func();

  if (u.s1.a != 100.f)
    std::puts("a ooops");

  if (u.s1.b != 200.f)
    std::puts("b ooops");

  return 0;
}

Please, note that my_struct_1 has field b which fits into padding of my_struct_2 and sizeof(my_struct_2) > sizeof(my_struct_1).

Bug is reproduced on X86 on clang 16 release and on the latest main branch (clang++-17 is not tested, but I assume its behaviour is the same):

# clang 16 release
clang++-16 -O1 example.cpp && ./a.out
b ooops

# clang main 5caae72d1a4f
clang++ -O1 example.cpp && ./a.out
b ooops

The problem is that C++ unions are represented in IR level as a struct of member with the largets size:

%union.my_union = type { %struct.my_struct_2 }
%struct.my_struct_2 = type { float, double }
%struct.my_struct_1 = type { float, float }

Information about my_struct_1 layout is lost when union llvm::StructType is constructed. SROA pass operates on my_struct_2 layout only.

my_func IR before SROA:

*** IR Dump After SimplifyCFGPass on _Z7my_funcv ***
; Function Attrs: mustprogress nounwind uwtable
define dso_local { float, double } @_Z7my_funcv() #0 {
entry:
  %retval = alloca %union.my_union, align 8
  %ref.tmp = alloca %struct.my_struct_1, align 4
  call void @llvm.lifetime.start.p0(i64 8, ptr %ref.tmp) #5
  %a = getelementptr inbounds %struct.my_struct_1, ptr %ref.tmp, i32 0, i32 0
  store float 1.000000e+02, ptr %a, align 4, !tbaa !5
  %b = getelementptr inbounds %struct.my_struct_1, ptr %ref.tmp, i32 0, i32 1
  store float 2.000000e+02, ptr %b, align 4, !tbaa !10
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %retval, ptr align 4 %ref.tmp, i64 8, i1 false), !tbaa.struct !11
  call void @llvm.lifetime.end.p0(i64 8, ptr %ref.tmp) #5
  %coerce.dive = getelementptr inbounds %union.my_union, ptr %retval, i32 0, i32 0
  %0 = load { float, double }, ptr %coerce.dive, align 8
  ret { float, double } %0
}

my_func IR after SROA:

*** IR Dump After SROAPass on _Z7my_funcv ***
; Function Attrs: mustprogress nounwind uwtable
define dso_local { float, double } @_Z7my_funcv() #0 {
entry:
  %.fca.0.insert = insertvalue { float, double } poison, float 1.000000e+02, 0
  %.fca.1.insert = insertvalue { float, double } %.fca.0.insert, double undef, 1
  ret { float, double } %.fca.1.insert
}

As you can see, 200.f value is lost and undef value insertion happens here.
On the other side, IR before SROA also looks suspicious, because 32bits of double value are undef -> whole double is undef.

gcc works well on this example.

Possibly related discussion:
https://discourse.llvm.org/t/struct-copy/11330

Possibly related issue:
#53710

The text was updated successfully, but these errors were encountered:

ivafanas · 2023-12-21T14:48:55Z

Fix idea 1: lower C/C++ unions into byte arrays of size of the maximum union member.

Fix idea 2: lower C/C++ unions into intermediate llvm::StructType which is the union member of maximum size with padding being filled by extra members (byte arrays or i1 / i2 / i4 / i16 / i32 types)

I can try to implement fix if there are no objections.

?

nikic · 2023-12-21T19:54:13Z

I think part of the issue here is a SROA bug (#64081), but part of it is that the ABI is just crazy. Just comparing to what GCC does, it looks like the result is returned in xmm0 and xmm1. For s1 a pair of floats is returned in xmm0. For s2 a float is returned in xmm0 and a double in xmm1. I don't think we even have the capability of representing this ABI in LLVM IR. cc @efriedma-quic

efriedma-quic · 2023-12-21T21:10:40Z

If the clang ABI code makes the function's return type {double, double}, the rest of the code should do the right thing, I think. (The clang calling convention code is designed to allow casting this way, and SROA should preserve values stored in padding, barring bugs.) Maybe end up with some ugly bitcasting in some cases, but it should do the right thing functionally.

Briefly glancing at the code, it looks like the bug is in X86_64ABIInfo::GetSSETypeAtOffset.

nikic · 2023-12-21T21:28:41Z

Oh yeah, that makes sense. I didn't consider that we can just bitcast the float pair to a double.

llvmbot · 2023-12-22T08:54:06Z

@llvm/issue-subscribers-clang-codegen

Author: Afanasyev Ivan (ivafanas)

Reproducer:

#include &lt;cstdio&gt;
#include &lt;cstdint&gt;

struct my_struct_1 {
  float a;
  float b;
};

struct my_struct_2 {
  float x;
  double y;
};

union my_union {
  my_struct_1 s1;
  my_struct_2 s2;
};

my_union my_func() {
  my_union u;
  u.s1 = my_struct_1{100.f, 200.f};
  return u;
}

int main() {
  my_union u = my_func();

  if (u.s1.a != 100.f)
    std::puts("a ooops");

  if (u.s1.b != 200.f)
    std::puts("b ooops");

  return 0;
}

Please, note that my_struct_1 has field b which fits into padding of my_struct_2 and sizeof(my_struct_2) > sizeof(my_struct_1).

Bug is reproduced on X86 on clang 16 release and on the latest main branch (clang++-17 is not tested, but I assume its behaviour is the same):

# clang 16 release
clang++-16 -O1 example.cpp &amp;&amp; ./a.out
b ooops

# clang main 5caae72d1a4f
clang++ -O1 example.cpp &amp;&amp; ./a.out
b ooops

The problem is that C++ unions are represented in IR level as a struct of member with the largets size:

%union.my_union = type { %struct.my_struct_2 }
%struct.my_struct_2 = type { float, double }
%struct.my_struct_1 = type { float, float }

Information about my_struct_1 layout is lost when union llvm::StructType is constructed. SROA pass operates on my_struct_2 layout only.

my_func IR before SROA:

*** IR Dump After SimplifyCFGPass on _Z7my_funcv ***
; Function Attrs: mustprogress nounwind uwtable
define dso_local { float, double } @<!-- -->_Z7my_funcv() #<!-- -->0 {
entry:
  %retval = alloca %union.my_union, align 8
  %ref.tmp = alloca %struct.my_struct_1, align 4
  call void @<!-- -->llvm.lifetime.start.p0(i64 8, ptr %ref.tmp) #<!-- -->5
  %a = getelementptr inbounds %struct.my_struct_1, ptr %ref.tmp, i32 0, i32 0
  store float 1.000000e+02, ptr %a, align 4, !tbaa !5
  %b = getelementptr inbounds %struct.my_struct_1, ptr %ref.tmp, i32 0, i32 1
  store float 2.000000e+02, ptr %b, align 4, !tbaa !10
  call void @<!-- -->llvm.memcpy.p0.p0.i64(ptr align 8 %retval, ptr align 4 %ref.tmp, i64 8, i1 false), !tbaa.struct !11
  call void @<!-- -->llvm.lifetime.end.p0(i64 8, ptr %ref.tmp) #<!-- -->5
  %coerce.dive = getelementptr inbounds %union.my_union, ptr %retval, i32 0, i32 0
  %0 = load { float, double }, ptr %coerce.dive, align 8
  ret { float, double } %0
}

my_func IR after SROA:

*** IR Dump After SROAPass on _Z7my_funcv ***
; Function Attrs: mustprogress nounwind uwtable
define dso_local { float, double } @<!-- -->_Z7my_funcv() #<!-- -->0 {
entry:
  %.fca.0.insert = insertvalue { float, double } poison, float 1.000000e+02, 0
  %.fca.1.insert = insertvalue { float, double } %.fca.0.insert, double undef, 1
  ret { float, double } %.fca.1.insert
}

As you can see, 200.f value is lost and undef value insertion happens here.
On the other side, IR before SROA also looks suspicious, because 32bits of double value are undef -> whole double is undef.

gcc works well on this example.

Possibly related discussion:
https://discourse.llvm.org/t/struct-copy/11330

Possibly related issue:
#53710

github-actions bot added the clang Clang issues not falling into any other category label Dec 20, 2023

dtcxzyw added the miscompilation label Dec 20, 2023

EugeneZelenko added llvm:optimizations and removed clang Clang issues not falling into any other category labels Dec 20, 2023

nikic added clang:codegen IR generation bugs: mangling, exceptions, etc. and removed llvm:optimizations labels Dec 22, 2023

This was referenced Aug 30, 2024

CUDA: unions in kernel arguments not copied completely if member contains padding #53710

Closed

Clang should lower unions to LLVM as byte array, not structure type #107239

Open

efriedma-quic mentioned this issue Dec 20, 2024

[clang] Return larger CXX records in memory #120670

Merged

efriedma-quic mentioned this issue Apr 11, 2025

[Clang][CodeGen][X86] don't coerce int128 into {i64,i64} for SysV-like ABIs #135230

Open

efriedma-quic mentioned this issue May 6, 2025

[AArch64] Change the coercion type of structs with pointer members. #135064

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Clang generates incorrect code for unions of types with padding #76017

Clang generates incorrect code for unions of types with padding #76017

ivafanas commented Dec 20, 2023

ivafanas commented Dec 21, 2023

Uh oh!

nikic commented Dec 21, 2023

Uh oh!

efriedma-quic commented Dec 21, 2023

Uh oh!

nikic commented Dec 21, 2023

Uh oh!

llvmbot commented Dec 22, 2023

Uh oh!

Clang generates incorrect code for unions of types with padding #76017

Clang generates incorrect code for unions of types with padding #76017

Comments

ivafanas commented Dec 20, 2023

ivafanas commented Dec 21, 2023

Uh oh!

nikic commented Dec 21, 2023

Uh oh!

efriedma-quic commented Dec 21, 2023

Uh oh!

nikic commented Dec 21, 2023

Uh oh!

llvmbot commented Dec 22, 2023

Uh oh!