Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bootstrapping from bootstrap.c, issues compiling zig2.c with tcc and slimcc #22114

Closed
fuhsnn opened this issue Dec 1, 2024 · 33 comments · Fixed by #22130
Closed

Bootstrapping from bootstrap.c, issues compiling zig2.c with tcc and slimcc #22114

fuhsnn opened this issue Dec 1, 2024 · 33 comments · Fixed by #22130
Labels
bug Observed behavior contradicts documented or intended behavior
Milestone

Comments

@fuhsnn
Copy link

fuhsnn commented Dec 1, 2024

These are encountered at the $CC -o zig2 zig2.c compiler_rt.c -std=c99 ... step, with shell command ulimit -s unlimited to workaround stack-overflow mentioned at issue #22111

Here in stage1/zig.h, cpuid.h is included on any x86-64 compiler that is not MSVC. This header may not be present if the C compiler implemented their own preprocessor, tcc, slimcc as well as pcc, cparser, currently don't provide it.

zig/stage1/zig.h

Lines 11 to 17 in aa7d138

#if _MSC_VER
#include <intrin.h>
#elif defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

With #include <cpuid.h> commented out (to see what lies ahead), tcc and slimcc failed at, respectively:

The zig_import macro use the non-standard __asm convention for function aliasing on any compiler not MSVC. slimcc stopped here, as its maintainer I will implement it ASAP.

zig/stage1/zig.h

Lines 224 to 234 in aa7d138

#if _MSC_VER
#define zig_import(Type, fn_name, libc_name, sig_args, call_args) zig_extern Type fn_name sig_args;\
__pragma(comment(linker, "/alternatename:" zig_mangle_c(#fn_name) "=" zig_mangle_c(#libc_name)));
#define zig_import_builtin(Type, fn_name, libc_name, sig_args, call_args) zig_import(Type, fn_name, sig_args, call_args)
#else /* _MSC_VER */
#define zig_import(Type, fn_name, libc_name, sig_args, call_args) zig_extern Type fn_name sig_args __asm(zig_mangle_c(#libc_name));
#define zig_import_builtin(Type, fn_name, libc_name, sig_args, call_args) zig_extern Type libc_name sig_args; \
static inline Type fn_name sig_args { return libc_name call_args; }
#endif

tcc stopped at this point,

zig/stage1/zig.h

Lines 1250 to 1252 in aa7d138

typedef struct { zig_align(16) uint64_t hi; uint64_t lo; } zig_u128;
typedef struct { zig_align(16) int64_t hi; uint64_t lo; } zig_i128;
#endif

since it does not advertise __has_attribute((aligned)) and the build driver passes -std=c99, zig_align expands to zig_align_unavailable as the result of these two macro, failing the compilation.

zig/stage1/zig.h

Lines 132 to 146 in aa7d138

#if zig_has_attribute(aligned)
#define zig_under_align(alignment) __attribute__((aligned(alignment)))
#elif _MSC_VER
#define zig_under_align(alignment) __declspec(align(alignment))
#else
#define zig_under_align zig_align_unavailable
#endif
#if __STDC_VERSION__ >= 201112L
#define zig_align(alignment) _Alignas(alignment)
#else
#define zig_align(alignment) zig_under_align(alignment)
#endif

@alexrp alexrp added the bug Observed behavior contradicts documented or intended behavior label Dec 1, 2024
@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

since it does not advertise __has_attribute((aligned)) and the build driver passes -std=c99, zig_align expands to zig_align_unavailable as the result of these two macro, failing the compilation.

Does it work when passing -std=gnu99?

@alexrp alexrp self-assigned this Dec 1, 2024
@alexrp alexrp added this to the unplanned milestone Dec 1, 2024
@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

-std=gnu99 is the same, -std=c11 helps
https://godbolt.org/z/joGY53an5

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

It seems odd that TCC wouldn't have any way of specifying alignment pre-C11. __attribute__((aligned(...))) (and its MSVC equivalent) is one of those things you see in a ton of real-world C projects. I also doubt glibc headers even work without it...?

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

With -std=c11 tcc stops at:
zig2.c:8800: error: identifier expected
which is
typedef zig_under_align(1) struct zig_Client_Message_Header__23850 anon__aligned_3467;
in term expands to:
typedef zig_align_unavailable(1) struct zig_Client_Message_Header__23850 anon__aligned_3467;
https://godbolt.org/z/5vsnxrvYf

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

I also doubt glibc headers even work without it...?

glibc does the #define __attribute__(x) /* empty */ trick in its headers for non-__GNUC__ compilers, so the compiler won't see the attributes if they follow the #define.

I actually put a special case in slimcc's preprocessor to resist the define to make it work out of the box with glibc while still recognize the attributes, epoll.h for example would loose packed attribute, creating ABI mismatch if you obey glibc's define.
https://github.com/fuhsnn/slimcc/blob/fd14b7d38ee757c53956fcf5bf88fe40a0c27c65/preprocess.c#L815

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

That's what I mean; I'm surprised TCC can even be used to compile real, working programs when it ends up with ABI mismatches like this. That makes me wonder if there's something I'm missing.

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Maybe the issue here is just that, while TCC supports __attribute__((aligned(...))) and friends, it doesn't speak __has_attribute?

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Looks like tcc:

  1. supports __attribute__(aligned) when not affect by glibc
  2. supports __has_attribute
  3. reports 0 with __has_attribute(aligned)

https://godbolt.org/z/4E4K4Pecr

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Curiously enough, I can't find any mention of __has_attribute in the TCC source code... 🤔

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

cat test.c
#include <stdio.h>
int main()
{
#ifdef __has_attribute
    printf("has attribute\n");
#endif
}tcc -run test.ctcc -dM -E - < /dev/null | grep __has_attribute

Same deal with a fresh tcc built from the mob branch.

I have no idea what's going on with the TCC on godbolt.

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Yeah, seriously, what's going on here? https://godbolt.org/z/9ce99jvEr

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

I think it's here: https://repo.or.cz/tinycc.git/blob/HEAD:/include/tccdefs.h#l155
155 #define __has_attribute(x) 0

So it will be 0 no matter the input, stuff like this was how Autoconf came to be...

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Yeah, seriously, what's going on here? https://godbolt.org/z/9ce99jvEr

The __has_* family are special, they are like built-in #define's able to react to input dynamically, so you can query their existence with #if defined(__has_*) but when you use them, them go through compiler's logic.

This is entirely in the preprocessor pass, the actual C compiler won't see and don't know about it.

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Ah... I had cloned the older https://repo.or.cz/tinycc/tinycc.git (and manually switched to its mob branch) instead of https://repo.or.cz/tinycc.git... that's why I didn't see __has_attribute anywhere.

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Progress with slimcc, after implementing void fn()__asm("name"); and borrowing <cpuid.h> from LLVM:
At this part, it does support _Thread_local but since the driver passed -std=c99, it got zig_threadlocal_unavailable:

zig/stage1/zig.h

Lines 51 to 61 in aa7d138

#if __STDC_VERSION__ >= 201112L
#define zig_threadlocal _Thread_local
#elif defined(__GNUC__)
#define zig_threadlocal __thread
#elif _MSC_VER
#define zig_threadlocal __declspec(thread)
#else
#define zig_threadlocal zig_threadlocal_unavailable
#endif

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Does slimcc support __thread?

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Here the __LITTLE_ENDIAN__ macro is used, it is clang-only, https://godbolt.org/z/eqnn6Kxe9, so even GCC targeting little-endian may be wrongly defined as big-endian.

zig/stage1/zig.h

Lines 43 to 51 in aa7d138

#if __LITTLE_ENDIAN__ || _MSC_VER
#define zig_little_endian 1
#define zig_big_endian 0
#else
#define zig_little_endian 0
#define zig_big_endian 1
#endif

edit: yeah, GCC with x86 got zig_big_endian == 1 https://godbolt.org/z/7jqPhcqs4

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Does slimcc support __thread?

It does, but it doesn't define __GNUC__, so it didn't go that branch.

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Here the __LITTLE_ENDIAN__ macro is used, it is clang-only, https://godbolt.org/z/eqnn6Kxe9, so even GCC targeting little-endian may be wrongly defined as big-endian.

Yeah, this should be using __BYTE_ORDER__. No need to even check the Clang-specific one.

It does, but it doesn't define __GNUC__, so it didn't go that branch.

Ok, it would seem reasonable to add a slimcc check to the __thread branch. What's the preferred way to detect slimcc?

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Ok, it would seem reasonable to add a slimcc check to the __thread branch. What's the preferred way to detect slimcc?

I would prefer just don't hardcode -std=c99 and presumably let slimcc use -std=c11 passed through CFLAGS to bootstrap.c (ignore me if it already works this way).

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Next problem slimcc encountered in zig2.c: https://godbolt.org/z/6c4ex9T3a
further reduced to:

int fn(void) {
  struct A { int i; };
  static struct A a [1] = {(struct A){333}};
  return a[0].i;
}

The compound literal is not a constant expression, using it in a static object initialization is not particularly portable:

  • pcc, cproc, and kefir all reject this pattern.
  • tcc accept this.
  • chibicc silently accepts but doesn't initialize the struct correctly;
  • gcc will also issue a warning under -Wpedantic
  • for slimcc its doable, should be just a few logic away.

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

I would prefer just don't hardcode -std=c99 and presumably let slimcc use -std=c11 passed through CFLAGS to bootstrap.c (ignore me if it already works this way).

zig.h is used not only for wasm2c but also for the Zig compiler's C backend, which is fully supported and not just an implementation detail of the bootstrapping process. As such, it needs to be able to work with as many compilers as possible, in any language standard mode from C99 and up. (Maybe even C89 in the future; see #19468.) For that reason, it still makes sense to use __thread when available in the non-C11 case.

The compound literal is not a constant expression, using it in a static object initialization is not particularly portable

Would you mind filing a separate issue against the C backend for this one? I think we'd want to fix that, but it's probably going to be lower priority than the other problems in this issue.

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Would you mind filing a separate issue against the C backend for this one? I think we'd want to fix that, but it's probably going to be lower priority than the other problems in this issue.

I have no idea which part of the C backend generated it, only that it appears in zig2.c in the function fmt_parse_float_convert_fast_fastPow10__anon_538055__114224: https://godbolt.org/z/6c4ex9T3a

Since GCC -Wpedantic do warn about it, may I suggest just test with that flag on?

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

For that reason, it still makes sense to use __thread when available in the non-C11 case.

From what I'm seeing in zig.h I would say for slimcc the preferred way is to go along __STDC_VERSION__ >= 201112L branches as much as possible, especially for atomics, since currently it doesn't support GCC style atomics and the only way to use them is with C11 <stdatomic.h>.

@rohlem
Copy link
Contributor

rohlem commented Dec 1, 2024

For that reason, it still makes sense to use __thread when available in the non-C11 case.

From what I'm seeing in zig.h I would say for slimcc the preferred way is to go along __STDC_VERSION__ >= 201112L branches as much as possible, especially for atomics

Maybe the macros concerning atomics could be defined to cause compile errors if defined(__SLIMCC__) && (__STDC_VERSION__ < 201112L) (pseudocode), which instructs users to use slimcc with C11 or newer?
If we know some macro like __SLIMCC__ to detect it, we could support special-cased compile errors for a bunch of unsupported features/scenarios.

@alexrp
Copy link
Member

alexrp commented Dec 1, 2024

Re: cpuid.h, should I assume that slimcc will have this header going forward? Or do we need to implement a separate code path using inline assembly? (Does slimcc have inline assembly?)

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Next one is about tentative definitions with _Thread_local.

#define zig_threadlocal _Thread_local

struct anon__lazy_112 {
 uint8_t *ptr; 
 uintptr_t len;
};

typedef struct anon__lazy_112 nav__4163_39; 
static zig_threadlocal nav__4163_39 crypto_tlcsprng_wipe_mem__4163;
static zig_threadlocal nav__4163_39 crypto_tlcsprng_wipe_mem__4163 = {(uint8_t *)(uint8_t const *)((void const *)(uintptr_t)0xaaaaaaaaaaaaaaaaul),0ul};

The C standard at 6.9.3 External object definitions describes tentative definition as:

A declaration of an identifier for an object that has file scope without an initializer, and without
the storage-class specifier extern or thread_local, constitutes a tentative definition.

So if a C11 compiler implemented these features "by the book", thread_local and tentative don't co-exist, as the first crypto_tlcsprng_wipe_mem__4163 is thread_local, it isn't tentative, therefore the second declaration of crypto_tlcsprng_wipe_mem__4163 will be a redefinition error.

GCC/Clang probably supported tentative __thread objects long before C11 standard text, so they don't mind the usage. kefir allow this, cproc don't. I'm going to enable this in slimcc.

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Maybe the macros concerning atomics could be defined to cause compile errors if defined(__SLIMCC__) && (__STDC_VERSION__ < 201112L) (pseudocode), which instructs users to use slimcc with C11 or newer? If we know some macro like __SLIMCC__ to detect it, we could support special-cased compile errors for a bunch of unsupported features/scenarios.

Just let it fail whenever GCC style atomics got used and slimcc don't recognize it, C compiler users are used to this.

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

Re: cpuid.h, should I assume that slimcc will have this header going forward? Or do we need to implement a separate code path using inline assembly? (Does slimcc have inline assembly?)

slimcc, tcc, kefir all support inline assembly, so I vote for a separate code path (though it would be even better to not require it to bootstrap Zig). That said Clang's <cpuid.h> looks pretty clean, maybe you can just route compilers to it:

#if __has_include( cpuid ) 
...
#else
include clang's cpuid.h possibly shipped with zig cc
#endif

@fuhsnn
Copy link
Author

fuhsnn commented Dec 1, 2024

It's done!

non-root@f2a267d3a49c:~/zig$ ulimit -s unlimited
non-root@f2a267d3a49c:~/zig$ ./_bootstrap 
/work/slimcc/slimcc_asan -o zig-wasm2c stage1/wasm2c.c -O2 -std=c11
./zig-wasm2c stage1/zig1.wasm zig1.c
/work/slimcc/slimcc_asan -o zig1 zig1.c stage1/wasi.c -std=c11 -Os -lm
./zig1 lib build-exe -ofmt=c -lc -OReleaseSmall --name zig2 -femit-bin=zig2.c -target x86_64-linux --dep build_options --dep aro -Mroot=src/main.zig -Mbuild_options=config.zig -Maro=lib/compiler/aro/aro.zig
./zig1 lib build-obj -ofmt=c -OReleaseSmall --name compiler_rt -femit-bin=compiler_rt.c -target x86_64-linux -Mroot=lib/compiler_rt.zig
/work/slimcc/slimcc_asan -o zig2 zig2.c compiler_rt.c -std=c11 -O2 -fno-stack-protector -Istage1 -Wl,-z,stack-size=0x10000000
non-root@f2a267d3a49c:~/zig$ echo $?
0
non-root@f2a267d3a49c:~/zig$ size -G zig2
      text       data        bss      total filename
  50299964    2045674         16   52345654 zig2
Looks like cpuid is working:
non-root@f2a267d3a49c:~/zig$ ./zig2 test --show-builtin
const std = @import("std");
/// Zig version. When writing code that supports multiple versions of Zig, prefer
/// feature detection (i.e. with `@hasDecl` or `@hasField`) over version checks.
pub const zig_version = std.SemanticVersion.parse(zig_version_string) catch unreachable;
pub const zig_version_string = "0.14.0-dev.bootstrap";
pub const zig_backend = std.builtin.CompilerBackend.stage2_llvm;

pub const output_mode = std.builtin.OutputMode.Exe;
pub const link_mode = std.builtin.LinkMode.static;
pub const is_test = true;
pub const single_threaded = false;
pub const abi = std.Target.Abi.gnu;
pub const cpu: std.Target.Cpu = .{
    .arch = .x86_64,
    .model = &std.Target.x86.cpu.skylake

build-obj and build-exe failed, what to do about this?

non-root@f2a267d3a49c:~/zig$ ./zig2 build-obj hello.zig 
lib/std/start.zig:224:1: error: invalid modifier: 'P'
fn _start() callconv(.naked) noreturn {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
non-root@f2a267d3a49c:~/zig$ ./zig2 build-exe hello.zig 
thread 300 panic: missing dwarf relocation target
Unable to dump stack trace: debug info stripped
Aborted (core dumped)

@alexrp
Copy link
Member

alexrp commented Dec 2, 2024

build-obj and build-exe failed, what to do about this?

This is probably expected to an extent. The self-hosted backends are still in development, and the Zig compiler you get from bootstrap.c has no LLVM support.

@alexrp
Copy link
Member

alexrp commented Dec 2, 2024

Another question: Does slimcc make GCC-style __atomic builtins available regardless of language standard?

@fuhsnn
Copy link
Author

fuhsnn commented Dec 2, 2024

Another question: Does slimcc make GCC-style __atomic builtins available regardless of language standard?

Currently, no, though I probably should implement it on my quest to build more existing projects.

On the other hand, <stdatomic.h> is available regardless of language standard, I mean it's just an innocent header, no need to explicitly block it.

GCC/Clang also agree on this: https://godbolt.org/z/8e4bjozKd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants