Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zig c++ overrides explicitly set cpu features #9196

Closed
happyalu opened this issue Jun 22, 2021 · 5 comments
Closed

zig c++ overrides explicitly set cpu features #9196

happyalu opened this issue Jun 22, 2021 · 5 comments
Labels
bug Observed behavior contradicts documented or intended behavior contributor friendly This issue is limited in scope and/or knowledge of Zig internals. miscompilation The compiler reports success but produces semantically incorrect code. zig cc Zig as a drop-in C compiler feature
Milestone

Comments

@happyalu
Copy link

I was trying to build tensorflow with zig c++ on an intel kabylake processor and building a dependency (xnnpack) failed when compiling a source that was using XOP intrinsics. Their cmake build was adding -mxop when compiling the appropriate files. The build worked with clang++. I was able to narrow this down to a small test case:

cat > test.cc <<EOF
#include <x86intrin.h>

int main(){
        __m128i x;
        _mm_macc_epi32(x, x, x);
        return 0;
}
EOF
$ clang++ -mxop test.cc; echo $?
0
$ zig c++ -mxop test.cc
test.cc:5:9: error: always_inline function '_mm_macc_epi32' requires target feature 'xop', but would be inlined into function 'main' that is compiled without support for 'xop'
        _mm_macc_epi32(x, x, x);
        ^
1 error generated.

Running with -v shows that zig sets the xop features twice (once +xop and later -xop) : ... -target-cpu x86-64 -target-feature +xop -tune-cpu generic ... -target-cpu skylake ... -target-feature -xop ...

zig -cc1 -triple x86_64-unknown-linux-gnu -emit-obj --mrelax-relocations -disable-free -disable-llvm-verifier -discard-value-names -main-file-name test.cc -mrelocation-model pic -pic-level 2 -fhalf-no-semantic-interposition -mframe-pointer=none -fmath-errno -fno-rounding-math -mconstructor-aliases -munwind-tables -target-cpu x86-64 -target-feature +xop -tune-cpu generic -fno-split-dwarf-inlining -debugger-tuning=gdb -v -nostdsysteminc -nobuiltininc -resource-dir /path/lib/clang/12.0.1 -dependency-file /path/.cache/zig/tmp/52ed2834085a3e15-test.o.d -MT /path/.cache/zig/tmp/52ed2834085a3e15-test.o -sys-header-deps -MV -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libcxx/include -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libcxxabi/include -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libunwind/include -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/include -isystem /path/code/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libc/include/x86_64-linux-gnu -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libc/include/generic-glibc -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libc/include/x86_64-linux-any -isystem /path/zig-linux-x86_64-0.9.0-dev.4+ddf9c40bc/lib/libc/include/any-linux-any -isystem /usr/local/include -isystem /usr/include/x86_64-linux-gnu -isystem /usr/include -D _LIBCPP_DISABLE_VISIBILITY_ANNOTATIONS -D _LIBCXXABI_DISABLE_VISIBILITY_ANNOTATIONS -D _LIBCPP_HAS_NO_VENDOR_AVAILABILITY_ANNOTATIONS -D _DEBUG -Og -fdeprecated-macro -fdebug-compilation-dir /path/tflite_build/xnnpack -ferror-limit 19 -fsanitize=alignment,array-bounds,bool,builtin,enum,float-cast-overflow,function,integer-divide-by-zero,nonnull-attribute,null,object-size,pointer-overflow,return,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,unreachable,vla-bound -fsanitize-trap=alignment,array-bounds,bool,builtin,enum,float-cast-overflow,function,integer-divide-by-zero,nonnull-attribute,null,object-size,pointer-overflow,return,returns-nonnull-attribute,shift-base,shift-exponent,signed-integer-overflow,unreachable,vla-bound -stack-protector 2 -stack-protector-buffer-size 4 -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -fno-spell-checking -target-cpu skylake -target-feature -16bit-mode -target-feature -32bit-mode -target-feature -3dnow -target-feature -3dnowa -target-feature +64bit -target-feature +adx -target-feature +aes -target-feature -amx-bf16 -target-feature -amx-int8 -target-feature -amx-tile -target-feature +avx -target-feature +avx2 -target-feature -avx512bf16 -target-feature -avx512bitalg -target-feature -avx512bw -target-feature -avx512cd -target-feature -avx512dq -target-feature -avx512er -target-feature -avx512f -target-feature -avx512ifma -target-feature -avx512pf -target-feature -avx512vbmi -target-feature -avx512vbmi2 -target-feature -avx512vl -target-feature -avx512vnni -target-feature -avx512vp2intersect -target-feature -avx512vpopcntdq -target-feature -avxvnni -target-feature +bmi -target-feature +bmi2 -target-feature -branchfusion -target-feature -cldemote -target-feature +clflushopt -target-feature -clwb -target-feature -clzero -target-feature +cmov -target-feature +cx16 -target-feature +cx8 -target-feature -enqcmd -target-feature +ermsb -target-feature +f16c -target-feature -false-deps-lzcnt-tzcnt -target-feature +false-deps-popcnt -target-feature -fast-11bytenop -target-feature +fast-15bytenop -target-feature -fast-7bytenop -target-feature -fast-bextr -target-feature +fast-gather -target-feature -fast-hops -target-feature -fast-lzcnt -target-feature +fast-scalar-fsqrt -target-feature -fast-scalar-shift-masks -target-feature +fast-shld-rotate -target-feature +fast-variable-shuffle -target-feature +fast-vector-fsqrt -target-feature -fast-vector-shift-masks -target-feature +fma -target-feature -fma4 -target-feature +fsgsbase -target-feature -fsrm -target-feature +fxsr -target-feature -gfni -target-feature -hreset -target-feature -idivl-to-divb -target-feature +idivq-to-divl -target-feature +invpcid -target-feature -kl -target-feature -lea-sp -target-feature -lea-uses-ag -target-feature -lvi-cfi -target-feature -lvi-load-hardening -target-feature -lwp -target-feature +lzcnt -target-feature +macrofusion -target-feature +mmx -target-feature +movbe -target-feature -movdir64b -target-feature -movdiri -target-feature -mwaitx -target-feature +nopl -target-feature -pad-short-functions -target-feature +pclmul -target-feature -pconfig -target-feature -pku -target-feature +popcnt -target-feature -prefer-128-bit -target-feature -prefer-256-bit -target-feature -prefer-mask-registers -target-feature -prefetchwt1 -target-feature +prfchw -target-feature -ptwrite -target-feature -rdpid -target-feature +rdrnd -target-feature +rdseed -target-feature -retpoline -target-feature -retpoline-external-thunk -target-feature -retpoline-indirect-branches -target-feature -retpoline-indirect-calls -target-feature -rtm -target-feature +sahf -target-feature -serialize -target-feature -seses -target-feature +sgx -target-feature -sha -target-feature -shstk -target-feature +slow-3ops-lea -target-feature -slow-incdec -target-feature -slow-lea -target-feature -slow-pmaddwd -target-feature -slow-pmulld -target-feature -slow-shld -target-feature -slow-two-mem-ops -target-feature -slow-unaligned-mem-16 -target-feature -slow-unaligned-mem-32 -target-feature -soft-float -target-feature +sse -target-feature +sse2 -target-feature +sse3 -target-feature +sse4.1 -target-feature +sse4.2 -target-feature -sse4a -target-feature -sse-unaligned-mem -target-feature +ssse3 -target-feature -tbm -target-feature -tsxldtrk -target-feature -uintr -target-feature -use-aa -target-feature -use-glm-div-sqrt-costs -target-feature -vaes -target-feature -vpclmulqdq -target-feature +vzeroupper -target-feature -waitpkg -target-feature -wbnoinvd -target-feature -widekl -target-feature +x87 -target-feature -xop -target-feature +xsave -target-feature +xsavec -target-feature +xsaveopt -target-feature +xsaves -faddrsig -o /path/.cache/zig/tmp/52ed2834085a3e15-test.o -x c++ test.cc

Not sure if this was happening due to native cpu detection, but I tried adding -target x86_64-linux-gnu to pretend to cross-compile, but it didn't help. Am I using this wrong? Is there a workaround or fix I could try?

Thanks!

@andrewrk andrewrk added this to the 0.8.1 milestone Jun 22, 2021
@andrewrk andrewrk added bug Observed behavior contradicts documented or intended behavior contributor friendly This issue is limited in scope and/or knowledge of Zig internals. miscompilation The compiler reports success but produces semantically incorrect code. zig cc Zig as a drop-in C compiler feature labels Jun 22, 2021
@andrewrk
Copy link
Member

I think this is a matter of missing support for -m flag detection in general. It would be near trivial to solve it for -mxop, but the real question is how do we systematically solve it for all such parameters?

@happyalu
Copy link
Author

happyalu commented Jun 22, 2021

There's something that is correctly adding +xop and something else later adding rest of the flags including -xop. Perhaps I'll try to figure out where these are happening. Not sure if we can detect that +xop was set earlier, and then not reset it when emitting rest of the flags.

Tried the same thing with -mavx512f that my CPU doesn't support, and it generated -target-cpu x86-64 -target-feature +avx512f -tune-cpu generic ... -target-cpu skylake ... -target-feature -avx512f.

Is there a way to disable emitting flags for the detected cpu?

(Update): It seems the -mxop gets passed over to clang, and then we also add the target cpu features here

@andrewrk andrewrk modified the milestones: 0.8.1, 0.9.1 Aug 31, 2021
@andrewrk andrewrk modified the milestones: 0.9.1, 0.9.0 Nov 20, 2021
@matpow2
Copy link

matpow2 commented Nov 21, 2021

We ran into this issue while trying to build libvpx with zig cc.
Essentially we have something like this in our CMake script:

        set_source_files_properties(
            ${VPX_GLOB_CPP_AVX}
            PROPERTIES COMPILE_FLAGS
            "-mavx2")
        set_source_files_properties(
            ${VPX_GLOB_CPP_SSSE3}
            PROPERTIES COMPILE_FLAGS
            "-mssse3")
        set_source_files_properties(
            ${VPX_GLOB_CPP_SSE2}
            PROPERTIES COMPILE_FLAGS
            "-msse2")

but zig cc only supports -mcpu and not -mxxx.

We worked around this by creating a wrapper for zig.exe and setting -mcpu=baseline+x+y+z if one of the CPU feature flags were set, but would be great if this was supported without the workaround.

@andrewrk
Copy link
Member

andrewrk commented Nov 27, 2021

Fixed now:

$ cat test.c
#include <x86intrin.h>

int main(){
        __m128i x;
        _mm_macc_epi32(x, x, x);
        return 0;
}

$ ./zig cc test.c
test.c:5:9: error: always_inline function '_mm_macc_epi32' requires target feature 'xop', but would be inlined into function 'main' that is compiled without support for 'xop'
        _mm_macc_epi32(x, x, x);
        ^
1 error generated.

$ ./zig cc -mxop test.c

$ 

@andrewrk
Copy link
Member

Ah this was a duplicate of #4912 btw.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Observed behavior contradicts documented or intended behavior contributor friendly This issue is limited in scope and/or knowledge of Zig internals. miscompilation The compiler reports success but produces semantically incorrect code. zig cc Zig as a drop-in C compiler feature
Projects
None yet
Development

No branches or pull requests

3 participants