Build failure on circle-ci: f951: Error: bad value (skylake-avx512) for -march= switch #1909
Comments
This issue occurs when compiling OpenBLAS on an AVX-512 capable machine like Skylake with a version of GCC that does not understand |
Seems to me your gcc and gfortran are different versions - the gcc recent enough to accept the flag (so the compile test in c_check passes and NO_AVX512 does not get set), but gfortran too old (no separate avx512 test in f_check here as mixing different generations of the two compilers is a bad idea anyway). |
In that particular failure that's the issue. Thanks for that pointer. We missed that. Here's a build failure where both
|
This certainly should not happen. As I wrote above, there is a compile test that gets run very early in the build process to see if AVX512 support is available. If the test fails, NO_AVX512 is set to 1 and SkylakeX is subsequently treated like Haswell. |
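The early compile test described here can be approximated by hand. The sketch below is an illustration only, not the actual c_check code: the helper names `accepts_flag` and `suggest_no_avx512` are hypothetical, and it assumes each compiler can be invoked as `compiler flag -c source -o object`. Unlike the build itself, it probes both the C and the Fortran compiler, so a gcc/gfortran mismatch would show up.

```shell
#!/bin/sh
# Hypothetical helper: succeed if compiler "$1" accepts flag "$2"
# when compiling source file "$3".
accepts_flag() {
    "$1" "$2" -c "$3" -o "$3.o" >/dev/null 2>&1
}

# Hypothetical helper: print NO_AVX512=1 if either compiler rejects
# -march=skylake-avx512. $1 = C compiler, $2 = Fortran compiler.
suggest_no_avx512() {
    tmp=$(mktemp -d) || return 2
    printf 'int main(void) { return 0; }\n' > "$tmp/probe.c"
    printf '      end\n' > "$tmp/probe.f"   # minimal fixed-form Fortran program
    if accepts_flag "$1" -march=skylake-avx512 "$tmp/probe.c" &&
       accepts_flag "$2" -march=skylake-avx512 "$tmp/probe.f"; then
        echo "both compilers accept -march=skylake-avx512"
    else
        echo "NO_AVX512=1"
    fi
    rm -rf "$tmp"
}

suggest_no_avx512 "${CC:-gcc}" "${FC:-gfortran}"
```

If this prints `NO_AVX512=1`, passing that setting to make should sidestep the failing flag.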
Can we set
Yes. This build is in progress.
What information do you need? I can reproduce this build failure locally outside of CircleCI on an AVX-512 machine at work. |
On a Haswell machine at work I'm seeing a different build failure.
https://gist.github.com/sjackman/aaf78d42c2bff38bc16aeca085e9996e#file-01-make-L1600-L1612 |
Yes I think |
We'll try Bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78451 indicates that this issue was resolved in GCC 7.x. It doesn't indicate if there's a work-around for earlier versions of GCC. Previous versions of OpenBLAS compiled fine on Haswell with GCC 5. Do you know if there's a work around for compiling OpenBLAS on Haswell with GCC 5? |
You need matching GCC and GFORTRAN versions. v5 would disable the AVX-512 code, reducing it to HASWELL code; v6 and later (matching versions) from the Ubuntu compiler PPA will compile everything. |
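A quick way to check for a mismatch is to compare the major versions the two compilers report. This is a rough sketch: `major_of` and `same_major` are hypothetical helper names, and it assumes both compilers support `-dumpversion`, as GCC and gfortran do.

```shell
#!/bin/sh
# Hypothetical helper: extract the major version from a string such as "5.5.0".
major_of() {
    echo "${1%%.*}"
}

# Hypothetical helper: succeed only if both version strings are non-empty
# and share the same major version.
same_major() {
    [ -n "$1" ] && [ -n "$2" ] && [ "$(major_of "$1")" = "$(major_of "$2")" ]
}

cc_ver=$(${CC:-gcc} -dumpversion 2>/dev/null)
fc_ver=$(${FC:-gfortran} -dumpversion 2>/dev/null)

if same_major "$cc_ver" "$fc_ver"; then
    echo "gcc $cc_ver and gfortran $fc_ver match"
else
    echo "mismatch or missing compiler: gcc '$cc_ver' vs gfortran '$fc_ver'"
fi
```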
This build on Haswell uses gcc and gfortran 5.5.0 but fails. |
There is no mention of |
See https://gist.github.com/sjackman/aaf78d42c2bff38bc16aeca085e9996e#file-01-make-L3-L8 make CC=gcc-5 FC=gfortran libs netlib shared
I don't believe so.
We're using gcc 5.5.0 and gfortran 5.5.0. @iMichka had one CircleCI build that experimented with using GCC 6.0. That's not the build shown in this gist. |
I don't know what your compiler does wrong. |
What fails on Haswell is the SkylakeX part of the DYNAMIC_ARCH support, which is the only component |
The environment variables |
@martin-frbg Ah! I understand now! Thanks for the explanation, Martin. I'm testing |
@iMichka |
I cannot make it fail when setting two parameters in the environment and the others on the command line, under Ubuntu 16.04 or 14.04.
|
$ gcc --version | head -n1
gcc (Homebrew gcc 5.5.0_4) 5.5.0
$ cc --version | head -n1
cc (Homebrew gcc 5.5.0_4) 5.5.0
$ gfortran --version | head -n1
GNU Fortran (Homebrew gcc 5.5.0_4) 5.5.0
$ f95 --version
bash: f95: command not found |
I confirm that NO_AVX512=1 worked on CI (which has skylake). |
@martin-frbg @brada4 Thanks for your help troubleshooting, Martin and Andrew! |
@iMichka Skylake and Skylake-X are different CPU series.
The root cause is broken compilers being used on top of an Ubuntu system. OpenBLAS will build identically on any x86_64 system. |
They may be different CPU series. As far as I can tell, Skylake and Skylake-X share the same intel microarchitecture, namely Skylake. See https://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures
|
|
Does OpenBLAS change its default configuration based on the system on which it is being compiled? |
DYNAMIC_ARCH=1 still detects the architecture and fails on an unhandled CPU, which is how we get information on new CPUIDs here. Otherwise this detection has no other side effect. Building without DYNAMIC_ARCH will detect the target core and build for that architecture only (to save time of casual user typing e.g. AVX-512 support is missing in Ubuntu 16.04 gcc and in Ubuntu 14 gcc and binutils |
The default maximum number of threads is determined from the number of cores detected on the build system. And unless you build with DYNAMIC_ARCH=1 or specify a build TARGET, the hand-optimized math kernels will be chosen to be the best match for the build host. |
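To make these choices explicit rather than inherited from the build host, the relevant make variables can be given on the command line. The sketch below only prints candidate command lines and does not run them; `build_cmd` is a hypothetical helper, the `libs netlib shared` targets are copied from the invocation quoted earlier in this thread, and the thread count of 64 is an arbitrary example value.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a make command line with the given variables.
build_cmd() {
    printf 'make %s libs netlib shared\n' "$*"
}

# Runtime kernel selection; NO_AVX512=1 makes Skylake-X fall back to Haswell code.
build_cmd DYNAMIC_ARCH=1 NO_AVX512=1

# Fixed target: Haswell kernels regardless of the build host.
build_cmd TARGET=HASWELL

# Fixed thread limit instead of one derived from the build host's core count.
build_cmd TARGET=HASWELL NUM_THREADS=64
```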
Ah, I didn't realize AVX-512 doesn't come with all Skylake CPUs. Thanks for the info.
|
Xeon Gold 6150 definitely has AVX512, so it would make sense to update the compiler to make use of it if you intend to run any calculations that involve SGEMM or DGEMM. |
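On Linux, whether a particular host exposes AVX-512 can be read from its CPU flags. A minimal sketch, assuming a `/proc/cpuinfo`-style file in which the foundation instruction set appears as the `avx512f` flag; `has_avx512f` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the cpuinfo-style file "$1" lists avx512f
# on its first "flags" line.
has_avx512f() {
    grep -m1 '^flags' "$1" 2>/dev/null | grep -qw avx512f
}

if has_avx512f /proc/cpuinfo; then
    echo "this CPU reports AVX-512 (avx512f)"
else
    echo "no avx512f flag found"
fi
```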
It is also rather weird that a GCC built under the same conditions together with GFORTRAN ends up with a different set of architecture parameters. Probably worth looking into: if the (probably latest) gfortran 5 does not provide AVX-512, it could be blacklisted automatically by the same CC detection routine. There are also Intel MIC / Xeon Phi / Knights* accelerator boards running CentOS as firmware that provide different, incompatible AVX-512 instructions and are treated as Haswell |
Seems gcc 6.1.0 was the first version to support the skylake-avx512 flag. I still do not see how an older gcc would manage to pass the AVX512 compatibility test and how |
I saw a wonderful option in the CircleCI documentation called "retry failed build without cache"
|
Ah, I see what's going on here. Homebrew / Linuxbrew use a wrapper around the compiler (named Superenv) that removes unsupported options. I'm guessing that OpenBLAS is testing whether A second reason for this feature is that we build precompiled binary packages called bottles that are intended to work on any X86-64 platform. For most packages, using |
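To illustrate why an argument-filtering wrapper can defeat a flag probe: if the wrapper drops the option before the real compiler sees it, the probe compiles cleanly even though the compiler does not support the option. The sketch below is not Superenv's code. `filter_args` is a hypothetical helper, and dropping every `-march=` option is an assumption made for the example.

```shell
#!/bin/sh
# Hypothetical helper: print each argument on its own line,
# dropping any -march=... option along the way.
filter_args() {
    for arg in "$@"; do
        case $arg in
            -march=*) ;;                    # silently dropped, never reaches the compiler
            *) printf '%s\n' "$arg" ;;
        esac
    done
}

# The compiler behind such a wrapper would only ever see: -O2 -c probe.c
filter_args -O2 -march=skylake-avx512 -c probe.c
```

A probe run through a wrapper like this reports success for any `-march` value, which matches the symptom of the AVX512 test passing with an older compiler.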
That will generate SIGILL as soon as the tests proceed on my VM |
I develop tools for large genome sequence assembly, so I use predominantly integer instructions and hardly any floating point instructions. We have a trillion characters of A, C, G, T, and then do lots of pattern matching. 512-bit integer instructions (like shift and logical ops) are useful though. |
@brada4 Thanks for the suggestion for |
check for march=skylake-avx512 in latest build log, having it there is wrong. |
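That check can be scripted. A small sketch, assuming the build output was saved to a file; the default name `build.log` and the helper `log_has_skylake_flag` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the log file "$1" mentions -march=skylake-avx512.
log_has_skylake_flag() {
    grep -q -e '-march=skylake-avx512' "$1" 2>/dev/null
}

log=${1:-build.log}
if log_has_skylake_flag "$log"; then
    echo "$log: -march=skylake-avx512 is present"
else
    echo "$log: flag not found (or log missing)"
fi
```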
NO_AVX512=1 will use Haswell code on Skylake X, and no cpu-specific compiler options at all on any system (unless your CFLAGS or the default specs file of the compiler defines a specific |
Sounds like that should work for us then. The same wrapper script adds Thanks for all your help, Martin and Andrew. Shall we close this issue? |
We will remove the overbroad |
Here it goes.
|
Hi
Release 0.3.4 fails to build on Linux (on our CI), with gcc5 and gcc6. (related issue, https://github.com/Linuxbrew/homebrew-core/pull/10455).
CPU: 36-core 64-bit skylake
Kernel: Linux 4.4.0-139-generic x86_64 GNU/Linux
OS: Ubuntu 16.04.5 LTS (xenial)
Host glibc: 2.23
Previous openblas versions were fine.