crash on Linux x86_64 (cmake build, git 8f7e986184afb) #1806
Comments
In Release mode I have a different backtrace: Program received signal SIGILL, Illegal instruction.
What is your hardware? Luckily the "good" and "bad" versions appear to be only six days apart, so this should be easy to track down, but the only changes directly affecting SGEMM were for SkylakeX.
Currently on an old Dell laptop, Sandy Bridge I think:
Weird. Do both compiling and running happen on this system? (I had a slight hope that you might have built the library on Haswell without including DYNAMIC_ARCH support for other/older processors.)
https://valid.x86.fr/ynbz2s
Yes, compiling and running both happen on this same system, and DYNAMIC_ARCH is set to ON. But the same crash happened on another system (Xeon or something), so I don't think this crash is specific to Sandy Bridge. My guess is that it is related to the recent cmake changes for x64?
The crash also happened yesterday on those platforms:
Err, it is now dawning on me that the change to add -march=skylake-avx512 "where required" may actually be adding it unconditionally rather than only for that specific target... I wonder why this did not break the CI builds, though. Could you try removing the two lines from system_check.cmake that I added in #1798?
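To illustrate why an unconditionally applied AVX-512 `-march` flag would produce SIGILL on a Sandy Bridge machine: the compiler is then free to emit AVX-512 instructions anywhere, and a CPU without that extension traps on the first one it executes. A minimal sketch for checking the running CPU, assuming GCC or Clang on x86 (the helper names `cpu_has_avx512f` and `describe_avx512_risk` are hypothetical and not part of OpenBLAS):

```c
#include <stdio.h>

/* Hypothetical helper (not an OpenBLAS function): returns 1 if the CPU
 * running this process reports AVX-512 Foundation support, 0 otherwise. */
static int cpu_has_avx512f(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  /* initialise the compiler's CPU feature data */
    return __builtin_cpu_supports("avx512f") ? 1 : 0;
#else
    return 0;  /* assumption: treat other compilers/architectures as "no" */
#endif
}

/* Hypothetical helper: one-line summary of what the check implies for a
 * binary whose objects were all built with an AVX-512 -march flag. */
static const char *describe_avx512_risk(int has_avx512f)
{
    return has_avx512f
        ? "AVX-512F present: AVX-512 code paths can execute"
        : "no AVX-512F: AVX-512 instructions would raise SIGILL";
}

/* Convenience wrapper that prints the summary for the current CPU. */
static void report_avx512_risk(void)
{
    printf("%s\n", describe_avx512_risk(cpu_has_avx512f()));
}
```

Calling `report_avx512_risk()` from a small `main` on the affected laptop would be expected to take the "no AVX-512F" branch if the CPU really is Sandy Bridge, which would be consistent with the SIGILL reported in this issue.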
If I remove those 2 lines I get back the compilation failure (see #1797).
Duh. But these lines actually belong in system.cmake, like
somewhere around line 42, before the
Hello,
With the latest develop (8f7e986) and a cmake build, I can reproduce a crash on various Ubuntu platforms (16.04, 17.10 and 18.04); there is no crash on an earlier version (02ef20a). Here is the backtrace:
Thread 1 "nv3dfi_video_cl" received signal SIGILL, Illegal instruction.
0x00007fffe4b11383 in sgemm_nn (args=0x7fffffffc4c0, range_m=0x0, range_n=0x0,
sa=0x7fffdcf1d000, sb=0x7fffdd03d000, dummy=0)
at src/driver/level3/level3.c:254
254 if ( alpha[0] == ZERO
(gdb) bt
#0 0x00007fffe4b11383 in sgemm_nn (args=0x7fffffffc4c0, range_m=0x0, range_n=0x0, sa=0x7fffdcf1d000, sb=0x7fffdd03d000, dummy=0)
at src/driver/level3/level3.c:254
#1 0x00007fffe4b1112c in cblas_sgemm (order=CblasRowMajor, TransA=CblasNoTrans, TransB=CblasNoTrans, m=10, n=4200, k=27, alpha=1, a=0x555555d22340, lda=27, b=0x555556236110, ldb=4200, beta=0, c=0x5555561e1b90, ldc=4200)
at src/interface/gemm.c:422
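For readers decoding frame #1: the arguments describe a row-major, non-transposed product in which A is 10x27 (lda=27), B is 27x4200 (ldb=4200) and C is 10x4200 (ldc=4200), with alpha=1 and beta=0. The following plain-C routine is only a reference sketch of what such a call computes. It is not OpenBLAS code, and the name `ref_sgemm_rowmajor_nn` is hypothetical. It can be useful for checking results from a known-good build.

```c
#include <stddef.h>

/* Hypothetical reference routine (not part of OpenBLAS):
 * C = alpha * A * B + beta * C for row-major, non-transposed operands.
 * A is m x k with row stride lda, B is k x n with row stride ldb,
 * C is m x n with row stride ldc. */
static void ref_sgemm_rowmajor_nn(int m, int n, int k, float alpha,
                                  const float *a, int lda,
                                  const float *b, int ldb,
                                  float beta, float *c, int ldc)
{
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[(size_t)i * lda + p] * b[(size_t)p * ldb + j];

            float *cij = &c[(size_t)i * ldc + j];
            /* BLAS convention: when beta is zero, C is not read at all. */
            *cij = (beta == 0.0f) ? alpha * acc
                                  : alpha * acc + beta * *cij;
        }
    }
}
```

With the values from the backtrace, the equivalent call would be `ref_sgemm_rowmajor_nn(10, 4200, 27, 1.0f, a, 27, b, 4200, 0.0f, c, 4200)`, assuming `a`, `b` and `c` point to buffers of at least 10*27, 27*4200 and 10*4200 floats respectively.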