Error: inlining failed in call to always_inline ‘_mm256_cvtph_ps’: target specific option mismatch #107
Comments
It would be best to take this up on https://github.com/ggerganov/ggml.
What is your CPU?
I am running Ubuntu in a VM with VirtualBox. Here is the output of the lscpu command. Thank you for your help so far.

brickman@Ubuntu-brickman:~/Desktop$ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       48 bits physical, 48 bits virtual
CPU(s):              6
On-line CPU(s) list: 0-5
Thread(s) per core:  1
Core(s) per socket:  6
Socket(s):           1
NUMA node(s):        1
Vendor ID:           AuthenticAMD
CPU family:          23
Model:               113
Model name:          AMD Ryzen 5 3600 6-Core Processor
Stepping:            0
CPU MHz:             3599.936
BogoMIPS:            7199.87
Hypervisor vendor:   KVM
Virtualization type: full
L1d cache:           192 KiB
L1i cache:           192 KiB
L2 cache:            3 MiB
L3 cache:            32 MiB
NUMA node0 CPU(s):   0-5
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Mmio stale data:   Not affected
Vulnerability Retbleed:          Mitigation; untrained return thunk; SMT disabled
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Retpolines, STIBP disabled, RSB filling, PBRSB-eIBRS Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid tsc_known_freq pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch ssbd vmmcall fsgsbase bmi1 avx2 bmi2 rdseed clflushopt arat
brickman@Ubuntu-brickman:~/Desktop$
Flags indicate your VirtualBox instance supports AVX and AVX2, but not F16C (half-precision float conversion) or FMA, and _mm256_cvtph_ps is an F16C intrinsic.
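A quick way to double-check which of these instruction sets the guest actually exposes is a small CPUID probe. The sketch below is illustrative only (the file name is made up, and it is not part of llama.cpp); it uses the standard CPUID leaf 1 / leaf 7 feature bits:

```c
/* cpu_flags.c (hypothetical helper): print the instruction sets that
   ggml's AVX code path cares about. Build with: cc cpu_flags.c -o cpu_flags */
#include <stdio.h>
#include <cpuid.h>

int main(void) {
    unsigned int eax, ebx, ecx, edx;

    /* CPUID leaf 1, ECX: bit 12 = FMA, bit 28 = AVX, bit 29 = F16C */
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        printf("AVX : %s\n", (ecx & (1u << 28)) ? "yes" : "no");
        printf("FMA : %s\n", (ecx & (1u << 12)) ? "yes" : "no");
        printf("F16C: %s\n", (ecx & (1u << 29)) ? "yes" : "no");
    }

    /* CPUID leaf 7, subleaf 0, EBX: bit 5 = AVX2 */
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        printf("AVX2: %s\n", (ebx & (1u << 5)) ? "yes" : "no");
    }
    return 0;
}
```

If F16C and FMA come back "no" inside the VM while the bare-metal host reports them, the hypervisor is simply not passing those features through to the guest.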
I ran the command with g++ and this is what I got.

brickman@Ubuntu-brickman:~/Desktop/llama.cpp$ make g++
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  x86_64
I UNAME_M:  x86_64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -mavx2 -msse3
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:
I CC:       cc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
I CXX:      g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
make: *** No rule to make target 'g++'. Stop.
brickman@Ubuntu-brickman:~/Desktop/llama.cpp$
You're using make g++, which tells make to build a target named g++, hence the "No rule to make target" error. Your g++ 9.4 also looks too old; try upgrading (for example to g++-10) and then building quantize directly: g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread quantize.cpp ggml.o utils.o -o quantize
I updated my g++ version and tried your command but I am still having trouble. I really appreciate your help. Thank you.

brickman@Ubuntu-brickman:~/Desktop/llama.cpp$ g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread quantize.cpp ggml.o utils.o -o quantize
g++: error: ggml.o: No such file or directory
g++: error: utils.o: No such file or directory
brickman@Ubuntu-brickman:~/Desktop/llama.cpp$ g++ --version
g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
brickman@Ubuntu-brickman:~/Desktop/llama.cpp$
Now that you're on g++ 10, does a clean rebuild work? Try: make clean; make
Looks like the same error.

brickman@Ubuntu-brickman:~/Desktop/llama.cpp$ make clean; make
I llama.cpp build info:
I UNAME_S:  Linux
I UNAME_P:  x86_64
I UNAME_M:  x86_64
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -mavx2 -msse3
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -pthread
I LDFLAGS:
I CC:       cc (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
I CXX:      g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0
rm -f *.o main quantize
[the same build info block is printed again, then:]
cc -I. -O3 -DNDEBUG -std=c11 -fPIC -pthread -mavx -mavx2 -msse3 -c ggml.c -o ggml.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/10/include/immintrin.h:113,
                 from ggml.c:155:
ggml.c: In function ‘ggml_vec_dot_f16’:
/usr/lib/gcc/x86_64-linux-gnu/10/include/f16cintrin.h:52:1: error: inlining failed in call to ‘always_inline’ ‘_mm256_cvtph_ps’: target specific option mismatch
   52 | _mm256_cvtph_ps (__m128i __A)
      | ^~~~~~~~~~~~~~~
ggml.c:911:33: note: called from here
  911 | #define GGML_F32Cx8_LOAD(x) _mm256_cvtph_ps(_mm_loadu_si128((__m128i *)(x)))
ggml.c:921:37: note: in expansion of macro ‘GGML_F32Cx8_LOAD’
  921 | #define GGML_F16_VEC_LOAD(p, i) GGML_F32Cx8_LOAD(p)
ggml.c:1274:21: note: in expansion of macro ‘GGML_F16_VEC_LOAD’
 1274 |             ay[j] = GGML_F16_VEC_LOAD(y + i + j*GGML_F16_EPR, j);
[the same ‘_mm256_cvtph_ps’ error and notes repeat several more times, for the ax[j] load at ggml.c:1273 and the ay[j] load at ggml.c:1274]
make: *** [Makefile:186: ggml.o] Error 1
brickman@Ubuntu-brickman:~/Desktop/llama.cpp$
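The diagnostic itself points at the cause: _mm256_cvtph_ps lives in f16cintrin.h and needs the F16C instruction set, but the CFLAGS shown above only enable -mavx -mavx2 -msse3, so GCC refuses to inline the always_inline intrinsic into code built without that target option. A minimal reproduction outside llama.cpp (an illustrative sketch; the file name is made up) behaves the same way:

```c
/* repro.c: trigger the same "target specific option mismatch" in isolation.
   Fails:    gcc -mavx -mavx2 -c repro.c
   Compiles: gcc -mavx -mavx2 -mf16c -c repro.c */
#include <immintrin.h>

/* Load 8 half-precision floats and widen them to single precision,
   which is what ggml's GGML_F32Cx8_LOAD macro does. */
__m256 load_f16x8_as_f32(const void *p) {
    return _mm256_cvtph_ps(_mm_loadu_si128((const __m128i *)p));
}
```

Note that adding -mf16c only fixes the compile; if the CPU (or the VM) genuinely lacks F16C, the resulting binary will fail with an illegal-instruction fault at runtime.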
It works under Docker with
Ok, I will try to run the command on a real machine in a couple of hours. Did you put your Docker image on Docker Hub?
Nope, I just smoke-tested the compile in a running instance. However, someone just now published a pull request for Docker support here: #132 🚀
I was able to build successfully when not running in a virtual machine. But now I am wondering where I can download the LLaMA model.
I can neither confirm nor deny that that link will work 😄 For verification, and as there are some suspect models floating around, I published the md5 sums and file sizes of my confirmed working models in issue #69.
Thank you for all your help.
@tofasthacker Yes, that place is a good source for it. I have been using those for several days and the URL remains the same.
@gjmulder I attempted to compile on a system with no AVX a week or so ago, and I got similar output to his, which is why I originally posted that I think he has no AVX support.
I'm on bare metal and have AVX (Intel 2600K). I added this to force it to compile with gcc-11.
Edit: Never mind, don't try this. There are two AVX versions out there; the 2600K has AVX1, and this needs AVX2.
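For anyone unsure which level their build is actually targeting, the compiler's predefined macros make the distinction visible at compile time: GCC and Clang define __AVX__, __AVX2__ and __F16C__ only when the matching -m options (or -march=native on a capable CPU) are in effect. A small illustrative check, not part of the project and with a made-up file name:

```c
/* avx_level.c: report the SIMD level that the compile flags enabled. */
#include <stdio.h>

int main(void) {
#if defined(__AVX2__) && defined(__F16C__)
    puts("AVX2 + F16C enabled: the f16 fast path in ggml should compile");
#elif defined(__AVX2__)
    puts("AVX2 without F16C: the mismatch reported in this issue");
#elif defined(__AVX__)
    puts("AVX only: what a Sandy Bridge i7-2600K offers");
#else
    puts("no AVX at all");
#endif
    return 0;
}
```

Compiling the same file with different flags (for example -mavx versus -mavx2 -mf16c) prints different lines, which makes it easy to see what a given Makefile is really asking for.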
I have the same error on my Ubuntu Linux machine (physical box). I upgraded to g++10 and it makes no difference.
My CPU details:
Try adding
I have the same issue with a Fedora Silverblue virtual machine, compiling inside a podman container. To fix it, I just set the LLAMA_NATIVE option to OFF this way:

mkdir build
cd build
cmake -DLLAMA_NATIVE=OFF ..
make -j

I think there might be a bug in the use of Intel SIMD instructions: https://www.intel.com/content/www/us/en/docs/cpp-compiler/developer-guide-reference/2021-8/mm-fmadd-ps-mm256-fmadd-ps.html
If running on Kali or Debian Linux, replace the last command; with plain make -j my machine froze.
I cloned the GitHub repository and ran the make command but was unable to get the cpp files to compile successfully. Any help or suggestion would be appreciated.
Terminal output: