
Compile errors with AVX disabled #1027

Closed
kevinbentley opened this issue Apr 17, 2023 · 3 comments

Comments

@kevinbentley

I'm trying to compile for older Intel Xeon CPUs that don't support the AVX instructions. However, if I turn off LLAMA_AVX and/or LLAMA_AVX2, I get these errors (as of commit eb17a02):

/home/user/llama.cpp/ggml.c: In function ‘bytesFromNibbles’:
/home/user/llama.cpp/ggml.c:470:19: warning: implicit declaration of function ‘_mm_loadu_si64’; did you mean ‘_mm_loadl_epi64’? [-Wimplicit-function-declaration]
     __m128i tmp = _mm_loadu_si64( ( const __m128i* )rsi );
                   ^~~~~~~~~~~~~~
                   _mm_loadl_epi64
/home/user/llama.cpp/ggml.c:470:19: error: incompatible types when initializing type ‘__m128i {aka __vector(2) long long int}’ using type ‘int’
/home/user/llama.cpp/ggml.c: In function ‘quantize_row_q4_1’:
/home/user/llama.cpp/ggml.c:930:27: warning: unused variable ‘y’ [-Wunused-variable]
     block_q4_1 * restrict y = vy;
                           ^
/home/user/llama.cpp/ggml.c:928:15: warning: unused variable ‘nb’ [-Wunused-variable]
     const int nb = k / QK4_1;
               ^~
/home/user/llama.cpp/ggml.c: In function ‘ggml_vec_dot_q4_0’:
/home/user/llama.cpp/ggml.c:2425:40: warning: implicit declaration of function ‘_mm256_set_m128i’; did you mean ‘_mm256_set_epi8’? [-Wimplicit-function-declaration]
         __m256 p = _mm256_cvtepi32_ps( _mm256_set_m128i( i32[0], i32[1] ));
                                        ^~~~~~~~~~~~~~~~
                                        _mm256_set_epi8
/home/user/llama.cpp/ggml.c:2425:40: error: incompatible type for argument 1 of ‘_mm256_cvtepi32_ps’
In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:41:0,
                 from /home/user/llama.cpp/ggml.c:186:
/usr/lib/gcc/x86_64-linux-gnu/7/include/avxintrin.h:454:1: note: expected ‘__m256i {aka __vector(4) long long int}’ but argument is of type ‘int’
 _mm256_cvtepi32_ps (__m256i __A)
 ^~~~~~~~~~~~~~~~~~
/home/user/llama.cpp/ggml.c: In function ‘ggml_vec_dot_q4_0_q8_0’:
/home/user/llama.cpp/ggml.c:2892:40: error: incompatible type for argument 1 of ‘_mm256_cvtepi32_ps’
         __m256 p = _mm256_cvtepi32_ps( _mm256_set_m128i( i32[0], i32[1] ));
                                        ^~~~~~~~~~~~~~~~
In file included from /usr/lib/gcc/x86_64-linux-gnu/7/include/immintrin.h:41:0,
                 from /home/user/llama.cpp/ggml.c:186:
/usr/lib/gcc/x86_64-linux-gnu/7/include/avxintrin.h:454:1: note: expected ‘__m256i {aka __vector(4) long long int}’ but argument is of type ‘int’
 _mm256_cvtepi32_ps (__m256i __A)
@dfyz
Collaborator

dfyz commented Apr 17, 2023

I think both -mfma and -mf16c imply using AVX. Could you try passing -DLLAMA_F16C=OFF -DLLAMA_FMA=OFF in the CMake arguments and see if that fixes the build failures?

Incidentally, these build failures appear to come from gcc 7 not supporting some of the newer AVX intrinsics (which can be emulated with older ones if needed; see the sketch below). Arguably that is a good thing in this case, since otherwise the compilation would succeed but the produced binary would crash with SIGILL at runtime on your Xeon.

@kevinbentley
Author

That works, thank you!

For what it's worth, by default it compiled but crashed with "Illegal Instruction", which made sense once I realized the CPU didn't support the instructions. But my first thought was that some memory corruption was causing that error, rather than it literally being an illegal instruction. That was some overthinking on my part.

@m4gn3to

m4gn3to commented Jan 4, 2024

Sorry for reopening this old issue.
I have an old E5649 with plenty of RAM, and I would like to know how I can pass the options you recommend to CMake.
I've changed the options in the ollama/llm/ext_server_common.go file, but when the compilation finishes and I run ollama, I still get an "Illegal Instruction" crash.
