Make CMake LLAMA_NATIVE flag actually use the instructions supported by the processor #3273
Conversation
Oh, I think this probably explains at least some of the occasional segfault bug reports here.
Better would be a string, so I can give it "x86-64-v3" or something similar.
This is a bit tangential, and I'm writing this for future reference, but I stumbled upon a quite simple trick to detect CPU extensions, either in code or in CMake. Basically, if you run an empty string through the compiler preprocessor, it will output all compiler defines, and if your CPU supports, say, AVX, a corresponding macro shows up in that output. I think somebody was working on CPU feature detection but I can't find which issue/PR it was, so I'm leaving it here.
Yeah, that's #809. Since that's intended for Windows and I don't have Windows, someone else will have to bring in that PR and test it (@howard0su seems to have stopped working on it).
Make and CMake are maintained separately, and sometimes PRs add or change something in only one of them; usually that gets corrected after somebody notices. Generally, use whichever build system works for you. Neither build system is better when it comes to the binaries you get.
@thilomichael I don't have a Mac so I can't reproduce your issue with Make. Can you provide the exact flags passed to the compiler with Make (fail) and CMake (pass)? e.g.
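One way to surface those exact compiler invocations (illustrative commands against a throwaway Makefile; for the real tree you would run `make --always-make --dry-run` or `cmake --build . --verbose` from the llama.cpp checkout):

```shell
# Dry-run a toy Makefile: GNU make prints each command (here the cc line)
# without executing it, which reveals the exact flags being passed.
tmp=$(mktemp -d)
touch "$tmp/main.c"
printf 'main.o: main.c\n\tcc -march=native -c main.c -o main.o\n' > "$tmp/Makefile"
make -C "$tmp" --dry-run main.o
# For a CMake build the analogue is: cmake --build build --verbose
```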
@netrunnereve I just run
@cebtenzzre The thing with my problem is that it is "solved" when I check out this branch and compile with it. I am running macOS, so I'm using
You sent the command for the link step, but we need to see the command for the compile step. Since you're on aarch64, the Makefile adds
Should we make `LLAMA_NATIVE` the default?
I certainly want this to be the default, but apparently MSVC doesn't support `-march=native`.
If you make it the default, make sure to disable it in the nix flake (or tell me to do it).
GNU Makefiles aren't typically used for native MSVC builds; ours definitely doesn't support MSVC. When you use CMake on Windows, it generates Visual Studio project files by default.
I agree, MSVC shouldn't be a problem. If there is no other concern, we should set `LLAMA_NATIVE=ON` by default.
… also see what happens if we use it on msvc
Un-draft this when you're done with your "see what happens" changes. I wouldn't want it to accidentally get merged if it's not finished.
Okay, this should be good once the CI passes.
…example

* 'master' of github.com:ggerganov/llama.cpp: (24 commits)
  - convert : fix Baichuan2 models by using vocab size in config.json (ggerganov#3299)
  - readme : add project status link
  - ggml : fix build after ggerganov#3329
  - llm : add Refact model (ggerganov#3329)
  - sync : ggml (conv 1d + 2d updates, UB fixes) (ggerganov#3468)
  - finetune : readme fix typo (ggerganov#3465)
  - ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (ggerganov#3453)
  - main : consistent prefix/suffix coloring (ggerganov#3425)
  - llama : fix session saving/loading (ggerganov#3400)
  - llama : expose model's rope_freq_scale in the API (ggerganov#3418)
  - metal : alibi for arbitrary number of heads (ggerganov#3426)
  - cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (ggerganov#3273)
  - Work on the BPE tokenizer (ggerganov#3252)
  - convert : fix vocab size when not defined in hparams (ggerganov#3421)
  - cmake : increase minimum version for add_link_options (ggerganov#3444)
  - CLBlast: Add broadcast support for matrix multiplication (ggerganov#3402)
  - gguf : add BERT, MPT, and GPT-J arch info (ggerganov#3408)
  - gguf : general usability improvements (ggerganov#3409)
  - cmake : make CUDA flags more similar to the Makefile (ggerganov#3420)
  - finetune : fix ggerganov#3404 (ggerganov#3437)
  - ...
…d by the processor (ggerganov#3273)

* fix LLAMA_NATIVE
* syntax
* alternate implementation
* my eyes must be getting bad...
* set cmake LLAMA_NATIVE=ON by default
* march=native doesn't work for ios/tvos, so disable for those targets. also see what happens if we use it on msvc
* revert 8283237 and only allow LLAMA_NATIVE on x86 like the Makefile
* remove -DLLAMA_MPI=ON

Co-authored-by: netrunnereve <netrunnereve@users.noreply.github.com>
Currently `LLAMA_NATIVE` doesn't disable instruction-set-specific flags like `LLAMA_AVX` when it's set. Instead we get something like

```
-march=native -mf16c -mfma -mavx -mavx2
```

which won't run on a native host that doesn't support these instructions.

Also, is there a reason why `LLAMA_NATIVE` isn't turned on by default? The Makefile already uses `-march=native`, and someone wanting to compile for other users really should manually set the appropriate flags depending on what they want to support.

Okay, this seems to be because `-march=native` is not supported on MSVC. Merging #809 may fix this.
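A minimal sketch of the logic this PR aims at (the option names mirror the project's CMake toggles, but this fragment is illustrative; the actual CMakeLists.txt is more involved):

```cmake
# Hypothetical fragment: when LLAMA_NATIVE is ON, rely on -march=native alone
# and skip the per-extension flags; otherwise honor the individual toggles.
if (LLAMA_NATIVE)
    add_compile_options(-march=native)
else()
    if (LLAMA_AVX)
        add_compile_options(-mavx)
    endif()
    if (LLAMA_AVX2)
        add_compile_options(-mavx2)
    endif()
    if (LLAMA_FMA)
        add_compile_options(-mfma)
    endif()
endif()
```

MSVC would need a separate branch, since it rejects `-march=native` and only offers coarse switches such as `/arch:AVX2`.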