
Support compiling with Intel MKL with LLAMA_MKL=1 #1063

Open · wants to merge 1 commit into concedo_experimental
Conversation


@lr1729 commented Aug 11, 2024

Speeds up prompt processing on the CPU by 60% on my i3-1115G4 in my tests; partially resolves #322

@LostRuins
Owner

I think you might have better results just hijacking the OpenBLAS flags instead of adding it as another backend. I can't test MKL myself because I don't have it installed on my system. Have you also compared the difference between OpenBLAS and noblas (which ends up using llamafile)?

@lr1729
Author

lr1729 commented Aug 12, 2024

Gotcha, I'll see if I can figure that out. noblas is similar to OpenBLAS speeds for me, sometimes even 10-20% faster.

@LostRuins
Owner

LostRuins commented Aug 12, 2024

Yeah, I can possibly consider adding it as an optional LLAMA_MKL=1 flag that people can set when they self-compile (then you can just overwrite the OpenBLAS files), but I won't be able to provide prebuilt binaries or support for it, as I don't run it myself.
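A minimal sketch of how such an opt-in flag might be wired into the Makefile. This is an assumption, not the PR's actual diff: the `GGML_USE_OPENBLAS` define is the existing BLAS code path being "hijacked", `MKLROOT` is the environment variable set by Intel's `setvars.sh`, and `mkl_rt` is MKL's single dynamic runtime library.

```makefile
# Hypothetical sketch: gate MKL behind an opt-in LLAMA_MKL=1 make flag.
# Assumes the oneAPI environment is loaded first
# (e.g. `source /opt/intel/oneapi/setvars.sh`) so MKLROOT is defined.
ifdef LLAMA_MKL
    # Reuse the existing OpenBLAS code path in ggml, but link against
    # MKL's single dynamic runtime instead of OpenBLAS.
    CFLAGS  += -DGGML_USE_OPENBLAS -I$(MKLROOT)/include
    LDFLAGS += -L$(MKLROOT)/lib/intel64 -lmkl_rt
endif
```

A self-compile would then be `make LLAMA_MKL=1`, leaving the default build untouched for everyone else.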

@lr1729
Author

lr1729 commented Aug 12, 2024

That sounds good. It is somewhat of a pain to compile, needing the entire 12 GB Intel HPC toolkit.

@LostRuins
Owner

Exactly, that's why I avoid MKL (it's also not useful for those on Ryzen CPUs).

OpenBLAS is portable and can be used with 15 MB of files.

But if you have a GPU you should use that, it will beat everything else by miles.
