[suggestion] llama.cpp CLBlast support. #37
Is it possible to build llama.cpp (I believe it's your freechatserver binary) with CLBlast support? It's supposed to work great on CPU and gives great acceleration on regular Macs!

In case you don't want to change anything, could you please provide instructions for building the freechatserver binary so I can replace it with my own build of llama.cpp?
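For reference, a minimal sketch of a CLBlast-enabled build, assuming the llama.cpp Makefile flags of that era and CLBlast installed via Homebrew (both are assumptions; the build system has changed in later versions):

```sh
# Install the CLBlast OpenCL BLAS library (macOS / Homebrew)
brew install clblast

# Clone llama.cpp and build with the CLBlast backend enabled
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make LLAMA_CLBLAST=1
```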
Comments

I build llama.cpp with the LLAMA_NO_ACCELERATE=1 flag because otherwise Apple rejects the app in review, citing use of private APIs. Is that what turns on CLBlast support? This issue has more details: ggerganov/llama.cpp#3438

If you want to try building locally without that flag, you can pull llama.cpp and build it yourself (see the sketch below). Let me know if you see perf gains. For me, I didn't see any change in time to response or tokens/second with the LLAMA_NO_ACCELERATE flag (I'm on an M1 Pro with 64GB RAM).
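A minimal sketch of the two builds being compared, assuming the llama.cpp Makefile flags of that period (the build system has since moved on):

```sh
# From a llama.cpp checkout: the default build, which on macOS
# links Apple's Accelerate framework for BLAS
make clean && make

# The build FreeChat ships: Accelerate disabled, to avoid the
# private-API rejection in App Store review
make clean && make LLAMA_NO_ACCELERATE=1
```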
I'm not experienced with this, but couldn't you just release the app via both GitHub and the App Store, so the GitHub builds get updates earlier and can use features Apple doesn't allow?
Technically I could, but ideally I don't want to support two versions of the app, and I didn't see any performance improvement without that flag. But please let me know if your experience differs and I'll re-assess the trade-off. There are good reasons for Apple not to allow access to private APIs (OS patches could break the app).
Fair enough.
@psugihara I appreciate your response; I will let you know if it works. I was just concerned that the size of your server binary and the size of the binary I've built are pretty different.
Yep, it should be about half the size if you're just building for one architecture. The lipo command I use just glues the two together.
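For illustration, a sketch of gluing two single-architecture builds into one universal binary with lipo (the slice filenames here are placeholders, not FreeChat's actual build artifacts):

```sh
# Combine one arm64 and one x86_64 build of the server into a
# single universal binary (placeholder file names)
lipo -create -output freechat-server server-arm64 server-x86_64

# Verify that both architectures are present
lipo -info freechat-server
```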