Alpaca model is running very slow in llama.cpp compared to alpaca.cpp #677
Comments:
- See #603
- Updated context from #603, sounds like things may have been fixed?
- This should be resolved with #603. It's the same behavior I described in the issue.
Prerequisites
Please answer the following questions for yourself before submitting an issue.
Current Behavior
Just yesterday, a migration script was added: migrate-ggml-2023-03-30-pr613.py. So, on top of @madmads11's instructions for using alpaca models, I used this script to generate the final bin file to work with.
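For reference, the invocation would look something like the following. This is a sketch assuming the script's usual usage of two positional arguments (input model path, output model path); the file names here are placeholders for local paths.

```sh
# Migrate an old-format ggml model file to the new format introduced by PR #613.
# Input and output paths below are placeholders.
python3 migrate-ggml-2023-03-30-pr613.py ./models/alpaca-7b-q4_0.bin ./models/alpaca-7b-migrated.bin
```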
Details:
I am using llama.cpp just today to run an alpaca model (I was using antimatter15's alpaca.cpp until now). This same model, converted and loaded in llama.cpp, runs very slow compared to running it in alpaca.cpp.

How I started up the model:
./main -m ./models/alpaca-7b-migrated.bin -ins --n_parts 1
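One general thing worth checking when generation is slow (a suggestion, not something established in this issue): set the thread count explicitly with -t, since the default may not match the machine's physical core count. For example:

```sh
# Same invocation with an explicit thread count; 8 is a placeholder,
# set it to the number of physical cores on your machine.
./main -m ./models/alpaca-7b-migrated.bin -ins --n_parts 1 -t 8
```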
The logs:
Additionally, I also tried this bin file, which is already migrated for llama.cpp: https://huggingface.co/Pi3141/alpaca-lora-7B-ggml/blob/main/ggml-model-q4_1.bin. Even with this one, the model runs slow in llama.cpp.

One thing I noticed while loading these two model variants is that this line differs from the log above:
llama_model_load: f16 = 3
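For context, the f16 value printed at load time reflects the model's quantization type. To my reading of the llama.cpp loader around this date (worth verifying against the source), the mapping is roughly:

```sh
# Meaning of the "f16" value in the load log
# (my reading of llama.cpp's loader circa this issue; verify against the source):
#   f16 = 0  -> float32
#   f16 = 1  -> float16
#   f16 = 2  -> q4_0 quantization
#   f16 = 3  -> q4_1 quantization  (matches the Pi3141 file, named ggml-model-q4_1.bin)
```

So the difference in that log line is expected if the locally migrated model is q4_0 while the Hugging Face file is q4_1; q4_1 is also generally somewhat slower to run than q4_0, which could account for part of the gap between the two files.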
.Environment and Context