Very slow on the RTX 4050 and i5-1245U with 16 GB RAM #1719
Comments
Add the -t parameter to your prompt, perhaps -t 4. You might also try lowering the batch size, e.g. -b 10, so the model begins responding sooner.
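For reference, the suggestion above could look like this (a sketch only: the prompt is a placeholder, the model path is taken from the command later in this thread, and -t 4 / -b 10 are the values suggested here, not tuned settings):

```shell
# Sketch of the suggested flags (prompt is a placeholder):
#   -t 4   use 4 CPU threads
#   -b 10  smaller batch so the first tokens appear sooner
CMD='./main -m wizard-mega-13b.ggml.q4_0.bin -t 4 -b 10 -p "Hello"'
echo "$CMD"   # dry run: print the command instead of executing it
```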
That did not work; it also crashes after a while during the loading process.
If your token generation is extremely slow, try -t 1 and work your way up from there. There is more information available on this, including GPU acceleration with cuBLAS. This is the limit of my knowledge on the subject, so if it continues to crash then I suggest someone else troubleshoot with @Asory2010.
EDIT: The Intel® Core™ i5-1245U processor has 2 fast (performance) and 8 slow (efficiency) CPU cores. I'd try to set -t accordingly.
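The core split above matters for picking -t. A sketch, assuming Linux and the i5-1245U mentioned here (2 P-cores with hyper-threading plus 8 E-cores, i.e. 12 logical CPUs); the value 4 for the fast threads is an assumption, not a benchmarked setting:

```shell
# Hedged sketch (assumes Linux and an i5-1245U):
# matching -t to the fast threads can beat using every core.
LOGICAL=$(nproc)      # logical CPU count reported by the OS
PERF_THREADS=4        # assumption: 2 P-cores x 2 hyper-threads
echo "logical=$LOGICAL perf_threads=$PERF_THREADS"
# ./main -m wizard-mega-13b.ggml.q4_0.bin -t "$PERF_THREADS" ...
```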
Quick update: after some testing, the text generation became way faster, but the loading time is still slow. Why is that?
This issue was closed because it has been inactive for 14 days since being marked as stale. |
I also have cuBLAS enabled and I have tried both 13B and 7B models, and it takes ages to even emit one token. I am using these parameters:
main -i --interactive-first -r "### Human:" --temp 0 -c 2048 -n -1 --repeat_penalty 1.2 --instruct --color -m wizard-mega-13b.ggml.q4_0.bin
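Since cuBLAS is enabled, offloading model layers to the GPU may help. A sketch of the same command with offload added (assumes a cuBLAS build of llama.cpp that supports -ngl / --n-gpu-layers; 20 layers is a guess for the RTX 4050's VRAM, not a tested value, and the misspelled repeat-penalty flag is corrected):

```shell
# Sketch: the thread's command with GPU offload and the corrected flag name.
# -ngl requires a cuBLAS build; 20 layers is an assumption for ~6 GB VRAM.
CMD='./main -i --interactive-first -r "### Human:" --temp 0 -c 2048 -n -1 --repeat_penalty 1.2 --instruct --color -ngl 20 -m wizard-mega-13b.ggml.q4_0.bin'
echo "$CMD"   # dry run: print rather than execute
```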