-
I believe the V cache cannot be quantised right now. The K cache can be, at q8_0, for some memory savings. So simply remove '-ctv q8_0' from the command.
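To give a feel for the "some memory savings" from a q8_0 K cache, here is a rough back-of-the-envelope sketch. The model dimensions (layers, KV heads, head size, context length) are hypothetical and only for illustration. The q8_0 cost assumes the ggml block layout of 32 int8 values plus one f16 scale, i.e. 34 bytes per 32 values; actual usage may differ with padding and other buffers.

```python
# Rough K-cache size estimate: f16 vs q8_0.
# All model dimensions used below are hypothetical, for illustration only.

F16_BYTES_PER_VALUE = 2.0
# Assumed ggml q8_0 layout: 32 int8 values + one f16 scale = 34 bytes per block.
Q8_0_BYTES_PER_VALUE = 34 / 32


def k_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_value):
    """Bytes needed for the K cache alone, one K vector per layer, head and token."""
    return n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_value


if __name__ == "__main__":
    # Hypothetical model shape; replace with the values of your own model.
    dims = dict(n_layers=32, n_kv_heads=8, head_dim=128, n_ctx=8192)
    f16 = k_cache_bytes(**dims, bytes_per_value=F16_BYTES_PER_VALUE)
    q8 = k_cache_bytes(**dims, bytes_per_value=Q8_0_BYTES_PER_VALUE)
    print(f"K cache at f16 : {f16 / 2**20:.0f} MiB")
    print(f"K cache at q8_0: {q8 / 2**20:.0f} MiB")
    print(f"saving         : {1 - q8 / f16:.1%}")
```

Under these assumptions the K cache shrinks by a little under half, while the V cache stays at f16 and is unaffected.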
-
I run a command like this:
It returns an error like this:
But when I use f16, everything goes well.