WIP: Complete removal of f16_kv, add offload_kqv field #1019
Conversation
Force-pushed from 905e864 to 9cdfe93
F16_KV appears to have been removed here: ggml-org/llama.cpp@af99c6f. This addresses two issues:
- abetlen#995, which requests adding the KV cache offloading param
- abetlen#1006, a NULL pointer exception when using the embeddings (introduced by leaving f16_kv in the fields struct)
Force-pushed from 9cdfe93 to 5603782
I have a question that is unrelated to your PR, which to me does seem to fix the issue in #1006. What is the significance of the single embedding returned from Is

I patched this into my current system, and then the embeddings object would just tell me I'd have to create with embedding=True, and it was clearly in the code at the time. Not sure what happened, but I'm wondering if my build of llama_cpp is somehow borked. Do older versions work alright as a workaround for this issue? (The one in #1006)

It looks like the

This just mirrors the behavior of the

How old of

I managed to get it working. Cleared the pycache and re-ran. Thanks!

@brandonrobertz thank you so much for catching this, I'll merge this now and push a new version ASAP.
Adds the offload_kqv setting to the context settings fields struct and removes f16_kv. Leaving f16_kv there caused a null pointer exception when attempting to use the embeddings example script. This addresses two GH issues:
- abetlen#995
- abetlen#1006