Documentation: Add ggml_type value choices for KV cache data type in … #10802
examples/server/README.md
Outdated
| `-ctk, --cache-type-k TYPE` | KV cache data type for K (default: f16, f32, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1)<br/>(env: LLAMA_ARG_CACHE_TYPE_K) |
| `-ctv, --cache-type-v TYPE` | KV cache data type for V (default: f16, f32, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1)<br/>(env: LLAMA_ARG_CACHE_TYPE_V) |
The list of valid values should be separate from the default value, which is always f16.
You should not manually edit this table. It is generated from arg.cpp, and the next time we generate it, your changes will be discarded.
Got it. Thanks for reviewing, updated.
Please edit arg.cpp as explained above.
I would prefer to convert
Hmm, I think I'll do it in another PR because it's not that simple.
Oh, OK then. In that case, are there other changes required for this PR? :)
No, I think the current PR can be closed. In the other PR, the list of supported KV cache types is now rendered dynamically.
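For readers following the thread, here is a minimal sketch of what "rendered dynamically" could look like: the help text is built from a single list of type names, so the allowed values stay separate from the default and the generated table cannot drift from the code. This is not the actual arg.cpp implementation. The names `kv_cache_types`, `join_types`, and `cache_type_help` are hypothetical, and the list simply mirrors the values quoted in the table row under review.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical single source of truth for the accepted KV cache types.
// The values are copied from the README row discussed in this review.
static const std::vector<std::string> kv_cache_types = {
    "f16", "f32", "bf16", "q8_0", "q4_0", "q4_1", "iq4_nl", "q5_0", "q5_1",
};

// Join the type names into one comma-separated string.
static std::string join_types(const std::vector<std::string> & types) {
    std::string out;
    for (std::size_t i = 0; i < types.size(); ++i) {
        if (i > 0) {
            out += ", ";
        }
        out += types[i];
    }
    return out;
}

// Build the help text for one option, keeping the allowed values and the
// default on separate lines, as the first review comment asks.
static std::string cache_type_help(const std::string & which, const std::string & def) {
    std::string help = "KV cache data type for " + which + "\n";
    help += "allowed values: " + join_types(kv_cache_types) + "\n";
    help += "(default: " + def + ")";
    return help;
}
```

Calling `cache_type_help("K", "f16")` produces a three-line description. Adding a new type to the list would then update the help output, and any table generated from it, without hand-editing the README.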
…README.
Updated README
Make sure to read the contributing guidelines before submitting a PR