Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Documentation: Add ggml_type value choices for KV cache data type in … #10802

Conversation

MichelleTanPY
Copy link
Contributor

…README.

Updated README

  • List out the rest of the available ggml_type data types for KV cache data type section.

Make sure to read the contributing guidelines before submitting a PR

Comment on lines 65 to 66
| `-ctk, --cache-type-k TYPE` | KV cache data type for K (default: f16, f32, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1)<br/>(env: LLAMA_ARG_CACHE_TYPE_K) |
| `-ctv, --cache-type-v TYPE` | KV cache data type for V (default: f16, f32, bf16, q8_0, q4_0, q4_1, iq4_nl, q5_0, q5_1)<br/>(env: LLAMA_ARG_CACHE_TYPE_V) |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The list of valid values should be separate from the default value, which is always f16.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not manually edit this table. It is generated from arg.cpp and the next time we generate it, your changes will be discarded

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks for reviewing, updated.

Copy link
Collaborator

@ngxson ngxson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please edit arg.cpp as explained above

@MichelleTanPY MichelleTanPY requested a review from ngxson December 12, 2024 19:45
@ngxson
Copy link
Collaborator

ngxson commented Dec 12, 2024

I would prefer to convert kv_cache_type_from_str to an std::map or a static array, and the use it here instead of having to maintain it in 2 places. You can see an example of how --chat-template currently listing all supported templates.

@ngxson
Copy link
Collaborator

ngxson commented Dec 12, 2024

Hmm I think I'll do it in another PR because it's not that simple

@MichelleTanPY
Copy link
Contributor Author

Oh ok then. In that case, is there other changes required for this PR ? :)

@ngxson
Copy link
Collaborator

ngxson commented Dec 12, 2024

No I think the current PR can be closed. In the other PR, list of supported KV cache type is now rendered dynamically.

@ngxson ngxson closed this Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants