[CPU] Enable u8 kv cache by default #27454

@luo-cheng2021 luo-cheng2021 commented Nov 7, 2024

Details:

  • Enable u8 kv cache by default
  • ...

Tickets:

@github-actions github-actions bot added the category: CPU OpenVINO CPU plugin label Nov 7, 2024
@luo-cheng2021 luo-cheng2021 marked this pull request as ready for review November 8, 2024 04:39
@luo-cheng2021 luo-cheng2021 requested review from a team as code owners November 8, 2024 04:39
@@ -411,6 +412,9 @@ void Config::readProperties(const ov::AnyMap& prop, const ModelType modelType) {
if (!fcDynamicQuantizationGroupSizeSetExplicitly) {
fcDynamicQuantizationGroupSize = 0;
}
if (!kvCachePrecisionSetExplicitly) {
Contributor

Why not set kvCachePrecision to u8 here? Is it to keep kvCachePrecision compatible with the ACL platform? If so, it would be better to leave a comment here explaining that.

Contributor Author

Setting kvCachePrecision in the constructor should be clearer and more consistent with other properties such as DynamicQuantizationGroup, so my suggestion is to keep the current style.
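For readers following the thread, the pattern under discussion can be sketched roughly as below. This is a minimal illustration, not the plugin's actual code: only `kvCachePrecision` and `kvCachePrecisionSetExplicitly` come from the diff, while the `Precision` enum, the `"KV_CACHE_PRECISION"` string key, `parsePrecision`, the `preferAccuracy` flag, and the f32 fallback value are assumptions made for the example. The condition guarding the fallback is not visible in the excerpt, so it is represented by a hypothetical boolean.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the real precision type.
enum class Precision { f32, f16, u8 };

// Hypothetical helper: maps a property string to a precision.
inline Precision parsePrecision(const std::string& s) {
    if (s == "f32") return Precision::f32;
    if (s == "f16") return Precision::f16;
    if (s == "u8") return Precision::u8;
    throw std::invalid_argument("unsupported precision: " + s);
}

struct Config {
    // The default lives with the member, i.e. it is fixed at construction.
    Precision kvCachePrecision = Precision::u8;
    // Records whether the user supplied a value, so later logic can tell
    // a deliberate choice apart from the default.
    bool kvCachePrecisionSetExplicitly = false;

    // `preferAccuracy` is an assumed placeholder for whatever condition
    // guards the fallback in the real readProperties.
    void readProperties(const std::map<std::string, std::string>& props,
                        bool preferAccuracy) {
        auto it = props.find("KV_CACHE_PRECISION");  // assumed key name
        if (it != props.end()) {
            kvCachePrecision = parsePrecision(it->second);
            kvCachePrecisionSetExplicitly = true;
        }
        // Override the default only when the user did not choose a value.
        if (preferAccuracy && !kvCachePrecisionSetExplicitly) {
            kvCachePrecision = Precision::f32;  // assumed fallback value
        }
    }
};
```

With this layout, a default-constructed `Config` reports u8, an explicit user value always wins, and the fallback only applies to untouched defaults. That is the consistency argument made above for keeping the default in the constructor and not assigning it inside `readProperties`.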

@luo-cheng2021 luo-cheng2021 force-pushed the luocheng/enable_int8_kvcache branch 2 times, most recently from 47823da to 16578fe Compare November 11, 2024 02:12
Contributor

@zhangYiIntel zhangYiIntel left a comment

LGTM, thanks!

@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Nov 13, 2024
Merged via the queue into openvinotoolkit:master with commit 2d148ec Nov 13, 2024
161 checks passed
3 participants