-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
[Doc] Update the doc for log probs + prefix caching #23399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Doc] Update the doc for log probs + prefix caching #23399
Conversation
Signed-off-by: Chen Zhang <zhangch99@outlook.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request updates the documentation for logprobs with prefix caching to reflect its recent implementation. The change is clear and improves the documentation's accuracy. I have one suggestion to make the text even more precise regarding when prefix caching is disabled. Additionally, the feature status table in the same document still lists 'Prompt Logprobs with Prefix Caching' as 'Planned', which seems inconsistent with this documentation update. It would be great to update the table status to 'Functional' or 'Optimized' to reflect the new reality.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Chen Zhang <zhangch99@outlook.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Xiao Yu <xiao.yu@amd.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Chen Zhang <zhangch99@outlook.com> Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Purpose
Log probs + prefix caching is supported in #13949. Update the doc here.
Test Plan
Test Result
(Optional) Documentation Update
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.