[doc] add Context Parallel Deployment doc #26877
Signed-off-by: youkaichao <youkaichao@gmail.com>
Documentation preview: https://vllm--26877.org.readthedocs.build/en/26877/
Code Review
This pull request adds documentation for the Context Parallel Deployment feature. The new document explains the concepts of prefill and decode context parallelism and provides case studies. My review focuses on ensuring the accuracy of the information provided. I've identified a broken link to a research paper and a recurring typo in a model name within the case studies. Correcting these will improve the clarity and reliability of the documentation for users.
Thanks for writing this!
Can you also describe a request's life cycle when both PCP and DCP are enabled?
Signed-off-by: youkaichao <youkaichao@gmail.com> Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: youkaichao <youkaichao@gmail.com> Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
Signed-off-by: youkaichao <youkaichao@gmail.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Signed-off-by: youkaichao <youkaichao@gmail.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Purpose
Add documentation explaining the new decode context parallel (DCP) feature.
Test Plan
Test Result